package config

Type Members

  1. case class Configuration(storages: List[StorageConf] = List(), sources: List[DatasetConf] = List(), args: List[String] = List.empty[String], sparkconf: Map[String, String] = Map()) extends Product with Serializable

    Base configuration needed for an ETL job

    storages

    list of storages, each identified by an alias

    sources

    list of data sources

    args

    arguments passed to the job

    sparkconf

    extra settings added to the Spark configuration

  2. case class DatasetConf(id: String, storageid: String, path: String, format: Format, loadtype: LoadType, table: Option[TableConf] = None, keys: List[String] = List(), partitionby: List[String] = List(), readoptions: Map[String, String] = Map(), writeoptions: Map[String, String] = WriteOptions.DEFAULT_OPTIONS, documentationpath: Option[String] = None, view: Option[TableConf] = None) extends Product with Serializable

    Abstraction on a dataset configuration

    storageid

    alias designating the storage where the data resides; the configuration maps it to a storage root, such as the object store URL s3://my-bucket/

    path

    path of the dataset relative to the root of the storage, e.g. /raw/my-system/my-source

    format

    data format

    loadtype

    how the data is written

    table

    OPTIONAL - configuration of the table associated with the dataset

    readoptions

    OPTIONAL - read options passed to Spark when reading the data into a DataFrame

    writeoptions

    OPTIONAL - write options passed to Spark when writing the data to files

    documentationpath

    OPTIONAL - where the documentation is located.

    view

    OPTIONAL - database / schema and name of the view pointing to the concrete table

  3. case class StorageConf(id: String, path: String) extends Product with Serializable

    Configuration of a storage endpoint

    id

    unique identifier of the storage; should match the storageid alias given to a DatasetConf

    path

    path to the storage

  4. case class TableConf(database: String, name: String) extends Product with Serializable

    Configuration for a table

    database

    name of the database / schema

    name

    name of the table / view
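
The four types above fit together through the storage alias: a DatasetConf names a storage by storageid, and the StorageConf with the matching id supplies the root path. The sketch below illustrates that relationship under stated assumptions. The case classes are local stand-ins that copy the documented signatures so the example compiles without the library; Format, LoadType, Parquet and OverWrite are hypothetical placeholders for types defined elsewhere in the library; writeoptions defaults to an empty map here instead of WriteOptions.DEFAULT_OPTIONS; and the location helper is an illustration, not a documented member of this package.

```scala
object ConfigSketch {

  // Hypothetical placeholders: the real Format and LoadType live elsewhere
  // in the library and are not described on this page.
  sealed trait Format
  case object Parquet extends Format
  sealed trait LoadType
  case object OverWrite extends LoadType

  // Stand-ins mirroring the constructor signatures documented above.
  case class StorageConf(id: String, path: String)
  case class TableConf(database: String, name: String)
  case class DatasetConf(
      id: String,
      storageid: String,
      path: String,
      format: Format,
      loadtype: LoadType,
      table: Option[TableConf] = None,
      keys: List[String] = List(),
      partitionby: List[String] = List(),
      readoptions: Map[String, String] = Map(),
      // Assumption: empty default; the library uses WriteOptions.DEFAULT_OPTIONS.
      writeoptions: Map[String, String] = Map(),
      documentationpath: Option[String] = None,
      view: Option[TableConf] = None
  )
  case class Configuration(
      storages: List[StorageConf] = List(),
      sources: List[DatasetConf] = List(),
      args: List[String] = List.empty[String],
      sparkconf: Map[String, String] = Map()
  )

  // Hypothetical helper: joins the storage root and the dataset's relative
  // path. Returns None when the dataset id or its storage alias is unknown.
  def location(conf: Configuration, datasetId: String): Option[String] =
    for {
      ds      <- conf.sources.find(_.id == datasetId)
      storage <- conf.storages.find(_.id == ds.storageid)
    } yield storage.path.stripSuffix("/") + "/" + ds.path.stripPrefix("/")

  // One storage aliased "raw" and one dataset that refers to it.
  val conf: Configuration = Configuration(
    storages = List(StorageConf("raw", "s3://my-bucket/")),
    sources = List(
      DatasetConf(
        id = "my_source",
        storageid = "raw",
        path = "/raw/my-system/my-source",
        format = Parquet,
        loadtype = OverWrite,
        table = Some(TableConf("my_db", "my_source"))
      )
    )
  )

  def main(args: Array[String]): Unit =
    println(location(conf, "my_source"))
}
```

Keeping the storage root in StorageConf and only a relative path in DatasetConf means the same dataset definitions can point at a different bucket or file system by changing a single storage entry.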

Value Members

  1. object Configuration extends Serializable
  2. object ConfigurationLoader
  3. object ConfigurationWriter
  4. object DatasetConf extends Serializable
