package config
Type Members
-
case class
Configuration(storages: List[StorageConf] = List(), sources: List[DatasetConf] = List(), args: List[String] = List.empty[String], sparkconf: Map[String, String] = Map()) extends Product with Serializable
Base configuration needed for an ETL job
- storages
list of storages associated with aliases
- sources
list of data sources
- args
arguments passed to the job
- sparkconf
extra settings to apply to the Spark configuration
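As a minimal sketch of how the default values behave, the block below rebuilds the signature with simplified stand-in case classes (the stand-in `DatasetConf` omits `format`, `loadtype` and the optional fields so the example stays self-contained). The alias `raw` and the Spark setting are illustrative assumptions, not values taken from the library.

```scala
object ConfigurationSketch {
  // Stand-ins mirroring the documented signatures, NOT the library classes.
  final case class StorageConf(id: String, path: String)
  // Reduced stand-in: the documented DatasetConf has more fields.
  final case class DatasetConf(id: String, storageid: String, path: String)

  final case class Configuration(
    storages: List[StorageConf] = List(),
    sources: List[DatasetConf] = List(),
    args: List[String] = List.empty[String],
    sparkconf: Map[String, String] = Map()
  )

  // Every parameter has a default, so an empty configuration is valid.
  val empty: Configuration = Configuration()

  // Named arguments override only what the job needs.
  val conf: Configuration = Configuration(
    storages = List(StorageConf("raw", "s3://my-bucket/")), // hypothetical alias
    sparkconf = Map("spark.sql.shuffle.partitions" -> "8")  // illustrative setting
  )
}
```

Because `Configuration` is a case class, `copy` can derive a variant (for instance with different `args`) without mutating the original.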
-
case class
DatasetConf(id: String, storageid: String, path: String, format: Format, loadtype: LoadType, table: Option[TableConf] = None, keys: List[String] = List(), partitionby: List[String] = List(), readoptions: Map[String, String] = Map(), writeoptions: Map[String, String] = WriteOptions.DEFAULT_OPTIONS, documentationpath: Option[String] = None, view: Option[TableConf] = None) extends Product with Serializable
Abstraction on a dataset configuration
- storageid
alias designating where the data resides; it can point to an object store URL defined in the configuration, such as s3://my-bucket/
- path
relative path from the root of the storage to the dataset, e.g. /raw/my-system/my-source
- format
data format
- loadtype
how the data is written
- table
OPTIONAL - configuration of a table associated to the dataset
- readoptions
OPTIONAL - read options passed to Spark when reading the data into a DataFrame
- writeoptions
OPTIONAL - write options passed to Spark when writing the data to files
- documentationpath
OPTIONAL - location of the dataset's documentation
- view
OPTIONAL - schema of the view pointing to the concrete table
-
case class
StorageConf(id: String, path: String) extends Product with Serializable
Configuration of a storage endpoint
- id
unique identifier of the storage; should match the storageid alias given to a DatasetConf
- path
path to the storage
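The relationship between a dataset's `storageid` and a storage's `id` can be sketched as below. This is an illustration under stated assumptions: the case classes are reduced stand-ins, and `location` is a hypothetical helper, not a method documented on this page. It looks up the storage by alias and appends the dataset's relative path to the storage root.

```scala
object LocationSketch {
  // Stand-ins keeping only the fields needed here, NOT the library classes.
  final case class StorageConf(id: String, path: String)
  final case class DatasetConf(id: String, storageid: String, path: String)

  // Hypothetical helper: resolve the full location of a dataset.
  // Returns None when no storage matches the dataset's storageid.
  def location(ds: DatasetConf, storages: List[StorageConf]): Option[String] =
    storages.find(_.id == ds.storageid).map { storage =>
      // Normalise the slash between storage root and relative path.
      storage.path.stripSuffix("/") + "/" + ds.path.stripPrefix("/")
    }
}
```

With a storage `StorageConf("raw", "s3://my-bucket/")` and a dataset whose `storageid` is `raw` and `path` is `/raw/my-system/my-source`, the helper yields `s3://my-bucket/raw/my-system/my-source`; an unknown alias yields `None`.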
-
case class
TableConf(database: String, name: String) extends Product with Serializable
Configuration for a table
- database
name of the database / schema
- name
name of the table / view
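Since `table` and `view` on a `DatasetConf` are both `Option[TableConf]`, a caller typically has to handle the absent case. The sketch below uses a stand-in `TableConf` and a hypothetical helper `qualifiedName` (an assumption for illustration, not a documented member) to build a `database.name` identifier.

```scala
object TableSketch {
  // Stand-in mirroring the documented signature, NOT the library class.
  final case class TableConf(database: String, name: String)

  // Hypothetical helper: database-qualified identifier of a table or view.
  def qualifiedName(t: TableConf): String = s"${t.database}.${t.name}"

  // An optional table maps to an optional identifier; None stays None.
  def qualifiedName(t: Option[TableConf]): Option[String] =
    t.map(qualifiedName)
}
```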
Value Members
- object Configuration extends Serializable
- object ConfigurationLoader
- object ConfigurationWriter
- object DatasetConf extends Serializable