package customlogic
Type Members
- trait CustomDfCreator extends Serializable
Interface to define custom logic for DataFrame creation
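A minimal implementation for the className variant might look like the following sketch. The exec/schema method signatures are inferred from the descriptions on this page (exec receives a map of options and returns a DataFrame; schema optionally returns a StructType for the init phase); check the trait's Scaladoc for the exact signatures.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Illustrative sketch: creates a small DataFrame from the configured options.
class MyDfCreator extends CustomDfCreator {
  override def exec(session: SparkSession, config: Map[String, String]): DataFrame = {
    import session.implicits._
    val prefix = config.getOrElse("prefix", "") // "prefix" is an illustrative option key
    Seq(prefix + "a", prefix + "b").toDF("value")
  }

  // Optional: schema used in the init phase (only supported via className, see note below).
  override def schema(session: SparkSession, config: Map[String, String]): Option[StructType] =
    Some(StructType(Seq(StructField("value", StringType))))
}
```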
- case class CustomDfCreatorConfig(className: Option[String] = None, scalaFile: Option[String] = None, scalaCode: Option[String] = None, options: Option[Map[String, String]] = None) extends Product with Serializable
Configuration of a custom Spark-DataFrame creator as part of CustomDfDataObject. Define an exec function which receives a map of options and returns a DataFrame to be used as input. Optionally define a schema function that returns a StructType used as schema in the init phase. See also trait CustomDfCreator.
Note that implementing the CustomDfCreator.schema method is currently only possible with the className configuration attribute.
- className
Optional class name implementing trait CustomDfCreator
- scalaFile
Optional file where scala code for creator is loaded from. The scala code in the file needs to be a function of type fnExecType.
- scalaCode
Optional scala code for creator. The scala code needs to be a function of type fnExecType.
- options
Options to pass to the creator
- class CustomDfCreatorWrapper extends CustomDfCreator
- trait CustomDfTransformer extends Serializable
Interface to define a custom Spark-DataFrame transformation (1:1)
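A 1:1 transformer implementation might look like this sketch. The transform signature (Spark session, options, input DataFrame, input DataObjectId) is inferred from the configuration description below; verify it against the trait's Scaladoc.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.upper

// Illustrative sketch: uppercases one column of the input DataFrame.
class UppercaseTransformer extends CustomDfTransformer {
  override def transform(session: SparkSession, options: Map[String, String],
                         df: DataFrame, dataObjectId: String): DataFrame = {
    val col = options.getOrElse("column", "value") // "column" is an illustrative option key
    df.withColumn(col, upper(df(col)))
  }
}
```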
- case class CustomDfTransformerConfig(className: Option[String] = None, scalaFile: Option[String] = None, scalaCode: Option[String] = None, sqlCode: Option[String] = None, pythonFile: Option[String] = None, pythonCode: Option[String] = None, options: Option[Map[String, String]] = None, runtimeOptions: Option[Map[String, String]] = None) extends Product with Serializable
Configuration of a custom Spark-DataFrame transformation between one input and one output (1:1). Define a transform function which receives a DataObjectId, a DataFrame and a map of options and has to return a DataFrame, see also CustomDfTransformer.
Note about Python transformation: an environment with Python and PySpark is needed. The PySpark session is initialized and available under the variables sc, session and sqlContext. Other variables available are:
- inputDf: input DataFrame
- options: transformation options as Map[String,String]
- dataObjectId: id of the input DataObject as String
The output DataFrame must be set with setOutputDf(df).
- className
Optional class name implementing trait CustomDfTransformer
- scalaFile
Optional file where scala code for transformation is loaded from. The scala code in the file needs to be a function of type fnTransformType.
- scalaCode
Optional scala code for transformation. The scala code needs to be a function of type fnTransformType.
- sqlCode
Optional SQL code for transformation. Use tokens %{<key>}, which are replaced by the corresponding runtimeOptions value in the SQL code. Example: "select * from test where run = %{runId}"
- pythonFile
Optional pythonFile to use for python transformation. The python code can use variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.
- pythonCode
Optional pythonCode to use for python transformation. The python code can use variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.
- options
Options to pass to the transformation
- runtimeOptions
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
- trait CustomDfsTransformer extends Serializable
Interface to define a custom Spark-DataFrame transformation (n:m). Same trait as CustomDfTransformer, but multiple inputs and outputs are supported.
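An n:m transformer implementation might look like this sketch, following the map-of-DataFrames contract described below. The signature and the DataObject ids ("orders", "customers", "joined") are illustrative assumptions; verify the exact signature against the trait's Scaladoc.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Illustrative sketch: joins two input DataFrames and returns one output.
class JoinTransformer extends CustomDfsTransformer {
  override def transform(session: SparkSession, options: Map[String, String],
                         dfs: Map[String, DataFrame]): Map[String, DataFrame] = {
    // Keys of dfs are the input DataObjectIds; the returned keys are output DataObjectIds.
    val joined = dfs("orders").join(dfs("customers"), Seq("customerId"))
    Map("joined" -> joined)
  }
}
```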
- case class CustomDfsTransformerConfig(className: Option[String] = None, scalaFile: Option[String] = None, scalaCode: Option[String] = None, sqlCode: Option[Map[DataObjectId, String]] = None, options: Option[Map[String, String]] = None, runtimeOptions: Option[Map[String, String]] = None) extends Product with Serializable
Configuration of a custom Spark-DataFrame transformation between many inputs and many outputs (n:m). Define a transform function which receives a map of input DataObjectIds with DataFrames and a map of options, and has to return a map of output DataObjectIds with DataFrames, see also trait CustomDfsTransformer.
- className
Optional class name implementing trait CustomDfsTransformer
- scalaFile
Optional file where scala code for transformation is loaded from. The scala code in the file needs to be a function of type fnTransformType.
- scalaCode
Optional scala code for transformation. The scala code needs to be a function of type fnTransformType.
- sqlCode
Optional map of DataObjectId and corresponding SQL code. Use tokens %{<key>}, which are replaced by the corresponding runtimeOptions value in the SQL code. Example: "select * from test where run = %{runId}"
- options
Options to pass to the transformation
- runtimeOptions
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
- trait CustomFileCreator extends Serializable
- case class CustomFileCreatorConfig(className: Option[String] = None, scalaFile: Option[String] = None, scalaCode: Option[String] = None, options: Option[Map[String, String]] = None) extends Product with Serializable
- class CustomFileCreatorWrapper extends CustomFileCreator
- trait CustomFileTransformer extends Serializable
Interface to define custom file transformation for CustomFileAction
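A file transformer implementation might look like the following sketch. The signature here (options plus input/output streams, with errors returned as Option[Exception]) is an assumption not stated on this page; check the trait's Scaladoc before relying on it.

```scala
import java.io.{InputStream, OutputStream}
import java.nio.charset.StandardCharsets

// Illustrative sketch: reads the input file, uppercases it, writes the output file.
class UppercaseFileTransformer extends CustomFileTransformer {
  override def transform(options: Map[String, String],
                         input: InputStream, output: OutputStream): Option[Exception] = {
    try {
      val content = scala.io.Source.fromInputStream(input, "UTF-8").mkString
      output.write(content.toUpperCase.getBytes(StandardCharsets.UTF_8))
      None // no error
    } catch {
      case e: Exception => Some(e)
    }
  }
}
```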
- case class CustomFileTransformerConfig(className: Option[String] = None, scalaFile: Option[String] = None, scalaCode: Option[String] = None, options: Option[Map[String, String]] = None) extends Product with Serializable
Configuration of custom file transformation between one input and one output (1:1)
- className
Optional class name to load transformer code from
- scalaFile
Optional file where scala code for transformation is loaded from
- scalaCode
Optional scala code for transformation
- options
Options to pass to the transformation
- class CustomFileTransformerWrapper extends CustomFileTransformer with SmartDataLakeLogger
- case class PythonUDFCreatorConfig(pythonFile: Option[String] = None, pythonCode: Option[String] = None, options: Option[Map[String, String]] = None) extends Product with Serializable
Configuration to register a Python UDF in the Spark session of SmartDataLake. Define a Python function with type hints in the Python code and register it in the global configuration. The name of the function must match the name under which it is declared in GlobalConf. The Python function can then be used in Spark SQL expressions.
- pythonFile
Optional pythonFile to use for python UDF.
- pythonCode
Optional pythonCode to use for python UDF.
- options
Options are available in your python code as variable options.
- trait SparkUDFCreator extends Serializable
Interface to create a UserDefinedFunction object to be registered as UDF.
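An implementation might look like the following sketch, which builds the UserDefinedFunction from the configured options (as described for SparkUDFCreatorConfig below). The factory method name used here is an assumption; check the trait's Scaladoc for the exact signature.

```scala
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.udf

// Illustrative sketch: a UDF adding a configurable constant to an integer column.
class AddConstantUDFCreator extends SparkUDFCreator {
  override def get(options: Map[String, String]): UserDefinedFunction = {
    val constant = options.getOrElse("constant", "0").toInt // "constant" is an illustrative option key
    udf((x: Int) => x + constant)
  }
}
```

Once registered, such a UDF can be used in Spark SQL expressions, e.g. `select myUdf(amount) from orders` (names illustrative).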
- case class SparkUDFCreatorConfig(className: String, options: Option[Map[String, String]] = None) extends Product with Serializable
Configuration to register a UserDefinedFunction in the Spark session of SmartDataLake.
- className
fully qualified class name of class implementing SparkUDFCreator interface. The class needs a constructor without parameters.
- options
Options are passed to SparkUDFCreator apply method.
Value Members
- object CustomDfCreatorConfig extends Serializable
- object CustomDfTransformerConfig extends Serializable
- object CustomDfsTransformerConfig extends Serializable