io.smartdatalake.workflow.action.customlogic
CustomDfTransformerConfig
Companion object CustomDfTransformerConfig
case class CustomDfTransformerConfig(className: Option[String] = None, scalaFile: Option[String] = None, scalaCode: Option[String] = None, sqlCode: Option[String] = None, pythonFile: Option[String] = None, pythonCode: Option[String] = None, options: Option[Map[String, String]] = None, runtimeOptions: Option[Map[String, String]] = None) extends Product with Serializable
Configuration of a custom Spark-DataFrame transformation between one input and one output (1:1). Define a transform function which receives a DataObjectId, a DataFrame and a map of options, and has to return a DataFrame; see also CustomDfTransformer.
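For the className variant, the referenced class implements CustomDfTransformer. A minimal sketch, assuming the trait's transform method receives the Spark session, the options map, the input DataFrame and the input DataObjectId; the class name and the option key "column" below are illustrative:

import io.smartdatalake.workflow.action.customlogic.CustomDfTransformer
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.upper

// Illustrative transformer: upper-cases the column named by option "column".
class UpperCaseTransformer extends CustomDfTransformer {
  override def transform(session: SparkSession, options: Map[String, String], df: DataFrame, dataObjectId: String): DataFrame = {
    val colName = options.getOrElse("column", "name") // "column" is an assumed option key
    df.withColumn(colName, upper(df(colName)))
  }
}

Such a class would then be referenced with className = Some("com.example.UpperCaseTransformer").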
Note about Python transformation: an environment with Python and PySpark is needed.
The PySpark session is initialized and available under the variables sc, session and sqlContext.
Other variables available are:
- inputDf: Input DataFrame
- options: Transformation options as Map[String,String]
- dataObjectId: Id of the input DataObject as String
The output DataFrame must be set with setOutputDf(df); see the sketch below.
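A minimal sketch of the Python variant, assuming the bindings described above (inputDf, options, dataObjectId, setOutputDf); the column name "rating" is illustrative:

val pythonTransform = CustomDfTransformerConfig(
  pythonCode = Some(
    """
      |# inputDf is bound to the input DataFrame by the framework
      |filtered = inputDf.where("rating > 3")  # "rating" is an illustrative column
      |setOutputDf(filtered)  # hand the result back to the framework
      |""".stripMargin)
)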
- className
Optional class name implementing the trait CustomDfTransformer.
- scalaFile
Optional file from which the Scala code for the transformation is loaded. The Scala code in the file needs to be a function of type fnTransformType.
- scalaCode
Optional Scala code for the transformation. The Scala code needs to be a function of type fnTransformType.
- sqlCode
Optional SQL code for the transformation. Use tokens %{<key>} to be replaced with runtimeOptions in the SQL code. Example: "select * from test where run = %{runId}" (see the sketch after this list).
- pythonFile
Optional Python file to use for the Python transformation. The Python code can use the variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.
- pythonCode
Optional Python code to use for the Python transformation. The Python code can use the variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.
- options
Options to pass to the transformation.
- runtimeOptions
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
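A minimal sketch of the SQL variant with token substitution, assuming DefaultExpressionData exposes the current run id under the name runId; the table name "test" is illustrative:

val sqlTransform = CustomDfTransformerConfig(
  sqlCode = Some("select * from test where run = %{runId}"),
  // the Spark SQL expression "runId" is evaluated against DefaultExpressionData
  // and substituted for the %{runId} token before the SQL is executed
  runtimeOptions = Some(Map("runId" -> "runId"))
)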
- Annotations
- @Scaladoc()
Linear Supertypes
- Serializable
- Serializable
- Product
- Equals
- AnyRef
- Any
Instance Constructors
- new CustomDfTransformerConfig(className: Option[String] = None, scalaFile: Option[String] = None, scalaCode: Option[String] = None, sqlCode: Option[String] = None, pythonFile: Option[String] = None, pythonCode: Option[String] = None, options: Option[Map[String, String]] = None, runtimeOptions: Option[Map[String, String]] = None)
- className
Optional class name implementing the trait CustomDfTransformer.
- scalaFile
Optional file from which the Scala code for the transformation is loaded. The Scala code in the file needs to be a function of type fnTransformType.
- scalaCode
Optional Scala code for the transformation. The Scala code needs to be a function of type fnTransformType (see the sketch after this list).
- sqlCode
Optional SQL code for the transformation. Use tokens %{<key>} to be replaced with runtimeOptions in the SQL code. Example: "select * from test where run = %{runId}"
- pythonFile
Optional Python file to use for the Python transformation. The Python code can use the variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.
- pythonCode
Optional Python code to use for the Python transformation. The Python code can use the variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.
- options
Options to pass to the transformation.
- runtimeOptions
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
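A minimal sketch of the inline Scala variant, assuming fnTransformType is (SparkSession, Map[String, String], DataFrame, String) => DataFrame, i.e. the same shape as the transform method of CustomDfTransformer:

val scalaTransform = CustomDfTransformerConfig(
  scalaCode = Some(
    """
      |import org.apache.spark.sql.{DataFrame, SparkSession}
      |// the code string must evaluate to a function of type fnTransformType
      |(session: SparkSession, options: Map[String, String], df: DataFrame, dataObjectId: String) => {
      |  df.dropDuplicates()
      |}
      |""".stripMargin)
)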
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##(): Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- val className: Option[String]
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native() @HotSpotIntrinsicCandidate()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- val impl: DfTransformer
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- val options: Option[Map[String, String]]
- val pythonCode: Option[String]
- val pythonFile: Option[String]
- val runtimeOptions: Option[Map[String, String]]
- val scalaCode: Option[String]
- val scalaFile: Option[String]
- val sqlCode: Option[String]
- final def synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- CustomDfTransformerConfig → AnyRef → Any
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] ) @Deprecated
- Deprecated