io.smartdatalake.workflow.action.sparktransformer
SQLDfTransformer
Companion object SQLDfTransformer
case class SQLDfTransformer(name: String = "sqlTransform", description: Option[String] = None, code: String, options: Map[String, String] = Map(), runtimeOptions: Map[String, String] = Map()) extends OptionsDfTransformer with Product with Serializable
Configuration of a custom Spark-DataFrame transformation between one input and one output (1:1) as SQL code. The input data is available as a temporary view in SQL; the name of the temporary view is the input DataObjectId (special characters are replaced by underscores). A special token '%{inputViewName}' is replaced with the name of the temporary view at runtime.
- name
name of the transformer
- description
Optional description of the transformer
- code
SQL code for the transformation. Use tokens %{<key>} in the SQL code; they are replaced with the corresponding runtimeOptions at runtime. Example: "select * from test where run = %{runId}". A special token %{inputViewName} can be used to insert the temporary view name.
- options
Options to pass to the transformation
- runtimeOptions
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
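A minimal usage sketch in Scala, constructing the case class directly. The transformer name, the `runId` runtime option, and the assumption that DefaultExpressionData exposes a `runId` field are illustrative and not taken from this page:

```scala
import io.smartdatalake.workflow.action.sparktransformer.SQLDfTransformer

// Hypothetical sketch: filter the input view by the current run id.
// %{inputViewName} is replaced with the temporary view name at runtime;
// %{runId} is filled from runtimeOptions, whose value is a Spark SQL
// expression evaluated against DefaultExpressionData (assumed here to
// expose a runId field).
val transformer = SQLDfTransformer(
  name = "filterByRun",
  code = "select * from %{inputViewName} where run = %{runId}",
  runtimeOptions = Map("runId" -> "runId")
)
```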
- Annotations
- @Scaladoc()
- SQLDfTransformer
- Serializable
- Serializable
- Product
- Equals
- OptionsDfTransformer
- ParsableDfTransformer
- ParsableFromConfig
- DfTransformer
- PartitionValueTransformer
- AnyRef
- Any
Instance Constructors
-
new
SQLDfTransformer(name: String = "sqlTransform", description: Option[String] = None, code: String, options: Map[String, String] = Map(), runtimeOptions: Map[String, String] = Map())
- name
name of the transformer
- description
Optional description of the transformer
- code
SQL code for the transformation. Use tokens %{<key>} in the SQL code; they are replaced with the corresponding runtimeOptions at runtime. Example: "select * from test where run = %{runId}". A special token %{inputViewName} can be used to insert the temporary view name.
- options
Options to pass to the transformation
- runtimeOptions
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native() @HotSpotIntrinsicCandidate()
- val code: String
-
val
description: Option[String]
- Definition Classes
- SQLDfTransformer → DfTransformer
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
factory: FromConfigFactory[ParsableDfTransformer]
Returns the factory that can parse this type (that is, type CO). Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
- returns
the factory (object) for this class.
- Definition Classes
- SQLDfTransformer → ParsableFromConfig
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
val
name: String
- Definition Classes
- SQLDfTransformer → DfTransformer
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
val
options: Map[String, String]
- Definition Classes
- SQLDfTransformer → OptionsDfTransformer
-
def
prepare(actionId: ActionId)(implicit context: ActionPipelineContext): Unit
Optional function to implement validations in prepare phase.
- Definition Classes
- DfTransformer
- Annotations
- @Scaladoc()
-
val
runtimeOptions: Map[String, String]
- Definition Classes
- SQLDfTransformer → OptionsDfTransformer
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
transform(actionId: ActionId, partitionValues: Seq[PartitionValues], df: DataFrame, dataObjectId: DataObjectId)(implicit context: ActionPipelineContext): DataFrame
Function to be implemented to define the transformation between an input and output DataFrame (1:1).
- Definition Classes
- OptionsDfTransformer → DfTransformer
-
def
transformPartitionValues(actionId: ActionId, partitionValues: Seq[PartitionValues])(implicit context: ActionPipelineContext): Option[Map[PartitionValues, PartitionValues]]
Optional function to define the transformation of input to output partition values. For example, this enables implementing aggregations where multiple input partitions are combined into one output partition. Note that the default is input = output partition values, which should be correct for most use cases.
- actionId
id of the action which executes this transformation. This is mainly used to prefix error messages.
- partitionValues
partition values to transform
- returns
Map of input to output partition values. This allows mapping partition values forward and backward, which is needed in certain execution modes. Return None if the mapping is 1:1.
- Definition Classes
- OptionsDfTransformer → PartitionValueTransformer
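As an illustration of the return shape, a hypothetical 1:n aggregation that combines monthly input partitions into one yearly output partition could be expressed as below. The import path and the assumption that `PartitionValues` wraps a simple key/value map are illustrative, not confirmed by this page:

```scala
import io.smartdatalake.util.hdfs.PartitionValues

// Hypothetical sketch: three monthly input partitions all map to the
// same yearly output partition, so execution modes can translate
// partition values forward and backward between input and output.
val mapping: Option[Map[PartitionValues, PartitionValues]] = Some(Map(
  PartitionValues(Map("month" -> "2021-01")) -> PartitionValues(Map("year" -> "2021")),
  PartitionValues(Map("month" -> "2021-02")) -> PartitionValues(Map("year" -> "2021")),
  PartitionValues(Map("month" -> "2021-03")) -> PartitionValues(Map("year" -> "2021"))
))
```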
-
def
transformPartitionValuesWithOptions(actionId: ActionId, partitionValues: Seq[PartitionValues], options: Map[String, String])(implicit context: ActionPipelineContext): Option[Map[PartitionValues, PartitionValues]]
Optional function to define the transformation of input to output partition values. For example, this enables implementing aggregations where multiple input partitions are combined into one output partition. Note that the default is input = output partition values, which should be correct for most use cases.
- options
Options specified in the configuration for this transformation, including evaluated runtimeOptions
- Definition Classes
- OptionsDfTransformer
- Annotations
- @Scaladoc()
-
def
transformWithOptions(actionId: ActionId, partitionValues: Seq[PartitionValues], df: DataFrame, dataObjectId: DataObjectId, options: Map[String, String])(implicit context: ActionPipelineContext): DataFrame
Function to be implemented to define the transformation between an input and output DataFrame (1:1).
- options
Options specified in the configuration for this transformation, including evaluated runtimeOptions
- Definition Classes
- SQLDfTransformer → OptionsDfTransformer
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
Deprecated Value Members
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] ) @Deprecated
- Deprecated