io.smartdatalake.workflow.action.sparktransformer
ScalaClassDfsTransformer
Companion object ScalaClassDfsTransformer
case class ScalaClassDfsTransformer(name: String = "scalaTransform", description: Option[String] = None, className: String, options: Map[String, String] = Map(), runtimeOptions: Map[String, String] = Map()) extends OptionsDfsTransformer with Product with Serializable
Configuration of a custom Spark-DataFrame transformation between many inputs and many outputs (n:m). Define a transform function which receives a map of input DataObjectIds with DataFrames and a map of options, and has to return a map of output DataObjectIds with DataFrames; see also trait CustomDfsTransformer.
- name
Name of the transformer.
- description
Optional description of the transformer.
- className
Class name implementing trait CustomDfsTransformer.
- options
Options to pass to the transformation.
- runtimeOptions
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
- Annotations
- @Scaladoc()
Linear Supertypes
- ScalaClassDfsTransformer
- Serializable
- Serializable
- Product
- Equals
- OptionsDfsTransformer
- ParsableDfsTransformer
- ParsableFromConfig
- DfsTransformer
- PartitionValueTransformer
- AnyRef
- Any
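The n:m contract described above can be sketched in plain Scala. The following is an illustrative, self-contained sketch only: `DataFrame` is stubbed as a simple map of columns to values rather than the real `org.apache.spark.sql.DataFrame`, and the `threshold` option name is hypothetical.

```scala
// Illustrative sketch of the n:m transform contract of CustomDfsTransformer.
// DataFrame is stubbed as Map[column -> values] so the sketch is self-contained;
// in Smart Data Lake it would be org.apache.spark.sql.DataFrame.
object DfsTransformSketch {
  type DataFrame = Map[String, Seq[Int]]

  // Receives a map of input DataObjectIds to DataFrames and a map of options;
  // has to return a map of output DataObjectIds to DataFrames.
  def transform(options: Map[String, String], dfs: Map[String, DataFrame]): Map[String, DataFrame] = {
    val threshold = options.getOrElse("threshold", "0").toInt
    dfs.map { case (id, df) =>
      // one output per input here; a real transformer may join, split or drop inputs
      s"$id-out" -> df.map { case (col, values) => col -> values.filter(_ > threshold) }
    }
  }
}
```

For example, `DfsTransformSketch.transform(Map("threshold" -> "1"), Map("src" -> Map("x" -> Seq(1, 2, 3))))` returns `Map("src-out" -> Map("x" -> Seq(2, 3)))`.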
Instance Constructors
- new ScalaClassDfsTransformer(name: String = "scalaTransform", description: Option[String] = None, className: String, options: Map[String, String] = Map(), runtimeOptions: Map[String, String] = Map())
- name
Name of the transformer.
- description
Optional description of the transformer.
- className
Class name implementing trait CustomDfsTransformer.
- options
Options to pass to the transformation.
- runtimeOptions
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
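As a rough illustration of how these constructor parameters surface in configuration, a transformer of this type might appear in a Smart Data Lake HOCON file roughly as follows. This is a hedged sketch: the action name, data object ids, the implementing class `com.example.MyDfsTransformer`, the `threshold` option, and the `runStartTime` expression are all hypothetical.

```hocon
actions {
  myAction {
    type = CustomSparkAction
    inputIds = [src1, src2]
    outputIds = [out1]
    transformers = [{
      type = ScalaClassDfsTransformer
      # hypothetical class implementing trait CustomDfsTransformer
      className = com.example.MyDfsTransformer
      options = { threshold = "10" }
      # evaluated as a Spark SQL expression against DefaultExpressionData
      runtimeOptions = { runDate = "runStartTime" }
    }]
  }
}
```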
Value Members
- val className: String
- val description: Option[String]
- Definition Classes
- ScalaClassDfsTransformer → DfsTransformer
- def factory: FromConfigFactory[ParsableDfsTransformer]
Returns the factory that can parse this type (that is, type CO). Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
- returns
the factory (object) for this class.
- Definition Classes
- ScalaClassDfsTransformer → ParsableFromConfig
- val name: String
- Definition Classes
- ScalaClassDfsTransformer → DfsTransformer
- val options: Map[String, String]
- Definition Classes
- ScalaClassDfsTransformer → OptionsDfsTransformer
- def prepare(actionId: ActionId)(implicit context: ActionPipelineContext): Unit
Optional function to implement validations in the prepare phase.
- Definition Classes
- DfsTransformer
- Annotations
- @Scaladoc()
- val runtimeOptions: Map[String, String]
- Definition Classes
- ScalaClassDfsTransformer → OptionsDfsTransformer
- def transform(actionId: ActionId, partitionValues: Seq[PartitionValues], dfs: Map[String, DataFrame])(implicit context: ActionPipelineContext): Map[String, DataFrame]
Function to be implemented to define the transformation between many inputs and many outputs (n:m).
- actionId
id of the action which executes this transformation. This is mainly used to prefix error messages.
- partitionValues
partition values to transform
- dfs
Map of (dataObjectId, DataFrame) tuples available as input
- returns
Map of transformed (dataObjectId, DataFrame) tuples
- Definition Classes
- OptionsDfsTransformer → DfsTransformer
- def transformPartitionValues(actionId: ActionId, partitionValues: Seq[PartitionValues])(implicit context: ActionPipelineContext): Option[Map[PartitionValues, PartitionValues]]
Optional function to define the transformation of input to output partition values. For example, this enables implementing aggregations where multiple input partitions are combined into one output partition. Note that the default is input = output partition values, which is correct for most use cases.
- actionId
id of the action which executes this transformation. This is mainly used to prefix error messages.
- partitionValues
partition values to transform
- returns
Map of input to output partition values. This allows to map partition values forward and backward, which is needed in execution modes. Return None if mapping is 1:1.
- Definition Classes
- OptionsDfsTransformer → PartitionValueTransformer
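The partition-value mapping described above can be sketched in plain Scala. This is an illustrative, self-contained sketch: `PartitionValues` is stubbed as `Map[String, String]` rather than the real library type, and the `day`/`month` partition columns are hypothetical; it shows several daily input partitions aggregating into one monthly output partition.

```scala
// Illustrative sketch of transformPartitionValues: multiple daily input
// partitions map to one monthly output partition (an n:1 aggregation).
// PartitionValues is stubbed as Map[String, String]; the real type comes
// from the Smart Data Lake library.
object PartitionValuesSketch {
  type PartitionValues = Map[String, String]

  // Returns a mapping of input to output partition values;
  // returning None would mean the mapping is 1:1.
  def transformPartitionValues(partitionValues: Seq[PartitionValues]): Option[Map[PartitionValues, PartitionValues]] =
    Some(partitionValues.map { pv =>
      // "2024-05-01".take(7) == "2024-05": several days share one month
      pv -> Map("month" -> pv("day").take(7))
    }.toMap)
}
```

This forward-and-backward mapping is what execution modes use to relate input and output partitions.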
- def transformPartitionValuesWithOptions(actionId: ActionId, partitionValues: Seq[PartitionValues], options: Map[String, String])(implicit context: ActionPipelineContext): Option[Map[PartitionValues, PartitionValues]]
Optional function to define the transformation of input to output partition values. For example, this enables implementing aggregations where multiple input partitions are combined into one output partition. Note that the default is input = output partition values, which is correct for most use cases. See also DfsTransformer.transformPartitionValues().
- options
Options specified in the configuration for this transformation, including evaluated runtimeOptions
- Definition Classes
- ScalaClassDfsTransformer → OptionsDfsTransformer
- def transformWithOptions(actionId: ActionId, partitionValues: Seq[PartitionValues], dfs: Map[String, DataFrame], options: Map[String, String])(implicit context: ActionPipelineContext): Map[String, DataFrame]
Function to be implemented to define the transformation between many inputs and many outputs (n:m); see also DfsTransformer.transform().
- options
Options specified in the configuration for this transformation, including evaluated runtimeOptions
- Definition Classes
- ScalaClassDfsTransformer → OptionsDfsTransformer