case class PythonCodeDfTransformer(name: String = "pythonTransform", description: Option[String] = None, code: Option[String] = None, file: Option[String] = None, options: Map[String, String] = Map(), runtimeOptions: Map[String, String] = Map()) extends OptionsDfTransformer with Product with Serializable

Configuration of a custom Spark DataFrame transformation between one input and one output (1:1), written as Python/PySpark code. This transformer requires an installed Python and PySpark environment. The PySpark session is initialized and available through the variables sc, session and sqlContext. Other available variables are:

- inputDf: Input DataFrame
- options: Transformation options as Map[String,String]
- dataObjectId: Id of the input dataObject as String

The output DataFrame must be set with setOutputDf(df).
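As a rough illustration of the contract described above, a script passed through code or file might look like the following sketch. The option name minAmount and the column name amount are hypothetical. The sketch also assumes that inputDf, options, dataObjectId and setOutputDf are visible as globals inside the script and that options supports key lookup; the globals check is only there so the file can be loaded outside the transformer.

```python
# Sketch of a script for the `code` or `file` parameter (not the library's own example).
# Assumption: the transformer injects inputDf, options, dataObjectId and
# setOutputDf into the script's global namespace.

def transform(df, opts, data_object_id):
    """Keep rows at or above a threshold and tag them with the input dataObject id."""
    from pyspark.sql import functions as F  # imported lazily; requires PySpark

    min_amount = float(opts["minAmount"])  # hypothetical option; option values are strings
    return (
        df.filter(F.col("amount") >= min_amount)  # "amount" is a hypothetical column
        .withColumn("source", F.lit(data_object_id))
    )

# Runs only where the injected names exist, i.e. inside the transformer.
if "inputDf" in globals():
    setOutputDf(transform(inputDf, options, dataObjectId))
```

Keeping the logic in a plain function makes it possible to exercise it against a local DataFrame before wiring it into a pipeline.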

name

name of the transformer

description

Optional description of the transformer

code

Optional Python code to use for the Python transformation. The Python code can use the variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.

file

Optional file with python code to use for python transformation. The python code can use variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.

options

Options to pass to the transformation

runtimeOptions

Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
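Because options is a Map[String, String], every value reaches the Python script as a string, including values that originate from runtimeOptions. The helper below is a minimal sketch of converting such values into typed settings. The option names minAmount, dedupe and columns are hypothetical, and the sketch assumes the options object offers a dict-like get method.

```python
def typed_options(options):
    """Convert string-valued options into the types a script needs."""
    return {
        # Numeric option with a default when the key is absent.
        "min_amount": float(options.get("minAmount", "0")),
        # Boolean flag, compared case-insensitively.
        "dedupe": options.get("dedupe", "false").lower() == "true",
        # Comma-separated list, ignoring blanks and surrounding whitespace.
        "columns": [c.strip() for c in options.get("columns", "").split(",") if c.strip()],
    }

# Example with a plain dict standing in for the injected options.
opts = typed_options({"minAmount": "10.5", "dedupe": "TRUE", "columns": "id, amount"})
print(opts)
```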

Annotations
@Scaladoc()
Linear Supertypes
Serializable, Serializable, Product, Equals, OptionsDfTransformer, ParsableDfTransformer, ParsableFromConfig[ParsableDfTransformer], DfTransformer, PartitionValueTransformer, AnyRef, Any

Instance Constructors

  1. new PythonCodeDfTransformer(name: String = "pythonTransform", description: Option[String] = None, code: Option[String] = None, file: Option[String] = None, options: Map[String, String] = Map(), runtimeOptions: Map[String, String] = Map())

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  6. val code: Option[String]
  7. val description: Option[String]
    Definition Classes
    PythonCodeDfTransformer → DfTransformer
  8. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  9. def factory: FromConfigFactory[ParsableDfTransformer]

    Returns the factory that can parse this type (that is, type CO).

    Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.

    returns

    the factory (object) for this class.

    Definition Classes
    PythonCodeDfTransformer → ParsableFromConfig
  10. val file: Option[String]
  11. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  12. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  13. val name: String
    Definition Classes
    PythonCodeDfTransformer → DfTransformer
  14. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  15. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  16. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  17. val options: Map[String, String]
  18. def prepare(actionId: ActionId)(implicit context: ActionPipelineContext): Unit

    Optional function to implement validations in prepare phase.

    Definition Classes
    DfTransformer
    Annotations
    @Scaladoc()
  19. val runtimeOptions: Map[String, String]
  20. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  21. def transform(actionId: ActionId, partitionValues: Seq[PartitionValues], df: DataFrame, dataObjectId: DataObjectId)(implicit context: ActionPipelineContext): DataFrame

    Function to be implemented to define the transformation between an input and output DataFrame (1:1)

    Definition Classes
    OptionsDfTransformer → DfTransformer
  22. def transformPartitionValues(actionId: ActionId, partitionValues: Seq[PartitionValues])(implicit context: ActionPipelineContext): Option[Map[PartitionValues, PartitionValues]]

    Optional function to define the transformation of input to output partition values. For example, this makes it possible to implement aggregations where multiple input partitions are combined into one output partition. Note that by default the output partition values equal the input partition values, which should be correct for most use cases.

    actionId

    id of the action which executes this transformation. This is mainly used to prefix error messages.

    partitionValues

    partition values to transform

    returns

    Map of input to output partition values. This allows partition values to be mapped forward and backward, which is needed in execution modes. Return None if the mapping is 1:1.

    Definition Classes
    OptionsDfTransformer → PartitionValueTransformer
  23. def transformPartitionValuesWithOptions(actionId: ActionId, partitionValues: Seq[PartitionValues], options: Map[String, String])(implicit context: ActionPipelineContext): Option[Map[PartitionValues, PartitionValues]]

    Optional function to define the transformation of input to output partition values. For example, this makes it possible to implement aggregations where multiple input partitions are combined into one output partition. Note that by default the output partition values equal the input partition values, which should be correct for most use cases.

    options

    Options specified in the configuration for this transformation, including evaluated runtimeOptions

    Definition Classes
    OptionsDfTransformer
    Annotations
    @Scaladoc()
  24. def transformWithOptions(actionId: ActionId, partitionValues: Seq[PartitionValues], df: DataFrame, dataObjectId: DataObjectId, options: Map[String, String])(implicit context: ActionPipelineContext): DataFrame

    Function to be implemented to define the transformation between an input and output DataFrame (1:1)

    options

    Options specified in the configuration for this transformation, including evaluated runtimeOptions

    Definition Classes
    PythonCodeDfTransformer → OptionsDfTransformer
  25. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  27. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
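The input-to-output partition mapping described for transformPartitionValues can be pictured with a small language-neutral sketch. It is written in Python for brevity and does not use the library's PartitionValues type: partitions are represented as plain strings, and the daily-to-monthly aggregation is a hypothetical example of several input partitions feeding one output partition.

```python
from collections import defaultdict

def forward_mapping(days):
    """Map each daily input partition (YYYYMMDD) to its monthly output partition (YYYYMM)."""
    return {day: day[:6] for day in days}

def backward_mapping(forward):
    """Invert the forward mapping: output partition -> contributing input partitions."""
    backward = defaultdict(list)
    for day, month in forward.items():
        backward[month].append(day)
    return dict(backward)

days = ["20240130", "20240131", "20240201"]
fwd = forward_mapping(days)
bwd = backward_mapping(fwd)
print(fwd)
print(bwd)
```

Holding the mapping as input-to-output pairs is what makes both directions recoverable: the forward lookup is direct, and the backward lookup is a grouping of the same pairs.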

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
