abstract class ActionSubFeedsImpl[S <: SubFeed] extends Action
Generic implementation of SubFeed handling that supports many input and output SubFeeds.
- S
SubFeed type this Action is designed for.
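As a hedged illustration of this contract, the following sketch uses simplified stand-in types (`SubFeedLike`, `MySubFeed`, and a reduced `exec` are invented here; the real members take an implicit `ActionPipelineContext` and live in SDLB). A subclass mainly supplies `transform` and `writeSubFeed`, while the base class orchestrates execution:

```scala
// Hypothetical sketch of the ActionSubFeedsImpl contract with stand-in types.
trait SubFeedLike { def dataObjectId: String }
case class MySubFeed(dataObjectId: String, rows: Seq[String]) extends SubFeedLike

abstract class ActionSubFeedsSketch[S <: SubFeedLike] {
  // to be implemented by subclass: transform subfeed content
  protected def transform(inputSubFeeds: Seq[S], outputSubFeeds: Seq[S]): Seq[S]
  // to be implemented by subclass: write one subfeed, return true if data was written
  protected def writeSubFeed(subFeed: S, isRecursive: Boolean): Boolean
  // base class orchestration (greatly simplified): transform, then write
  final def exec(subFeeds: Seq[S]): Seq[S] = {
    val transformed = transform(subFeeds, Seq.empty)
    transformed.foreach(sf => writeSubFeed(sf, isRecursive = false))
    transformed
  }
}

// example subclass: uppercases all rows
class UppercaseAction extends ActionSubFeedsSketch[MySubFeed] {
  protected def transform(in: Seq[MySubFeed], out: Seq[MySubFeed]): Seq[MySubFeed] =
    in.map(sf => sf.copy(rows = sf.rows.map(_.toUpperCase)))
  protected def writeSubFeed(sf: MySubFeed, isRecursive: Boolean): Boolean =
    sf.rows.nonEmpty
}

val result = new UppercaseAction().exec(Seq(MySubFeed("src", Seq("a", "b"))))
println(result.head.rows) // List(A, B)
```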
- Inheritance
- ActionSubFeedsImpl
- Action
- AtlasExportable
- SmartDataLakeLogger
- DAGNode
- ParsableFromConfig
- SdlConfigObject
- AnyRef
- Any
Instance Constructors
- new ActionSubFeedsImpl()(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[S])
Abstract Value Members
- abstract def executionCondition: Option[Condition]
Execution condition for this action.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- abstract def executionMode: Option[ExecutionMode]
Execution mode for this action.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- abstract def factory: FromConfigFactory[Action]
Returns the factory that can parse this type (that is, type CO).
Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
- returns
the factory (object) for this class.
- Definition Classes
- ParsableFromConfig
- Annotations
- @Scaladoc()
- abstract val id: ActionId
A unique identifier for this instance.
- Definition Classes
- Action → SdlConfigObject
- Annotations
- @Scaladoc()
- abstract def inputs: Seq[DataObject]
Input DataObjects. To be implemented by subclasses.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- abstract def metadata: Option[ActionMetadata]
Additional metadata for the Action.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- abstract def metricsFailCondition: Option[String]
Spark SQL condition evaluated as where-clause against dataframe of metrics. Available columns are dataObjectId, key, value. If there are any rows passing the where clause, a MetricCheckFailed exception is thrown.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
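A hedged example of such a condition string (the metric key name `records_written` is an assumption; actual metric keys depend on the DataObject type):

```scala
// Hypothetical metricsFailCondition: fail the action if any DataObject
// reports zero written records. "records_written" is an assumed key name.
val metricsFailCondition: Option[String] =
  Some("key = 'records_written' and value = 0")
println(metricsFailCondition.isDefined) // true
```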
- abstract def outputs: Seq[DataObject]
Output DataObjects. To be implemented by subclasses.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- abstract def transform(inputSubFeeds: Seq[S], outputSubFeeds: Seq[S])(implicit context: ActionPipelineContext): Seq[S]
Transform subfeed content. To be implemented by subclass.
- Attributes
- protected
- Annotations
- @Scaladoc()
- abstract def writeSubFeed(subFeed: S, isRecursive: Boolean)(implicit context: ActionPipelineContext): WriteSubFeedResult
Write subfeed data to output. To be implemented by subclass.
- isRecursive
If subfeed is recursive (input & output)
- returns
false if there was no data to process, otherwise true.
- Attributes
- protected
- Annotations
- @Scaladoc()
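A hedged override sketch, with `WriteSubFeedResult` and `FileSubFeed` stubbed as minimal case classes (the real types carry more information and the real method takes an implicit `ActionPipelineContext`):

```scala
// Stand-in types for illustration only.
case class WriteSubFeedResult(noData: Option[Boolean])
case class FileSubFeed(files: Seq[String])

// Hypothetical writeSubFeed: "write" each file and report via noData
// whether there was anything to process.
def writeSubFeed(subFeed: FileSubFeed, isRecursive: Boolean): WriteSubFeedResult = {
  subFeed.files.foreach(f => println(s"writing $f"))
  WriteSubFeedResult(noData = Some(subFeed.files.isEmpty))
}

val res = writeSubFeed(FileSubFeed(Seq.empty), isRecursive = false)
println(res.noData) // Some(true)
```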
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##(): Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- def addRuntimeEvent(executionId: ExecutionId, phase: ExecutionPhase, state: RuntimeEventState, msg: Option[String] = None, results: Seq[SubFeed] = Seq(), tstmp: LocalDateTime = LocalDateTime.now): Unit
Adds a runtime event for this Action.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def addRuntimeMetrics(executionId: Option[ExecutionId], dataObjectId: Option[DataObjectId], metric: ActionMetrics): Unit
Adds a runtime metric for this Action.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def applyExecutionMode(mainInput: DataObject, mainOutput: DataObject, subFeed: SubFeed, partitionValuesTransform: (Seq[PartitionValues]) ⇒ Map[PartitionValues, PartitionValues])(implicit context: ActionPipelineContext): Unit
Applies the executionMode and stores the result in the executionModeResult variable.
- Attributes
- protected
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def atlasName: String
- Definition Classes
- Action → AtlasExportable
- def atlasQualifiedName(prefix: String): String
- Definition Classes
- AtlasExportable
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native() @HotSpotIntrinsicCandidate()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def exec(subFeeds: Seq[SubFeed])(implicit context: ActionPipelineContext): Seq[SubFeed]
Executes the main task of an action. In this step the data of the SubFeeds is moved from input to output DataObjects.
- subFeeds
SparkSubFeeds to be processed
- returns
processed SparkSubFeeds
- Definition Classes
- ActionSubFeedsImpl → Action
- val executionConditionResult: Option[(Boolean, Option[String])]
- Attributes
- protected
- Definition Classes
- Action
- val executionModeResult: Option[Try[Option[ExecutionModeResult]]]
- Attributes
- protected
- Definition Classes
- Action
- final def getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- def getDataObjectsState: Seq[DataObjectState]
Get potential state of input DataObjects when executionMode is DataObjectStateIncrementalMode.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def getInputDataObject[T <: DataObject](id: DataObjectId)(implicit arg0: ClassTag[T], arg1: scala.reflect.api.JavaUniverse.TypeTag[T], registry: InstanceRegistry): T
- Attributes
- protected
- Definition Classes
- Action
- def getLatestRuntimeEventState: Option[RuntimeEventState]
Get the latest runtime state.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def getMainInput(inputSubFeeds: Seq[SubFeed])(implicit context: ActionPipelineContext): DataObject
- Attributes
- protected
- def getMainPartitionValues(inputSubFeeds: Seq[SubFeed])(implicit context: ActionPipelineContext): Seq[PartitionValues]
- Attributes
- protected
- def getOutputDataObject[T <: DataObject](id: DataObjectId)(implicit arg0: ClassTag[T], arg1: scala.reflect.api.JavaUniverse.TypeTag[T], registry: InstanceRegistry): T
- Attributes
- protected
- Definition Classes
- Action
- def getRuntimeDataImpl: RuntimeData
- Attributes
- protected
- Definition Classes
- Action
- def getRuntimeInfo(executionId: Option[ExecutionId] = None): Option[RuntimeInfo]
Get summarized runtime information for a given ExecutionId.
- executionId
ExecutionId to get runtime information for. If empty, runtime information for the last ExecutionId is returned.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def getRuntimeMetrics(executionId: Option[ExecutionId] = None): Map[DataObjectId, Option[ActionMetrics]]
Get the latest metrics for all DataObjects and a given SDLExecutionId.
- executionId
ExecutionId to get metrics for. If empty, metrics for the last ExecutionId are returned.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- final def init(subFeeds: Seq[SubFeed])(implicit context: ActionPipelineContext): Seq[SubFeed]
Initialize Action with SubFeeds to be processed. In this step the execution mode is evaluated and the result stored for the exec phase. If successful:
- the DAG can be built
- Spark DataFrame lineage can be built
- subFeeds
SparkSubFeeds to be processed
- returns
processed SparkSubFeeds
- Definition Classes
- ActionSubFeedsImpl → Action
- def inputIdsToIgnoreFilter: Seq[DataObjectId]
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def logWritingFinished(subFeed: S, noData: Option[Boolean], duration: Duration)(implicit context: ActionPipelineContext): Unit
- Attributes
- protected
- def logWritingStarted(subFeed: S)(implicit context: ActionPipelineContext): Unit
- Attributes
- protected
- lazy val logger: Logger
- Attributes
- protected
- Definition Classes
- SmartDataLakeLogger
- Annotations
- @transient()
- def mainInputId: Option[DataObjectId]
- lazy val mainOutput: DataObject
- Attributes
- protected
- def mainOutputId: Option[DataObjectId]
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def nodeId: String
Provide an implementation of the DAG node id.
- Definition Classes
- Action → DAGNode
- Annotations
- @Scaladoc()
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
- def postExec(inputSubFeeds: Seq[SubFeed], outputSubFeeds: Seq[SubFeed])(implicit context: ActionPipelineContext): Unit
Executes operations needed after executing an action. In this step any task on input or output DataObjects needed after the main task is executed, e.g. JdbcTableDataObject's postWriteSql or CopyAction's deleteInputData.
- Definition Classes
- ActionSubFeedsImpl → Action
- def postExecFailed(implicit context: ActionPipelineContext): Unit
Executes operations needed to clean up after executing an action failed.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def postprocessOutputSubFeedCustomized(subFeed: S)(implicit context: ActionPipelineContext): S
Implement additional processing logic for SubFeeds after transformation. Can be implemented by subclass.
- Attributes
- protected
- Annotations
- @Scaladoc()
- def postprocessOutputSubFeeds(subFeeds: Seq[S])(implicit context: ActionPipelineContext): Seq[S]
- def preExec(subFeeds: Seq[SubFeed])(implicit context: ActionPipelineContext): Unit
Executes operations needed before executing an action. In this step any task on input or output DataObjects needed before the main task is executed, e.g. JdbcTableDataObject's preWriteSql.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def preInit(subFeeds: Seq[SubFeed], dataObjectsState: Seq[DataObjectState])(implicit context: ActionPipelineContext): Unit
Checks before initialization of Action. In this step the execution condition is evaluated, and Action init is skipped if the result is false.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- def prepare(implicit context: ActionPipelineContext): Unit
Prepare DataObjects prerequisites. In this step preconditions are prepared & tested:
- connections can be created
- needed structures exist, e.g. Kafka topic or JDBC table
This runs during the "prepare" phase of the DAG.
- Definition Classes
- ActionSubFeedsImpl → Action
- def prepareInputSubFeeds(subFeeds: Seq[SubFeed])(implicit context: ActionPipelineContext): (Seq[S], Seq[S])
- def preprocessInputSubFeedCustomized(subFeed: S, ignoreFilter: Boolean, isRecursive: Boolean)(implicit context: ActionPipelineContext): S
Implement additional preprocess logic for SubFeeds before transformation. Can be implemented by subclass.
- ignoreFilter
If filters should be ignored for this feed
- isRecursive
If subfeed is recursive (input & output)
- Attributes
- protected
- Annotations
- @Scaladoc()
- lazy val prioritizedMainInputCandidates: Seq[DataObject]
- Attributes
- protected
- def recursiveInputs: Seq[DataObject]
Recursive inputs are DataObjects that are used as output and input in the same action. This is usually prohibited as it creates loops in the DAG. In special cases it makes sense, e.g. when building a complex comparison/update logic.
Usage: add DataObjects used as output and input as outputIds and recursiveInputIds, but not as inputIds.
- Definition Classes
- Action
- Annotations
- @Scaladoc()
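A hedged sketch of this usage pattern with stand-in types (`DataObject` is stubbed as a case class, and the id `tgt` is invented; the real Action resolves `recursiveInputIds` against the instance registry):

```scala
// Stand-in DataObject for illustration.
case class DataObject(id: String)

// Hypothetical action state: "tgt" is both written and read again.
val outputs: Seq[DataObject] = Seq(DataObject("tgt"))
val recursiveInputIds: Seq[String] = Seq("tgt")

// recursiveInputs returns the outputs that are also declared as recursive inputs
def recursiveInputs: Seq[DataObject] =
  outputs.filter(o => recursiveInputIds.contains(o.id))

println(recursiveInputs.map(_.id)) // List(tgt)
```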
- def setSparkJobMetadata(operation: Option[String] = None)(implicit context: ActionPipelineContext): Unit
Sets the Spark job description for better traceability in the Spark UI.
Note: This sets Spark local properties, which are propagated to the respective executor tasks. We rely on this to match metrics back to Actions and DataObjects. As writing to a DataObject on the Driver happens uninterrupted in the same exclusive thread, this is suitable.
- operation
phase description (be short...)
- Definition Classes
- Action
- Annotations
- @Scaladoc()
- final def synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
- final def toString(executionId: Option[ExecutionId]): String
- Definition Classes
- Action
- final def toString(): String
This is displayed in the ASCII graph visualization.
- Definition Classes
- Action → AnyRef → Any
- Annotations
- @Scaladoc()
- def toStringMedium: String
- Definition Classes
- Action
- def toStringShort: String
- Definition Classes
- Action
- def transformPartitionValues(partitionValues: Seq[PartitionValues])(implicit context: ActionPipelineContext): Map[PartitionValues, PartitionValues]
Transform partition values. Can be implemented by subclass.
- Attributes
- protected
- Annotations
- @Scaladoc()
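A hedged sketch of such a mapping (`PartitionValues` is stubbed as a minimal case class, and the rename from partition column `dt` to `date` is an invented example):

```scala
// Stand-in for SDLB's PartitionValues.
case class PartitionValues(elements: Map[String, Any])

// Hypothetical override: map partition column "dt" to column "date",
// returning a mapping from original to transformed partition values.
def transformPartitionValues(partitionValues: Seq[PartitionValues]): Map[PartitionValues, PartitionValues] =
  partitionValues.map { pv =>
    pv -> PartitionValues(pv.elements.map { case (k, v) =>
      (if (k == "dt") "date" else k) -> v
    })
  }.toMap

val mapped = transformPartitionValues(Seq(PartitionValues(Map("dt" -> "2024-01-01"))))
println(mapped.values.head.elements.keySet) // Set(date)
```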
- def validateConfig(): Unit
Put configuration validation checks here.
- Definition Classes
- ActionSubFeedsImpl → Action
- Annotations
- @Scaladoc()
- def validatePartitionValuesExisting(dataObject: DataObject with CanHandlePartitions, subFeed: SubFeed)(implicit context: ActionPipelineContext): Unit
- Attributes
- protected
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- def writeOutputSubFeeds(subFeeds: Seq[S])(implicit context: ActionPipelineContext): Unit
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] ) @Deprecated
- Deprecated