package app
Type Members
class DefaultSmartDataLakeBuilder extends SmartDataLakeBuilder
Default Smart Data Lake Command Line Application.
Implementation Note: This must be a class and not an object in order to be found by reflection in DatabricksSmartDataLakeBuilder
- Annotations
- @Scaladoc()
case class GlobalConfig(kryoClasses: Option[Seq[String]] = None, sparkOptions: Option[Map[String, String]] = None, statusInfo: Option[StatusInfoConfig] = None, enableHive: Boolean = true, memoryLogTimer: Option[MemoryLogTimerConfig] = None, shutdownHookLogger: Boolean = false, stateListeners: Seq[StateListenerConfig] = Seq(), sparkUDFs: Option[Map[String, SparkUDFCreatorConfig]] = None, pythonUDFs: Option[Map[String, PythonUDFCreatorConfig]] = None, secretProviders: Option[Map[String, SecretProviderConfig]] = None, allowOverwriteAllPartitionsWithoutPartitionValues: Seq[DataObjectId] = Seq(), synchronousStreamingTriggerIntervalSec: Int = 60) extends SmartDataLakeLogger with Product with Serializable
Global configuration options
Note that the global configuration is responsible for holding the SparkSession, so that it is created once and only once per SDLB job. This is especially important if the JVM is shared between different SDL jobs (e.g. on a Databricks cluster), because a SparkSession shared through the object Environment would survive beyond the current SDLB job.
- kryoClasses: classes to register for Spark kryo serialization
- sparkOptions: Spark options
- statusInfo: enable a REST API providing live status info; see StatusInfoConfig for detailed configuration
- enableHive: enable Hive for the Spark session
- memoryLogTimer: enable periodic memory usage logging; see MemoryLogTimerConfig for detailed configuration
- shutdownHookLogger: enable a shutdown hook logger to trace the shutdown cause
- stateListeners: state listeners to be registered for receiving events of the execution of a SmartDataLake job
- sparkUDFs: UDFs to be registered in the Spark session. The registered UDFs are available in Spark SQL transformations and in expression evaluation, e.g. in the configuration of ExecutionModes.
- pythonUDFs: UDFs in Python to be registered in the Spark session. The registered UDFs are available in Spark SQL transformations but not in expression evaluation.
- secretProviders: SecretProviders to be registered
- allowOverwriteAllPartitionsWithoutPartitionValues: a list of exceptions for partitioned DataObject ids which are allowed to overwrite all partitions of a table if no partition values are set. This is used to override/avoid the protective error raised when using SDLSaveMode.OverwriteOptimized or OverwritePreserveDirectories. Define it as a list of DataObject ids.
- synchronousStreamingTriggerIntervalSec: trigger interval in seconds for synchronous actions in streaming mode (default = 60 seconds). The synchronous actions of the DAG are executed with this interval if possible. Note that asynchronous actions have separate settings, e.g. SparkStreamingMode.triggerInterval.
- Annotations
- @Scaladoc()
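In a HOCON configuration file these options are set in the global section. A minimal sketch (the class name under kryo-classes and the option values are illustrative assumptions; hyphenated HOCON keys map to the camelCase fields documented above):

```hocon
global {
  # illustrative custom class registered for kryo serialization
  kryo-classes = ["com.example.MyCustomClass"]
  spark-options {
    "spark.sql.shuffle.partitions" = 10
  }
  enable-hive = true
  # default is 60 seconds, shown here explicitly
  synchronous-streaming-trigger-interval-sec = 60
}
```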
case class MemoryLogTimerConfig(intervalSec: Int, logLinuxMem: Boolean = true, logLinuxCGroupMem: Boolean = false, logBuffers: Boolean = false) extends Product with Serializable
Configuration for periodic memory usage logging
- intervalSec: interval in seconds between memory usage logs
- logLinuxMem: enable logging of Linux memory
- logLinuxCGroupMem: enable logging details about Linux cgroup memory
- logBuffers: enable logging details about the different JVM buffers
- Annotations
- @Scaladoc()
trait ModulePlugin extends AnyRef
Hooks for modules to interact with sdl-core
- Annotations
- @Scaladoc()
trait SDLPlugin extends AnyRef
SDL Plugin defines an interface to execute custom code on SDL startup and shutdown. Configure it by setting the Java system property "sdl.pluginClassName" to the name of a class implementing the SDLPlugin interface. The class needs a constructor without parameters.
- Annotations
- @Scaladoc() @DeveloperApi()
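A minimal sketch of implementing the plugin hook, assuming SDLPlugin exposes parameterless startup/shutdown methods (hypothetical method names; check the sdl-core sources for the exact interface). The trait is re-declared here only to keep the sketch self-contained:

```scala
// Stand-in for the real interface from sdl-core (hypothetical method names).
trait SDLPlugin {
  def startup(): Unit = ()
  def shutdown(): Unit = ()
}

// Example plugin: counts startups and logs the shutdown.
// It has a parameterless constructor, as required for instantiation by reflection.
class CountingPlugin extends SDLPlugin {
  var startups = 0
  override def startup(): Unit = startups += 1
  override def shutdown(): Unit = println("SDL shutting down")
}
```

The plugin would be registered on the JVM command line, e.g. `-Dsdl.pluginClassName=com.example.CountingPlugin` (illustrative package name).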
-
abstract
class
SmartDataLakeBuilder extends SmartDataLakeLogger
Abstract Smart Data Lake Command Line Application.
Abstract Smart Data Lake Command Line Application.
- Annotations
- @Scaladoc()
case class SmartDataLakeBuilderConfig(feedSel: String = null, applicationName: Option[String] = None, configuration: Option[Seq[String]] = None, master: Option[String] = None, deployMode: Option[String] = None, username: Option[String] = None, kerberosDomain: Option[String] = None, keytabPath: Option[File] = None, partitionValues: Option[Seq[PartitionValues]] = None, multiPartitionValues: Option[Seq[PartitionValues]] = None, parallelism: Int = 1, statePath: Option[String] = None, overrideJars: Option[Seq[String]] = None, test: Option[TestMode.Value] = None, streaming: Boolean = false) extends Product with Serializable
This case class represents a default configuration for the App. It is populated by parsing command-line arguments and specifies default values.
- feedSel: expressions to select the actions to execute; see AppUtil.filterActionList() or the command-line help for a syntax description
- applicationName: application name
- configuration: one or multiple configuration files or directories containing configuration files, separated by comma
- master: the Spark master URL passed to SparkContext when in local mode
- deployMode: the Spark deploy mode passed to SparkContext when in local mode
- username: Kerberos username (username@kerberosDomain) for local mode
- kerberosDomain: Kerberos domain (username@kerberosDomain) for local mode
- keytabPath: path to the Kerberos keytab file for local mode
- test: run in test mode:
  - "config": validate configuration
  - "dry-run": execute the "prepare" and "init" phases to check the environment
- Annotations
- @Scaladoc()
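These fields correspond to command-line options of the SDLB application. An invocation sketch (jar name and paths are illustrative assumptions; the option spellings mirror the fields above but should be checked against the --help output of your SDLB version):

```shell
# run the feed "mystage" in dry-run test mode, reading config from ./config/
java -cp smartdatalake.jar io.smartdatalake.app.LocalSmartDataLakeBuilder \
  --feed-sel mystage \
  -c ./config/ \
  --test dry-run
```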
trait StateListener extends AnyRef
Interface to notify interested parties about action results & metrics
- Annotations
- @Scaladoc()
case class StateListenerConfig(className: String, options: Option[Map[String, String]] = None) extends Product with Serializable
Configuration to notify interested parties about action results & metrics
- className: fully qualified class name of a class implementing the StateListener interface. The class needs a constructor with one parameter options: Map[String,String].
- options: options passed to the StateListener constructor
- Annotations
- @Scaladoc()
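A HOCON sketch of registering a state listener via the stateListeners field of the global configuration (the implementation class and option values are hypothetical):

```hocon
global {
  state-listeners = [{
    # hypothetical class implementing the StateListener interface
    class-name = "com.example.MyStateListener"
    # passed to the one-parameter constructor as Map[String,String]
    options = { notifyUrl = "http://localhost:8080/events" }
  }]
}
```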
case class StatusInfoConfig(port: Int = 4440, maxPortRetries: Int = 10, stopOnEnd: Boolean = true) extends Product with Serializable
Configuration for the server that provides live status info of the current DAG execution
- port: port with which the first connection attempt is made
- maxPortRetries: if the port is already in use, the port is incremented by one and the attempt is repeated; maxPortRetries defines how many times this is attempted. If set to 0 no retry is attempted. Values below 0 are not allowed.
- stopOnEnd: set to false if the server should remain online even after SDL has finished its execution; in that case the application needs to be stopped manually. Useful for debugging.
- Annotations
- @Scaladoc()
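The port-retry semantics described above can be sketched as a small self-contained model (this re-declares the case class and is not the actual server code; `tryBind` is a hypothetical stand-in for the real bind attempt):

```scala
// Re-declaration of the documented case class, for a self-contained sketch.
case class StatusInfoConfig(port: Int = 4440, maxPortRetries: Int = 10, stopOnEnd: Boolean = true) {
  require(maxPortRetries >= 0, "values below 0 are not allowed")
}

// Ports tried: the configured port, then up to maxPortRetries increments of one.
def candidatePorts(c: StatusInfoConfig): Seq[Int] =
  c.port to (c.port + c.maxPortRetries)

// First port for which the (hypothetical) bind attempt succeeds, if any.
def bindFirstFree(c: StatusInfoConfig)(tryBind: Int => Boolean): Option[Int] =
  candidatePorts(c).find(tryBind)
```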
Value Members
object DatabricksSmartDataLakeBuilder extends SmartDataLakeBuilder
Databricks Smart Data Lake Command Line Application.
As there is an old version of config-*.jar deployed on Databricks, this special App uses a ChildFirstClassLoader to override it in the classpath.
- object DefaultSmartDataLakeBuilder
- object GlobalConfig extends ConfigImplicits with Serializable
object LocalSmartDataLakeBuilder extends SmartDataLakeBuilder
Smart Data Lake Builder application for local mode.
Sets master to local[*] and deployMode to client by default.
- object ModulePlugin
- object TestMode extends Enumeration