case class GlobalConfig(kryoClasses: Option[Seq[String]] = None, sparkOptions: Option[Map[String, String]] = None, statusInfo: Option[StatusInfoConfig] = None, enableHive: Boolean = true, memoryLogTimer: Option[MemoryLogTimerConfig] = None, shutdownHookLogger: Boolean = false, stateListeners: Seq[StateListenerConfig] = Seq(), sparkUDFs: Option[Map[String, SparkUDFCreatorConfig]] = None, pythonUDFs: Option[Map[String, PythonUDFCreatorConfig]] = None, secretProviders: Option[Map[String, SecretProviderConfig]] = None, allowOverwriteAllPartitionsWithoutPartitionValues: Seq[DataObjectId] = Seq(), synchronousStreamingTriggerIntervalSec: Int = 60) extends SmartDataLakeLogger with Product with Serializable
Global configuration options.
Note that GlobalConfig is responsible for holding the SparkSession, so that it is created once and only once per SDLB job. This is especially important if the JVM is shared between different SDLB jobs (e.g. on a Databricks cluster), because a SparkSession shared via the object Environment would survive the current SDLB job.
- kryoClasses
Classes to register for Spark Kryo serialization
- sparkOptions
Spark options
- statusInfo
Enable a REST API providing live status info; see StatusInfoConfig for detailed configuration
- enableHive
Enable Hive support for the Spark session
- memoryLogTimer
Enable periodic memory usage logging; see MemoryLogTimerConfig for detailed configuration
- shutdownHookLogger
Enable a shutdown hook logger to trace the shutdown cause
- stateListeners
Define state listeners to be registered for receiving events during the execution of a SmartDataLake job
- sparkUDFs
Define UDFs to be registered in the Spark session. The registered UDFs are available in Spark SQL transformations and in expression evaluation, e.g. in the configuration of ExecutionModes.
- pythonUDFs
Define Python UDFs to be registered in the Spark session. The registered UDFs are available in Spark SQL transformations, but not for expression evaluation.
- secretProviders
Define SecretProviders to be registered.
- allowOverwriteAllPartitionsWithoutPartitionValues
Configure a list of exceptions for partitioned DataObject ids which are allowed to overwrite all partitions of a table if no partition values are set. This is used to override/avoid the protective error raised when using SDLSaveMode.OverwriteOptimized or OverwritePreserveDirectories. Define it as a list of DataObject ids.
- synchronousStreamingTriggerIntervalSec
Trigger interval in seconds for synchronous actions in streaming mode (default = 60 seconds). The synchronous actions of the DAG will be executed with this interval if possible. Note that asynchronous actions have separate settings, e.g. SparkStreamingMode.triggerInterval.
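As an illustration, these options are usually set in the global section of an SDLB HOCON configuration file. The snippet below is a sketch derived from the parameter names above; the example values (the Kryo class name and the Spark option) are hypothetical:

```hocon
global {
  enableHive = true
  synchronousStreamingTriggerIntervalSec = 30
  # hypothetical class to register for Kryo serialization
  kryoClasses = ["org.example.MyEvent"]
  sparkOptions {
    "spark.sql.shuffle.partitions" = "10"
  }
}
```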
- Inheritance
  - GlobalConfig
  - Serializable
  - Product
  - Equals
  - SmartDataLakeLogger
  - AnyRef
  - Any
Instance Constructors
- new GlobalConfig(kryoClasses: Option[Seq[String]] = None, sparkOptions: Option[Map[String, String]] = None, statusInfo: Option[StatusInfoConfig] = None, enableHive: Boolean = true, memoryLogTimer: Option[MemoryLogTimerConfig] = None, shutdownHookLogger: Boolean = false, stateListeners: Seq[StateListenerConfig] = Seq(), sparkUDFs: Option[Map[String, SparkUDFCreatorConfig]] = None, pythonUDFs: Option[Map[String, PythonUDFCreatorConfig]] = None, secretProviders: Option[Map[String, SecretProviderConfig]] = None, allowOverwriteAllPartitionsWithoutPartitionValues: Seq[DataObjectId] = Seq(), synchronousStreamingTriggerIntervalSec: Int = 60)
Value Members
- val allowOverwriteAllPartitionsWithoutPartitionValues: Seq[DataObjectId]
- val enableHive: Boolean
- def getHadoopConfiguration: Configuration
Get the Hadoop configuration as Spark would see it. This includes potential Hadoop properties defined in sparkOptions.
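The mechanism behind this can be sketched as follows. Spark forwards options prefixed with `spark.hadoop.` into the Hadoop Configuration with the prefix stripped, which is how sparkOptions can carry Hadoop properties. The object and example values below are illustrative, not the SDLB implementation:

```scala
// Illustrative sketch (not the SDLB implementation): extract Hadoop
// properties from Spark options by stripping the "spark.hadoop." prefix.
object HadoopOptionSketch {
  def hadoopProps(sparkOptions: Map[String, String]): Map[String, String] =
    sparkOptions.collect {
      case (k, v) if k.startsWith("spark.hadoop.") =>
        k.stripPrefix("spark.hadoop.") -> v
    }
}

// Example: only the prefixed entry is forwarded, with the prefix removed.
val opts = Map(
  "spark.hadoop.fs.s3a.endpoint" -> "http://localhost:9000",
  "spark.master" -> "local[*]"
)
println(HadoopOptionSketch.hadoopProps(opts))
```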
- def hasSparkSession: Boolean
True if a SparkSession has been created in this job.
- val kryoClasses: Option[Seq[String]]
- @transient protected lazy val logger: Logger
  - Definition Classes: SmartDataLakeLogger
- val memoryLogTimer: Option[MemoryLogTimerConfig]
- val pythonUDFs: Option[Map[String, PythonUDFCreatorConfig]]
- val secretProviders: Option[Map[String, SecretProviderConfig]]
- val shutdownHookLogger: Boolean
- val sparkOptions: Option[Map[String, String]]
- def sparkSession(appName: String, master: Option[String], deployMode: Option[String] = None): SparkSession
Return the SparkSession, creating it if not yet done, but only when it is actually used.
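The create-once, create-only-if-used behavior described above can be sketched with a minimal holder class. This is illustrative only (a String stands in for the real SparkSession; `SessionHolderSketch` is a hypothetical name, not part of SDLB):

```scala
// Illustrative sketch of lazy, create-once session handling:
// the session is built on first access and reused afterwards.
class SessionHolderSketch(appName: String) {
  private var session: Option[String] = None // stand-in for SparkSession

  // True once a session has been created in this job
  def hasSession: Boolean = session.isDefined

  // Create the session only on first use, then return the same instance
  def getOrCreate(): String = synchronized {
    session match {
      case Some(s) => s
      case None =>
        val s = s"SparkSession($appName)"
        session = Some(s)
        s
    }
  }
}
```

A second call to `getOrCreate()` returns the instance created by the first call, mirroring why holding the session in GlobalConfig (rather than in a shared object) scopes it to one SDLB job.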
- val sparkUDFs: Option[Map[String, SparkUDFCreatorConfig]]
- val stateListeners: Seq[StateListenerConfig]
- val statusInfo: Option[StatusInfoConfig]
- val synchronousStreamingTriggerIntervalSec: Int