package sources
Type Members
- case class CompositeLimit(bytes: ReadMaxBytes, files: ReadMaxFiles) extends ReadLimit with Product with Serializable
A read limit that admits the given soft-max of bytes or max of files.
- class DeltaDataSource extends RelationProvider with StreamSourceProvider with StreamSinkProvider with CreatableRelationProvider with DataSourceRegister with TableProvider with DeltaLogging
A DataSource V1 for integrating Delta into Spark SQL batch and Streaming APIs.
- trait DeltaSQLConfBase extends AnyRef
SQLConf entries for Delta features.
- class DeltaSink extends Sink with ImplicitMetadataOperation with DeltaLogging
A streaming sink that writes data into a Delta Table.
- case class DeltaSource(spark: SparkSession, deltaLog: DeltaLog, options: DeltaOptions, filters: Seq[Expression] = Nil) extends DeltaSourceBase with DeltaSourceCDCSupport with Product with Serializable
A streaming source for a Delta table.
When a new stream is started, Delta begins by constructing an org.apache.spark.sql.delta.Snapshot at the current version of the table. This snapshot is broken up into batches until all existing data has been processed. Subsequent processing is done by tailing the change log for new data. As a result, at any given point the streaming query returns the same answer as a batch query that had processed the entire dataset.
- trait DeltaSourceBase extends Source with SupportsAdmissionControl with DeltaLogging
Base trait for the Delta source. It contains the methods that deal with getting changes from the Delta log.
- trait DeltaSourceCDCSupport extends AnyRef
Helper functions for CDC-specific handling for DeltaSource.
- case class DeltaSourceOffset(sourceVersion: Long, reservoirId: String, reservoirVersion: Long, index: Long, isStartingVersion: Boolean) extends Offset with Product with Serializable
Tracks how far we have processed when reading changes from the DeltaLog.
Note that this class retains the naming of Reservoir to maintain compatibility with serialized offsets from the beta period.
- sourceVersion
The version of serialization that this offset is encoded with.
- reservoirId
The id of the table we are reading from. Used to detect misconfiguration when restarting a query.
- reservoirVersion
The version of the table that we are currently processing.
- index
The index in the sequence of AddFiles in this version. Used to break large commits into multiple batches. This index is created by sorting on modificationTimestamp and path.
- isStartingVersion
Whether this offset denotes a query that is starting rather than processing changes. When starting a new query, we first process all data present in the table at the start and then move on to processing new data that has arrived.
- case class ReadMaxBytes(maxBytes: Long) extends ReadLimit with Product with Serializable
A read limit that admits a soft-max of maxBytes per micro-batch.
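The read limits above describe a soft maximum. The following is a minimal sketch of one plausible reading of that term, in plain Scala with no Spark or Delta dependency: a file is admitted while both budgets are still positive, so the file that crosses the byte limit is still included in the batch. The names FileEntry, AdmissionBudget and AdmissionSketch are hypothetical and are not part of this package; the actual admission logic may differ.

```scala
// Hypothetical stand-in for a data file tracked by the table.
// This is NOT the Delta AddFile class.
final case class FileEntry(path: String, sizeInBytes: Long)

// Simplified model of a combined files/bytes limit (an assumption, not the
// real implementation): stop once either budget has been used up.
final class AdmissionBudget(maxFiles: Int, maxBytes: Long) {
  private var filesLeft: Int = maxFiles
  private var bytesLeft: Long = maxBytes

  // The check happens before the file is charged against the budgets, so the
  // file that pushes the batch over maxBytes is still admitted ("soft" max).
  def admit(file: FileEntry): Boolean = {
    val ok = filesLeft > 0 && bytesLeft > 0
    filesLeft -= 1
    bytesLeft -= file.sizeInBytes
    ok
  }
}

object AdmissionSketch {
  // Take files in order until the budget refuses one.
  def nextBatch(files: List[FileEntry], maxFiles: Int, maxBytes: Long): List[FileEntry] = {
    val budget = new AdmissionBudget(maxFiles, maxBytes)
    files.takeWhile(f => budget.admit(f))
  }
}
```

Under this model, three 60-byte files with a 100-byte budget yield a batch of two files (120 bytes in total), whereas a file budget of one yields exactly one file regardless of size.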
Value Members
- object DeltaDataSource extends DatabricksLogging
- object DeltaSQLConf extends DeltaSQLConfBase
- object DeltaSource extends Serializable
- object DeltaSourceOffset extends Serializable
- object DeltaSourceUtils
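The DeltaSourceOffset description above says that the index is created by sorting on modificationTimestamp and path, and that it is used to break large commits into multiple batches. The following plain-Scala sketch illustrates that idea for a single table version. TrackedFile, SimpleOffset and OffsetSketch are hypothetical, simplified stand-ins (the real offset carries more fields), and the resume logic shown here is an assumption made for illustration only.

```scala
// Hypothetical, simplified stand-ins; not the classes from this package.
final case class TrackedFile(path: String, modificationTime: Long)
final case class SimpleOffset(reservoirVersion: Long, index: Long)

object OffsetSketch {
  // Order the files of one version deterministically (timestamp, then path)
  // and pair each file with the offset that points at it.
  def indexed(version: Long, files: Seq[TrackedFile]): Seq[(SimpleOffset, TrackedFile)] =
    files
      .sortBy(f => (f.modificationTime, f.path))
      .zipWithIndex
      .map { case (f, i) => (SimpleOffset(version, i.toLong), f) }

  // Return at most maxFiles files that come strictly after the last processed
  // offset within the same version. None means nothing has been processed yet.
  def nextBatch(
      version: Long,
      files: Seq[TrackedFile],
      last: Option[SimpleOffset],
      maxFiles: Int): Seq[(SimpleOffset, TrackedFile)] =
    indexed(version, files)
      .filter { case (offset, _) => last.forall(l => offset.index > l.index) }
      .take(maxFiles)
}
```

Because the sort order is deterministic, a restarted query that remembers only the last offset can recompute the same sequence and continue from the next index instead of reprocessing the whole commit.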