package sources

Type Members

  1. case class CompositeLimit(bytes: ReadMaxBytes, files: ReadMaxFiles) extends ReadLimit with Product with Serializable

    A read limit that admits up to the given soft maximum of bytes or the given maximum number of files.

  2. class DeltaDataSource extends RelationProvider with StreamSourceProvider with StreamSinkProvider with CreatableRelationProvider with DataSourceRegister with TableProvider with DeltaLogging

    A DataSource V1 for integrating Delta into Spark SQL batch and Streaming APIs.

  3. trait DeltaSQLConfBase extends AnyRef

    SQLConf entries for Delta features.

  4. class DeltaSink extends Sink with ImplicitMetadataOperation with DeltaLogging

    A streaming sink that writes data into a Delta Table.

  5. case class DeltaSource(spark: SparkSession, deltaLog: DeltaLog, options: DeltaOptions, filters: Seq[Expression] = Nil) extends DeltaSourceBase with DeltaSourceCDCSupport with Product with Serializable

    A streaming source for a Delta table.

    When a new stream is started, Delta begins by constructing an org.apache.spark.sql.delta.Snapshot at the current version of the table. This snapshot is broken up into batches until all existing data has been processed. Subsequent processing is done by tailing the change log for new data. As a result, at any given point the streaming query returns the same answer as a batch query that had processed the entire dataset.
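
The snapshot-then-tail behaviour described above can be illustrated with a toy model. This is only a sketch and not Delta's implementation: `Commit`, `snapshotFiles`, `batches` and the `batchSize` parameter are hypothetical names, and the model assumes an append-only log (file removals are ignored).

```scala
// Toy model of "process the starting snapshot in batches, then tail the log".
// All names are hypothetical; this is not the DeltaSource implementation.
object SnapshotThenTail {
  // A commit is a table version plus the files it added (removals are ignored).
  final case class Commit(version: Long, addedFiles: Seq[String])

  // Files visible in the snapshot at `startVersion`:
  // everything added up to and including that version.
  def snapshotFiles(log: Seq[Commit], startVersion: Long): Seq[String] =
    log.filter(_.version <= startVersion).sortBy(_.version).flatMap(_.addedFiles)

  // First the starting snapshot, split into chunks of `batchSize` files,
  // then one batch per commit that arrived after `startVersion`.
  def batches(log: Seq[Commit], startVersion: Long, batchSize: Int): Seq[Seq[String]] = {
    val initial = snapshotFiles(log, startVersion).grouped(batchSize).toList
    val tail = log.filter(_.version > startVersion).sortBy(_.version).map(_.addedFiles)
    initial ++ tail
  }
}
```

In this model, concatenating all batches yields exactly the files a batch read of the whole log would see, which mirrors the equivalence stated above.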

  6. trait DeltaSourceBase extends Source with SupportsAdmissionControl with DeltaLogging

    Base trait for the Delta source, containing methods for getting changes from the Delta log.

  7. trait DeltaSourceCDCSupport extends AnyRef

    Helper functions for CDC-specific handling for DeltaSource.

  8. case class DeltaSourceOffset(sourceVersion: Long, reservoirId: String, reservoirVersion: Long, index: Long, isStartingVersion: Boolean) extends Offset with Product with Serializable

    Tracks how far we have processed when reading changes from the DeltaLog.

    Note: this class retains the Reservoir naming to maintain compatibility with serialized offsets from the beta period.

    sourceVersion

    The version of serialization that this offset is encoded with.

    reservoirId

    The id of the table we are reading from. Used to detect misconfiguration when restarting a query.

    reservoirVersion

    The version of the table that we are currently processing.

    index

    The index in the sequence of AddFiles in this version. Used to break large commits into multiple batches. This index is created by sorting on modificationTimestamp and path.

    isStartingVersion

    Whether this offset denotes a query that is starting rather than processing changes. When starting a new query, we first process all data present in the table at the start and then move on to processing new data that has arrived.
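
How an index within a version lets a large commit be split across batches might be sketched as follows. This is an illustration under stated assumptions, not the library's code: `FileEntry`, `MiniOffset`, `indexed`, `nextBatch` and the `maxFiles` parameter are hypothetical, and only the sort key (modificationTimestamp, then path) is taken from the description above.

```scala
// Sketch of resuming inside a single table version via a per-version index.
// All names are hypothetical stand-ins, not Delta classes.
object OffsetIndexSketch {
  // Stand-in for an AddFile action; only the fields used for ordering.
  final case class FileEntry(path: String, modificationTimestamp: Long)

  // Reduced offset: the table version and the index reached within it.
  final case class MiniOffset(reservoirVersion: Long, index: Long)

  // Give each file of a commit a stable index by sorting on
  // (modificationTimestamp, path).
  def indexed(files: Seq[FileEntry]): Seq[(Long, FileEntry)] =
    files.sortBy(f => (f.modificationTimestamp, f.path))
      .zipWithIndex.map { case (f, i) => (i.toLong, f) }

  // Return up to `maxFiles` files of `version` that come strictly after the
  // position recorded in `last`, together with the offset to store next.
  def nextBatch(version: Long, files: Seq[FileEntry], last: MiniOffset,
                maxFiles: Int): (Seq[FileEntry], MiniOffset) = {
    val startAfter = if (last.reservoirVersion == version) last.index else -1L
    val picked = indexed(files).filter(_._1 > startAfter).take(maxFiles)
    val next = picked.lastOption
      .map { case (i, _) => MiniOffset(version, i) }
      .getOrElse(last)
    (picked.map(_._2), next)
  }
}
```

Because the ordering is deterministic, a restarted query holding the stored offset can pick up at the same position inside the commit.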

  9. case class ReadMaxBytes(maxBytes: Long) extends ReadLimit with Product with Serializable

    A read limit that admits a soft-max of maxBytes per micro-batch.
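
One way to read "soft-max" is that files are admitted while budget remains, so the file that crosses the byte budget is still taken. The sketch below illustrates that reading together with an optional file cap in the spirit of CompositeLimit. It is an assumption about the semantics, not the library's admission logic, and `SoftMaxSketch`, `admit` and its parameters are hypothetical names.

```scala
// Sketch of soft-max admission; names and exact semantics are assumptions.
object SoftMaxSketch {
  // Admit files in order while both budgets are still positive. The file that
  // pushes the byte total past `maxBytes` is still admitted, which is what
  // makes the byte limit "soft"; the file count is a hard cap.
  def admit(fileSizes: Seq[Long], maxBytes: Long,
            maxFiles: Int = Int.MaxValue): Seq[Long] = {
    var bytesLeft = maxBytes
    var filesLeft = maxFiles
    val admitted = Seq.newBuilder[Long]
    val it = fileSizes.iterator
    while (it.hasNext && bytesLeft > 0 && filesLeft > 0) {
      val size = it.next()
      admitted += size
      bytesLeft -= size
      filesLeft -= 1
    }
    admitted.result()
  }
}
```

Under this reading a micro-batch always makes progress, even when a single file is larger than the byte budget.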
