org.apache.spark.sql.delta.files

TransactionalWrite

trait TransactionalWrite extends DeltaLogging

Adds the ability to write files out as part of a transaction. Checks are performed to ensure that the data being written matches either the current metadata or the new metadata being set by this transaction.

Self Type
OptimisticTransactionImpl
Linear Supertypes
  DeltaLogging, DatabricksLogging, DeltaProgressReporter, Logging, AnyRef, Any
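
A minimal sketch of how this trait is typically exercised. writeFiles is reached through an OptimisticTransaction, which mixes in TransactionalWrite; these are internal Delta Lake APIs, so exact signatures vary across versions.

    import org.apache.hadoop.fs.Path
    import org.apache.spark.sql.{SaveMode, SparkSession}
    import org.apache.spark.sql.delta.{DeltaLog, DeltaOperations}

    val spark = SparkSession.builder().appName("txn-write-sketch").getOrCreate()

    // Obtain the table's log and open a transaction against it.
    val deltaLog = DeltaLog.forTable(spark, new Path("/tmp/delta/events"))
    val txn = deltaLog.startTransaction()

    // writeFiles validates the schema against the transaction's metadata and
    // returns the file actions describing the Parquet files it wrote.
    val data = spark.range(100).toDF("id")
    val actions = txn.writeFiles(data)

    // The new files become visible only when the transaction commits.
    txn.commit(actions, DeltaOperations.Write(SaveMode.Append))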

Abstract Value Members

  1. abstract def deltaLog: DeltaLog
  2. abstract def metadata: Metadata
    Attributes
    protected
  3. abstract def protocol: Protocol
  4. abstract def snapshot: Snapshot
    Attributes
    protected

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def checkPartitionColumns(partitionSchema: StructType, output: Seq[Attribute], colsDropped: Boolean): Unit
    Attributes
    protected
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  7. def convertEmptyToNullIfNeeded(plan: SparkPlan, partCols: Seq[Attribute], constraints: Seq[Constraint]): SparkPlan

    If there is any string partition column and there are constraints defined, add a projection to convert empty strings to null for that column. The empty strings would eventually be converted to null even without this projection, but doing it earlier, before CHECK constraints are evaluated, ensures empty strings are correctly rejected. Note that this should not cause the downstream logic in FileFormatWriter to add duplicate conversions, because that logic checks the partition column against the original plan's output; once the plan is modified with additional projections, the partition-column check no longer matches and no further conversion is added.

    plan: The original SparkPlan.

    partCols: The partition columns.

    constraints: The defined constraints.

    returns: A SparkPlan, potentially modified with an additional projection on top of plan.

    Attributes
    protected
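    Example:
    A DataFrame-level illustration of the conversion (the method itself rewrites the physical plan; the column name and data here are hypothetical):

      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.functions.{col, length, lit, when}

      val spark = SparkSession.builder().appName("empty-to-null").getOrCreate()
      import spark.implicits._

      // Hypothetical input with a string partition column "region".
      val df = Seq(("", 1), ("us", 2)).toDF("region", "id")

      // The added projection is equivalent to this: empty strings in the
      // partition column become null *before* CHECK constraints run, so a
      // NOT NULL constraint rejects them instead of passing "" through.
      val converted = df.withColumn("region",
        when(length(col("region")) === 0, lit(null)).otherwise(col("region")))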
  8. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  9. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable])
  11. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def getCommitter(outputPath: Path): DelayedCommitProtocol
    Attributes
    protected
  13. def getPartitioningColumns(partitionSchema: StructType, output: Seq[Attribute]): Seq[Attribute]
    Attributes
    protected
  14. val hasWritten: Boolean
    Attributes
    protected
  15. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  16. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  17. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  18. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  19. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  20. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  21. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  22. def logDebug(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  23. def logDebug(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  24. def logError(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  25. def logError(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  26. def logInfo(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  27. def logInfo(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  28. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  29. def logTrace(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  30. def logTrace(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  31. def logWarning(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def logWarning(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  33. def makeOutputNullable(output: Seq[Attribute]): Seq[Attribute]

    Makes the output attributes nullable, so that we don't write unreadable Parquet files.

    Attributes
    protected
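    Example:
    The core transformation, sketched (the real method also special-cases streaming plans):

      import org.apache.spark.sql.catalyst.expressions.Attribute

      // Mark every output attribute nullable so the written Parquet schema
      // is never stricter than the table schema readers expect.
      def makeNullable(output: Seq[Attribute]): Seq[Attribute] =
        output.map(_.withNullability(true))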
  34. def mapColumnAttributes(output: Seq[Attribute], mappingMode: DeltaColumnMappingMode): Seq[Attribute]

    Replace the output attributes with the physical mapping information.

    Attributes
    protected
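    Example:
    A conceptual sketch of column mapping: the on-disk (physical) name travels in the field's metadata, and mapping swaps it in for the logical name. The metadata key and physical-name format below are assumptions, not confirmed constants:

      import org.apache.spark.sql.types.{MetadataBuilder, StringType, StructField}

      val logical = StructField("userName", StringType, nullable = true,
        new MetadataBuilder()
          .putString("delta.columnMapping.physicalName", "col-7af4e1f2")
          .build())
      val physicalName =
        logical.metadata.getString("delta.columnMapping.physicalName")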
  35. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  36. def normalizeData(deltaLog: DeltaLog, data: Dataset[_]): (QueryExecution, Seq[Attribute], Seq[Constraint], Set[String])

    Normalize the schema of the query, and return the QueryExecution to execute. If the table has generated columns and users provide these columns in the output, we will also return constraints that should be respected. If any constraints are returned, the caller should apply these constraints when writing data.

    Note: The output attributes of the QueryExecution may not match the attributes we return as the output schema. This is because streaming queries create IncrementalExecution, which cannot be further modified. We can however have the Parquet writer use the physical plan from IncrementalExecution and the output schema provided through the attributes.

    Attributes
    protected
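    Example:
    A conceptual sketch of one normalization step, reconciling column casing with the table schema (names are hypothetical; the real method rewrites the analyzed plan rather than renaming positionally):

      import org.apache.spark.sql.SparkSession

      val spark = SparkSession.builder().appName("normalize-sketch").getOrCreate()
      import spark.implicits._

      val incoming = Seq((1L, "a")).toDF("ID", "Value")  // caller's casing
      val tableColumns = Seq("id", "value")              // table's canonical casing
      val normalized = incoming.toDF(tableColumns: _*)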
  37. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  38. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  39. def performCDCPartition(inputData: Dataset[_]): (DataFrame, StructType)

    Returns a tuple of (data, partition schema). For CDC writes, a __is_cdc column is added to the data, and __is_cdc=true/false is added to the front of the partition schema.

    Attributes
    protected
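    Example:
    The shape of the returned partition schema for a CDC write, sketched; the flag column's exact name and type are internal constants, so treat both as assumptions:

      import org.apache.spark.sql.types.{StringType, StructField, StructType}

      // The CDC flag is prepended ahead of the table's own partition columns.
      val tablePartitionSchema = StructType(Seq(StructField("date", StringType)))
      val cdcPartitionSchema = StructType(
        StructField("__is_cdc", StringType) +: tablePartitionSchema.fields)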
  40. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

    Used to record the occurrence of a single event or report detailed, operation-specific statistics.

    path: Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  41. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Attributes
    protected
    Definition Classes
    DeltaLogging
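    Example:
    A call-site sketch from code mixing in DeltaLogging; the opType tag is hypothetical:

      val version = recordDeltaOperation(deltaLog, "delta.example.refresh") {
        // The wrapped block is the measured work; its duration and
        // success/failure are recorded against deltaLog.
        deltaLog.update().version
      }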
  42. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  43. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  44. def recordFrameProfile[T](group: String, name: String)(thunk: => T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  45. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = null, silent: Boolean = true)(thunk: => S): S
    Definition Classes
    DatabricksLogging
  46. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  47. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  48. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  49. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  50. def toString(): String
    Definition Classes
    AnyRef → Any
  51. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  52. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  53. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  54. def withDmqTag[T](thunk: => T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  55. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: => T): T

    Logs a status message to indicate that some command is running.

    Definition Classes
    DeltaProgressReporter
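    Example:
    A call-site sketch; the status code and message are illustrative:

      import org.apache.spark.sql.delta.actions.FileAction

      // Surfaces a progress message in logs / the Spark UI while the body runs.
      val written: Seq[FileAction] =
        withStatusCode("DELTA", "Writing out new files") {
          Seq.empty[FileAction] // ... perform the write here ...
        }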
  56. def writeFiles(inputData: Dataset[_], writeOptions: Option[DeltaOptions], additionalConstraints: Seq[Constraint]): Seq[FileAction]

    Writes out the dataframe after performing schema validation. Returns a list of actions to append these files to the reservoir.

  57. def writeFiles(data: Dataset[_]): Seq[FileAction]
  58. def writeFiles(data: Dataset[_], writeOptions: Option[DeltaOptions]): Seq[FileAction]
  59. def writeFiles(data: Dataset[_], additionalConstraints: Seq[Constraint]): Seq[FileAction]
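    Example:
    A sketch of the fullest overload, continuing the transaction txn and frame data from the overview example above; the option key and constraint are illustrative:

      import org.apache.spark.sql.delta.DeltaOptions
      import org.apache.spark.sql.delta.constraints.Constraints

      // Writer options are threaded through to the file writer; additional
      // constraints are enforced on the rows during the write.
      val opts = Some(new DeltaOptions(
        Map("maxRecordsPerFile" -> "1000000"), spark.sessionState.conf))
      val extra = Seq(Constraints.NotNull(Seq("id")))
      val fileActions = txn.writeFiles(data, opts, extra)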
