Class StorageActionImplicits

com.coxautodata.waimak.storage.StorageActions.StorageActionImplicits

Related Doc: package StorageActions

implicit class StorageActionImplicits extends AnyRef

Linear Supertypes: AnyRef, Any

Instance Constructors

  1. new StorageActionImplicits(sparkDataFlow: SparkDataFlow)


Value Members

  1. final def !=(arg0: Any): Boolean

     Definition Classes: AnyRef → Any
  2. final def ##(): Int

     Definition Classes: AnyRef → Any
  3. final def ==(arg0: Any): Boolean

     Definition Classes: AnyRef → Any
  4. final def asInstanceOf[T0]: T0

     Definition Classes: Any
  5. def clone(): AnyRef

     Attributes: protected[java.lang]
     Definition Classes: AnyRef
     Annotations: @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

     Definition Classes: AnyRef
  7. def equals(arg0: Any): Boolean

     Definition Classes: AnyRef → Any
  8. def finalize(): Unit

     Attributes: protected[java.lang]
     Definition Classes: AnyRef
     Annotations: @throws( classOf[java.lang.Throwable] )
  9. def getAuditTable(storageBasePath: String, labelPrefix: Option[String] = Some("audittable"), includeHot: Boolean = true)(tableNames: String*): SparkDataFlow

     Opens a storage layer table and adds the AuditTable object to the flow with a given label. This can then be used with the writeToStorage action. Fails if the table does not exist in the storage layer.

     storageBasePath: the base path of the storage layer
     labelPrefix: optionally prefix the output label for the AuditTable. If set, the label of the AuditTable will be s"${labelPrefix}_$table"
     includeHot: whether or not to include hot partitions in the read
     tableNames: the tables we want to open in the storage layer
     returns: a new SparkDataFlow with the get action added
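As a sketch, the label derivation described above can be modelled as follows. The flow call in the comment uses a hypothetical path and table names, and auditTableLabel is an illustrative helper, not part of the API:

```scala
// Hypothetical usage, assuming a SparkDataFlow `flow` exists and the implicit
// class is in scope:
//   flow.getAuditTable("/data/storage")("customers", "orders")
// With the default labelPrefix of Some("audittable"), the resulting labels are
// "audittable_customers" and "audittable_orders", per s"${labelPrefix}_$table".

// Illustrative helper modelling the documented label derivation:
def auditTableLabel(labelPrefix: Option[String], table: String): String =
  labelPrefix.map(p => s"${p}_$table").getOrElse(table)

println(auditTableLabel(Some("audittable"), "customers"))
```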

  10. final def getClass(): Class[_]

      Definition Classes: AnyRef → Any
  11. def getOrCreateAuditTable(storageBasePath: String, metadataRetrieval: Option[(String) ⇒ AuditTableInfo] = None, labelPrefix: Option[String] = Some("audittable"), includeHot: Boolean = true, updateTableMetadata: ⇒ Boolean = ...)(tableNames: String*): SparkDataFlow

      Opens or creates a storage layer table and adds the AuditTable object to the flow with a given label. This can then be used with the writeToStorage action. Creates a table if it does not already exist in the storage layer and the optional metadataRetrieval function is given. Fails if the table does not exist in the storage layer and the optional metadataRetrieval function is not given.

      storageBasePath: the base path of the storage layer
      metadataRetrieval: an optional function that generates table metadata from a table name. This function is used during table creation if a table does not exist in the storage layer, or to update the metadata if updateTableMetadata is set to true
      labelPrefix: optionally prefix the output label for the AuditTable. If set, the label of the AuditTable will be s"${labelPrefix}_$table"
      includeHot: whether or not to include hot partitions in the read
      updateTableMetadata: whether or not to update the table metadata. Uses spark.waimak.storage.updateMetadata by default (which defaults to false)
      tableNames: the tables we want to open in the storage layer
      returns: a new SparkDataFlow with the get action added
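The open-or-create decision described above can be sketched with a simplified model. TableMeta is a stand-in for AuditTableInfo, and the real action operates on the flow rather than on plain values:

```scala
// Simplified stand-in for AuditTableInfo (assumption for illustration only).
final case class TableMeta(tableName: String)

// Models the documented decision: open if the table exists; otherwise create it
// via metadataRetrieval when given, or fail when it is not.
def openOrCreate(exists: Boolean,
                 metadataRetrieval: Option[String => TableMeta],
                 table: String): Either[String, TableMeta] =
  if (exists) Right(TableMeta(table)) // open the existing table
  else metadataRetrieval match {
    case Some(retrieve) => Right(retrieve(table)) // create using retrieved metadata
    case None => Left(s"Table $table does not exist and no metadataRetrieval was given")
  }
```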

  12. def hashCode(): Int

      Definition Classes: AnyRef → Any
  13. final def isInstanceOf[T0]: Boolean

      Definition Classes: Any
  14. def loadFromStorage(storageBasePath: String, from: Option[Timestamp] = None, to: Option[Timestamp] = None, includeHot: Boolean = true, outputPrefix: Option[String] = None)(tables: String*): SparkDataFlow

      Load everything between two timestamps for the given tables.

      NB: this will not give you a snapshot of the tables at a given time; it will give you the entire history of events which have occurred between the provided dates for each table. To get a snapshot, use snapshotFromStorage.

      storageBasePath: the base path of the storage layer
      from: optionally, the lower bound last updated timestamp (if undefined, it will read from the beginning of time)
      to: optionally, the upper bound last updated timestamp (if undefined, it will read up until the most recent events)
      includeHot: whether or not to include hot partitions in the read
      outputPrefix: optionally prefix the output label for the Dataset. If set, the label of the output Dataset will be s"${outputPrefix}_$table"
      tables: the tables to load
      returns: a new SparkDataFlow with the read actions added
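The from/to bounds can be sketched as an inclusive window over each record's last updated timestamp, with None meaning unbounded on that side. This is an illustrative model of the documented semantics, not the actual implementation:

```scala
import java.sql.Timestamp

// Illustrative model of the documented bounds: an undefined `from` reads from
// the beginning of time, an undefined `to` reads up to the most recent events.
def inLoadWindow(lastUpdated: Timestamp,
                 from: Option[Timestamp],
                 to: Option[Timestamp]): Boolean =
  from.forall(f => !lastUpdated.before(f)) && to.forall(t => !lastUpdated.after(t))
```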

  15. final def ne(arg0: AnyRef): Boolean

      Definition Classes: AnyRef
  16. final def notify(): Unit

      Definition Classes: AnyRef
  17. final def notifyAll(): Unit

      Definition Classes: AnyRef
  18. def snapshotFromStorage(storageBasePath: String, snapshotTimestamp: Timestamp, includeHot: Boolean = true, outputPrefix: Option[String] = None)(tables: String*): SparkDataFlow

      Get a snapshot of tables in the storage layer for a given timestamp.

      storageBasePath: the base path of the storage layer
      snapshotTimestamp: the snapshot timestamp
      includeHot: whether or not to include hot partitions in the read
      outputPrefix: optionally prefix the output label for the Dataset. If set, the label of the snapshot Dataset will be s"${outputPrefix}_$table"
      tables: the tables we want to snapshot
      returns: a new SparkDataFlow with the snapshot actions added
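The difference from loadFromStorage can be sketched as keeping only the latest version of each record at or before the snapshot timestamp. Event is a hypothetical record type used purely for illustration:

```scala
import java.sql.Timestamp

// Hypothetical event record for illustration.
final case class Event(key: String, lastUpdated: Timestamp, value: String)

// A snapshot keeps the latest event per key at or before the snapshot timestamp,
// whereas loadFromStorage would return every event in the window.
def snapshotAt(events: Seq[Event], at: Timestamp): Map[String, Event] =
  events
    .filter(e => !e.lastUpdated.after(at)) // drop events after the snapshot time
    .groupBy(_.key)
    .map { case (k, es) => k -> es.maxBy(_.lastUpdated.getTime) } // latest per key
```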

  19. final def synchronized[T0](arg0: ⇒ T0): T0

      Definition Classes: AnyRef
  20. def toString(): String

      Definition Classes: AnyRef → Any
  21. final def wait(): Unit

      Definition Classes: AnyRef
      Annotations: @throws( ... )
  22. final def wait(arg0: Long, arg1: Int): Unit

      Definition Classes: AnyRef
      Annotations: @throws( ... )
  23. final def wait(arg0: Long): Unit

      Definition Classes: AnyRef
      Annotations: @throws( ... )
  24. def writeToStorage(labelName: String, lastUpdatedCol: String, appendDateTime: ZonedDateTime, doCompaction: (Seq[AuditTableRegionInfo], Long, ZonedDateTime) ⇒ Boolean = (_, _, _) => false, auditTableLabelPrefix: String = "audittable"): SparkDataFlow

      Writes a Dataset to the storage layer. The table must have been already opened on the flow by using either the getOrCreateAuditTable or getAuditTable actions.

      labelName: the label whose Dataset we wish to write
      lastUpdatedCol: the last updated column in the Dataset
      appendDateTime: timestamp of the append, zoned to a timezone
      doCompaction: a lambda used to decide whether a compaction should happen after an append. Takes the list of table regions, the count of records added in this batch, and the compaction zoned date time. Default is not to trigger a compaction.
      auditTableLabelPrefix: the prefix of the audit table entity on the flow. The AuditTable will be found with s"${auditTableLabelPrefix}_$labelName"
      returns: a new SparkDataFlow with the write action added
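A doCompaction predicate matching the documented shape might look like the following sketch. RegionInfo is a simplified stand-in for AuditTableRegionInfo, and the threshold is an arbitrary choice:

```scala
import java.time.ZonedDateTime

// Simplified stand-in for AuditTableRegionInfo (assumption for illustration).
final case class RegionInfo(store: String, recordCount: Long)

// Trigger a compaction whenever a single append adds more than a threshold of
// records; the documented default in writeToStorage is never to compact.
val compactOnLargeAppend: (Seq[RegionInfo], Long, ZonedDateTime) => Boolean =
  (_, appendedCount, _) => appendedCount > 1000000L
```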
