package optimize

Type Members

  1. case class FileSizeMetrics(min: Option[Long], max: Option[Long], avg: Double, totalFiles: Long, totalSize: Long) extends Product with Serializable

    Basic Stats on file sizes.

    min

    Minimum file size

    max

    Maximum file size

    avg

    Average file size

    totalFiles

    Total number of files

    totalSize

    Total size of the files
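
    As an illustration of how the five fields relate, the sketch below derives a FileSizeMetrics value from a list of raw file sizes. The case class is a local stand-in that mirrors the documented signature, and `summarize` is a hypothetical helper, not part of the documented API. Returning `None` for `min` and `max` on empty input is an assumption based on the default values shown in OptimizeMetrics.

```scala
// Local stand-in mirroring the documented signature, so the sketch
// compiles without the real package on the classpath.
case class FileSizeMetrics(
    min: Option[Long],
    max: Option[Long],
    avg: Double,
    totalFiles: Long,
    totalSize: Long)

object FileSizeMetricsSketch {
  // Hypothetical helper (not part of the documented API):
  // summarizes a collection of raw file sizes.
  def summarize(sizes: Seq[Long]): FileSizeMetrics =
    if (sizes.isEmpty) {
      // Assumption: no files means no min/max and zeroed counters.
      FileSizeMetrics(min = None, max = None, avg = 0, totalFiles = 0, totalSize = 0)
    } else {
      val total = sizes.sum
      FileSizeMetrics(
        min = Some(sizes.min),
        max = Some(sizes.max),
        avg = total.toDouble / sizes.size,
        totalFiles = sizes.size.toLong,
        totalSize = total)
    }
}
```

    Modelling `min` and `max` as `Option[Long]` lets an empty file set be told apart from a set whose smallest file has size 0.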

  2. case class FileSizeStats(minFileSize: Long = 0, maxFileSize: Long = 0, totalFiles: Long = 0, totalSize: Long = 0) extends Product with Serializable
  3. case class FileSizeStatsWithHistogram(min: Long, p25: Long, p50: Long, p75: Long, max: Long) extends Product with Serializable

    Percentiles on the file sizes in this batch.

    min

    Size of the smallest file

    p25

    Size of the 25th percentile file

    p50

    Size of the 50th percentile file

    p75

    Size of the 75th percentile file

    max

    Size of the largest file
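
    The percentile fields can be read as positions in the sorted list of file sizes. The sketch below shows one simple index-based way to compute them. The case class is a local stand-in, `fromSizes` is a hypothetical helper, and the index formula is an assumption; the companion object listed under Value Members may provide its own factory, but its members are not shown here.

```scala
// Local stand-in mirroring the documented signature.
case class FileSizeStatsWithHistogram(min: Long, p25: Long, p50: Long, p75: Long, max: Long)

object HistogramSketch {
  // Hypothetical helper: index-based percentiles over the sorted sizes.
  // Returns None for an empty batch, since no percentile is defined then.
  def fromSizes(sizes: Seq[Long]): Option[FileSizeStatsWithHistogram] =
    if (sizes.isEmpty) None
    else {
      val sorted = sizes.sorted.toIndexedSeq
      val n = sorted.length
      Some(FileSizeStatsWithHistogram(
        min = sorted.head,
        p25 = sorted(n / 4),      // integer division keeps the index in range
        p50 = sorted(n / 2),
        p75 = sorted(n * 3 / 4),
        max = sorted.last))
    }
}
```

    For example, `HistogramSketch.fromSizes(Seq(40L, 10L, 30L, 20L))` sorts the input to 10, 20, 30, 40 and picks indices 1, 2 and 3 for the three percentiles.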

  4. case class OptimizeMetrics(numFilesAdded: Long, numFilesRemoved: Long, filesAdded: FileSizeMetrics = FileSizeMetrics(min = None, max = None, avg = 0, totalFiles = 0, totalSize = 0), filesRemoved: FileSizeMetrics = FileSizeMetrics(min = None, max = None, avg = 0, totalFiles = 0, totalSize = 0), partitionsOptimized: Long = 0, zOrderStats: Option[ZOrderStats] = None, numBatches: Long, totalConsideredFiles: Long, totalFilesSkipped: Long = 0, preserveInsertionOrder: Boolean = false, numFilesSkippedToReduceWriteAmplification: Long = 0, numBytesSkippedToReduceWriteAmplification: Long = 0, startTimeMs: Long = 0, endTimeMs: Long = 0) extends Product with Serializable

    Metrics returned by the optimize command.

    numFilesAdded

    Number of files added by optimize

    numFilesRemoved

    Number of files removed by optimize

    filesAdded

    Stats for the files added

    filesRemoved

    Stats for the files removed

    partitionsOptimized

    Number of partitions optimized

    zOrderStats

    Z-Order stats

    numBatches

    Number of batches

    totalConsideredFiles

    Number of files considered for the Optimize operation.

    totalFilesSkipped

    Number of files skipped by the Optimize operation.

    preserveInsertionOrder

    Whether optimize was run with insertion-order preservation enabled.

    numFilesSkippedToReduceWriteAmplification

    Number of files skipped for reducing write amplification.

    numBytesSkippedToReduceWriteAmplification

    Number of bytes skipped for reducing write amplification.

    startTimeMs

    Start time of the Optimize command, in milliseconds.

    endTimeMs

    End time of the Optimize command, in milliseconds.
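
    A caller that receives an OptimizeMetrics value will often want a few derived numbers. The sketch below computes the run duration and the net reduction in file count. All case classes are local stand-ins mirroring the documented signatures, and the helpers `durationMs` and `netFileReduction` are hypothetical names, not part of the documented API.

```scala
// Local stand-ins mirroring the documented signatures.
case class FileSizeMetrics(
    min: Option[Long], max: Option[Long], avg: Double, totalFiles: Long, totalSize: Long)
case class ZOrderFileStats(num: Long, size: Long)
case class ZOrderStats(
    strategyName: String,
    inputCubeFiles: ZOrderFileStats,
    inputOtherFiles: ZOrderFileStats,
    inputNumCubes: Long,
    mergedFiles: ZOrderFileStats,
    numOutputCubes: Long,
    mergedNumCubes: Option[Long] = None)
case class OptimizeMetrics(
    numFilesAdded: Long,
    numFilesRemoved: Long,
    filesAdded: FileSizeMetrics = FileSizeMetrics(None, None, 0, 0, 0),
    filesRemoved: FileSizeMetrics = FileSizeMetrics(None, None, 0, 0, 0),
    partitionsOptimized: Long = 0,
    zOrderStats: Option[ZOrderStats] = None,
    numBatches: Long,
    totalConsideredFiles: Long,
    totalFilesSkipped: Long = 0,
    preserveInsertionOrder: Boolean = false,
    numFilesSkippedToReduceWriteAmplification: Long = 0,
    numBytesSkippedToReduceWriteAmplification: Long = 0,
    startTimeMs: Long = 0,
    endTimeMs: Long = 0)

object OptimizeMetricsSketch {
  // Hypothetical helper: wall-clock duration of the command in milliseconds.
  def durationMs(m: OptimizeMetrics): Long = m.endTimeMs - m.startTimeMs

  // Hypothetical helper: how many fewer files exist after the command.
  def netFileReduction(m: OptimizeMetrics): Long = m.numFilesRemoved - m.numFilesAdded
}
```

    Because `numBatches` and `totalConsideredFiles` have no defaults but follow defaulted parameters, a call that skips the optional fields has to pass them as named arguments, for example `OptimizeMetrics(numFilesAdded = 2, numFilesRemoved = 10, numBatches = 1, totalConsideredFiles = 12)`.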

  5. case class OptimizeStats(addedFilesSizeStats: FileSizeStats = FileSizeStats(), removedFilesSizeStats: FileSizeStats = FileSizeStats(), numPartitionsOptimized: Long = 0, zOrderStats: Option[ZOrderStats] = None, numBatches: Long = 0, totalConsideredFiles: Long = 0, totalFilesSkipped: Long = 0, preserveInsertionOrder: Boolean = false, numFilesSkippedToReduceWriteAmplification: Long = 0, numBytesSkippedToReduceWriteAmplification: Long = 0, startTimeMs: Long = System.currentTimeMillis(), endTimeMs: Long = 0) extends Product with Serializable

    Stats for an OPTIMIZE operation accumulated across all batches.
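
    This page does not describe how the accumulated FileSizeStats values relate to the FileSizeMetrics reported in OptimizeMetrics. The sketch below shows one plausible shape for that relationship: fold file sizes into a FileSizeStats accumulator, then convert it into the reported form. The case classes are local stand-ins, and `addFile` and `toMetrics` are hypothetical helpers; the real classes may accumulate and convert differently.

```scala
// Local stand-ins mirroring the documented signatures.
case class FileSizeStats(
    minFileSize: Long = 0, maxFileSize: Long = 0, totalFiles: Long = 0, totalSize: Long = 0)
case class FileSizeMetrics(
    min: Option[Long], max: Option[Long], avg: Double, totalFiles: Long, totalSize: Long)

object FileSizeStatsSketch {
  // Hypothetical helper: fold one more file into the accumulator.
  def addFile(stats: FileSizeStats, size: Long): FileSizeStats =
    stats.copy(
      // The first file sets the minimum; the default of 0 is not a real observation.
      minFileSize = if (stats.totalFiles == 0) size else math.min(stats.minFileSize, size),
      maxFileSize = math.max(stats.maxFileSize, size),
      totalFiles = stats.totalFiles + 1,
      totalSize = stats.totalSize + size)

  // Hypothetical helper: convert the accumulator into the reported metrics.
  def toMetrics(stats: FileSizeStats): FileSizeMetrics =
    if (stats.totalFiles == 0) FileSizeMetrics(None, None, 0, 0, 0)
    else FileSizeMetrics(
      min = Some(stats.minFileSize),
      max = Some(stats.maxFileSize),
      avg = stats.totalSize.toDouble / stats.totalFiles,
      totalFiles = stats.totalFiles,
      totalSize = stats.totalSize)
}
```

    A batch of files can then be accumulated with `sizes.foldLeft(FileSizeStats())(FileSizeStatsSketch.addFile)`.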

  6. case class ZOrderFileStats(num: Long, size: Long) extends Product with Serializable

    Aggregated file stats for a category of ZCube files.

    num

    Total number of files.

    size

    Total size of files in bytes.

  7. case class ZOrderStats(strategyName: String, inputCubeFiles: ZOrderFileStats, inputOtherFiles: ZOrderFileStats, inputNumCubes: Long, mergedFiles: ZOrderFileStats, numOutputCubes: Long, mergedNumCubes: Option[Long] = None) extends Product with Serializable

    Aggregated stats for the OPTIMIZE ZORDERBY command. This is a public-facing API; consider any change carefully.

    strategyName

    ZCubeMergeStrategy used.

    inputCubeFiles

    Files in the ZCube matching the current OPTIMIZE operation.

    inputOtherFiles

    Files not in any ZCube or in other ZCube orderings.

    inputNumCubes

    Number of different cubes among input files.

    mergedFiles

    Subset of input files merged by the current operation

    numOutputCubes

    Number of output ZCubes written out

    mergedNumCubes

    Number of different cubes among merged files.
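
    To show how the ZOrderFileStats fields of a ZOrderStats value can be combined, the sketch below sums the two input categories and computes the share of input files that were merged. The case classes are local stand-ins, the helpers are hypothetical, and the sketch assumes that `inputCubeFiles` and `inputOtherFiles` are disjoint and together cover every input file, which the descriptions above suggest but do not state outright.

```scala
// Local stand-ins mirroring the documented signatures.
case class ZOrderFileStats(num: Long, size: Long)
case class ZOrderStats(
    strategyName: String,
    inputCubeFiles: ZOrderFileStats,
    inputOtherFiles: ZOrderFileStats,
    inputNumCubes: Long,
    mergedFiles: ZOrderFileStats,
    numOutputCubes: Long,
    mergedNumCubes: Option[Long] = None)

object ZOrderStatsSketch {
  // Hypothetical helper: combined count and size of all input files,
  // assuming the two input categories do not overlap.
  def totalInput(s: ZOrderStats): ZOrderFileStats =
    ZOrderFileStats(
      num = s.inputCubeFiles.num + s.inputOtherFiles.num,
      size = s.inputCubeFiles.size + s.inputOtherFiles.size)

  // Hypothetical helper: fraction of input files merged by the operation.
  def mergedFileFraction(s: ZOrderStats): Double = {
    val total = totalInput(s).num
    if (total == 0) 0.0 else s.mergedFiles.num.toDouble / total
  }
}
```

    With 4 files already in the matching ZCube, 6 other input files, and 5 merged files, `mergedFileFraction` evaluates to 0.5.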

Value Members

  1. object FileSizeStatsWithHistogram extends Serializable
  2. object ZOrderFileStats extends Serializable
