com.nvidia.spark.rapids

GpuOutOfCoreSortIterator

case class GpuOutOfCoreSortIterator(iter: Iterator[ColumnarBatch], sorter: GpuSorter, targetSize: Long, opTime: GpuMetric, sortTime: GpuMetric, outputBatches: GpuMetric, outputRows: GpuMetric) extends Iterator[ColumnarBatch] with AutoCloseable with Product with Serializable

Sorts incoming batches of data, spilling if needed.
The algorithm is a modified external merge sort that makes multiple passes over large data: https://en.wikipedia.org/wiki/External_sorting#External_merge_sort
The main difference is that the data cannot be streamed during the merge. Instead, the data is divided into batches small enough that N of them can be merge sorted with the output still fitting within the target batch size. Because whole batches are merged rather than individual rows, the merged result is not guaranteed to be globally sorted. Ideally most of it is, but the first row of the next pending batch has to be used as the cutoff between data that is globally sorted and data that still needs to be merged with other batches. The globally sorted portion goes into a sorted queue, while the rest of the merged data is split and put back into a pending queue. The process repeats until there is enough data to output.
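
The queue handling described above can be sketched on plain in-memory data. This is only an illustration under stated assumptions, not the plugin's implementation: `OutOfCoreSortSketch`, `Batch`, `sortAll`, `targetRows` and `splitRows` are hypothetical names, batches are `Vector[Int]` rather than GPU columnar data, sizes are counted in rows rather than bytes, nothing is spilled, and the input is read eagerly.

```scala
import scala.collection.mutable

object OutOfCoreSortSketch {
  // Hypothetical stand-in for a sorted batch; the real iterator works on columnar GPU data.
  type Batch = Vector[Int]

  def sortAll(input: Iterator[Batch], targetRows: Int): Iterator[Batch] = {
    require(targetRows > 0, "targetRows must be positive")
    // Assumption: split so that two pending batches fit within the target size.
    val splitRows = math.max(1, targetRows / 2)
    // Pending batches ordered by their first row, smallest first.
    val pending =
      mutable.PriorityQueue.empty[Batch](Ordering.by[Batch, Int](_.head).reverse)

    // First pass: sort each incoming batch on its own and split it into small pieces.
    input.filter(_.nonEmpty).foreach { batch =>
      batch.sorted.grouped(splitRows).foreach(piece => pending.enqueue(piece))
    }

    new Iterator[Batch] {
      // Rows known to be globally sorted and ready for output.
      private val sorted = mutable.Queue.empty[Int]

      override def hasNext: Boolean = sorted.nonEmpty || pending.nonEmpty

      override def next(): Batch = {
        while (pending.nonEmpty && sorted.size < targetRows) {
          // Pull at least two batches, and enough rows to reach the target size.
          val toMerge = mutable.ArrayBuffer.empty[Batch]
          var rows = 0
          while (pending.nonEmpty && (toMerge.size < 2 || rows < targetRows)) {
            val b = pending.dequeue()
            toMerge += b
            rows += b.size
          }
          // Stand-in for the merge of already sorted batches.
          val merged = toMerge.flatten.sorted.toVector
          if (pending.isEmpty) {
            sorted ++= merged // nothing left to compare against
          } else {
            // The first row of the next pending batch is the cutoff.
            val cutoff = pending.head.head
            val (done, rest) = merged.span(_ <= cutoff)
            sorted ++= done
            // The remainder is split again and goes back to the pending queue.
            rest.grouped(splitRows).foreach(piece => pending.enqueue(piece))
          }
        }
        Vector.fill(math.min(targetRows, sorted.size))(sorted.dequeue())
      }
    }
  }
}
```

Each merge is guaranteed to release at least one row, because the smallest row of the merged data comes from a batch whose first row was no larger than the cutoff. That is what makes the loop terminate in this sketch.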

Linear Supertypes
  1. GpuOutOfCoreSortIterator
  2. Serializable
  3. Serializable
  4. Product
  5. Equals
  6. AutoCloseable
  7. Iterator
  8. TraversableOnce
  9. GenTraversableOnce
  10. AnyRef
  11. Any

Instance Constructors

  1. new GpuOutOfCoreSortIterator(iter: Iterator[ColumnarBatch], sorter: GpuSorter, targetSize: Long, opTime: GpuMetric, sortTime: GpuMetric, outputBatches: GpuMetric, outputRows: GpuMetric)

Type Members

  1. class GroupedIterator[B >: A] extends AbstractIterator[Seq[B]] with Iterator[Seq[B]]
    Definition Classes
    Iterator

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. def ++[B >: ColumnarBatch](that: ⇒ GenTraversableOnce[B]): Iterator[B]
    Definition Classes
    Iterator
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. def addString(b: StringBuilder): StringBuilder
    Definition Classes
    TraversableOnce
  6. def addString(b: StringBuilder, sep: String): StringBuilder
    Definition Classes
    TraversableOnce
  7. def addString(b: StringBuilder, start: String, sep: String, end: String): StringBuilder
    Definition Classes
    TraversableOnce
  8. def aggregate[B](z: ⇒ B)(seqop: (B, ColumnarBatch) ⇒ B, combop: (B, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  9. val alreadySortedIter: Iterator[SpillableColumnarBatch]

    An iterator over data that has already been sorted. The batches still contain the projected columns, which must be removed before the data is returned.

  10. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  11. def buffered: BufferedIterator[ColumnarBatch]
    Definition Classes
    Iterator
  12. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  13. def close(): Unit
    Definition Classes
    GpuOutOfCoreSortIterator → AutoCloseable
  14. def collect[B](pf: PartialFunction[ColumnarBatch, B]): Iterator[B]
    Definition Classes
    Iterator
    Annotations
    @migration
    Migration

    (Changed in version 2.8.0) collect has changed. The previous behavior can be reproduced with toSeq.

  15. def collectFirst[B](pf: PartialFunction[ColumnarBatch, B]): Option[B]
    Definition Classes
    TraversableOnce
  16. def contains(elem: Any): Boolean
    Definition Classes
    Iterator
  17. def copyToArray[B >: ColumnarBatch](xs: Array[B], start: Int, len: Int): Unit
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  18. def copyToArray[B >: ColumnarBatch](xs: Array[B]): Unit
    Definition Classes
    TraversableOnce → GenTraversableOnce
  19. def copyToArray[B >: ColumnarBatch](xs: Array[B], start: Int): Unit
    Definition Classes
    TraversableOnce → GenTraversableOnce
  20. def copyToBuffer[B >: ColumnarBatch](dest: Buffer[B]): Unit
    Definition Classes
    TraversableOnce
  21. def corresponds[B](that: GenTraversableOnce[B])(p: (ColumnarBatch, B) ⇒ Boolean): Boolean
    Definition Classes
    Iterator
  22. def count(p: (ColumnarBatch) ⇒ Boolean): Int
    Definition Classes
    TraversableOnce → GenTraversableOnce
  23. def drop(n: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  24. def dropWhile(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  25. def duplicate: (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  26. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  27. def exists(p: (ColumnarBatch) ⇒ Boolean): Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  28. def filter(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  29. def filterNot(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  30. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  31. def find(p: (ColumnarBatch) ⇒ Boolean): Option[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  32. def flatMap[B](f: (ColumnarBatch) ⇒ GenTraversableOnce[B]): Iterator[B]
    Definition Classes
    Iterator
  33. def fold[A1 >: ColumnarBatch](z: A1)(op: (A1, A1) ⇒ A1): A1
    Definition Classes
    TraversableOnce → GenTraversableOnce
  34. def foldLeft[B](z: B)(op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  35. def foldRight[B](z: B)(op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  36. def forall(p: (ColumnarBatch) ⇒ Boolean): Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  37. def foreach[U](f: (ColumnarBatch) ⇒ U): Unit
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  38. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  39. def grouped[B >: ColumnarBatch](size: Int): GroupedIterator[B]
    Definition Classes
    Iterator
  40. def hasDefiniteSize: Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  41. def hasNext: Boolean
    Definition Classes
    GpuOutOfCoreSortIterator → Iterator
  42. def indexOf[B >: ColumnarBatch](elem: B, from: Int): Int
    Definition Classes
    Iterator
  43. def indexOf[B >: ColumnarBatch](elem: B): Int
    Definition Classes
    Iterator
  44. def indexWhere(p: (ColumnarBatch) ⇒ Boolean, from: Int): Int
    Definition Classes
    Iterator
  45. def indexWhere(p: (ColumnarBatch) ⇒ Boolean): Int
    Definition Classes
    Iterator
  46. def isEmpty: Boolean
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  47. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  48. def isTraversableAgain: Boolean
    Definition Classes
    Iterator → GenTraversableOnce
  49. val iter: Iterator[ColumnarBatch]
  50. def length: Int
    Definition Classes
    Iterator
  51. def map[B](f: (ColumnarBatch) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  52. def max[B >: ColumnarBatch](implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  53. def maxBy[B](f: (ColumnarBatch) ⇒ B)(implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  54. def min[B >: ColumnarBatch](implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  55. def minBy[B](f: (ColumnarBatch) ⇒ B)(implicit cmp: Ordering[B]): ColumnarBatch
    Definition Classes
    TraversableOnce → GenTraversableOnce
  56. def mkString: String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  57. def mkString(sep: String): String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  58. def mkString(start: String, sep: String, end: String): String
    Definition Classes
    TraversableOnce → GenTraversableOnce
  59. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  60. def next(): ColumnarBatch
    Definition Classes
    GpuOutOfCoreSortIterator → Iterator
  61. def nonEmpty: Boolean
    Definition Classes
    TraversableOnce → GenTraversableOnce
  62. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  63. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  64. def onConcatOutput(): Unit
    Attributes
    protected
  65. def onFirstPassSplit(): Unit

    Callbacks designed for unit tests only. Don't do any heavy work inside them.

    Attributes
    protected
  66. def onMergeSortSplit(): Unit
    Attributes
    protected
  67. val opTime: GpuMetric
  68. val outputBatches: GpuMetric
  69. val outputRows: GpuMetric
  70. def padTo[A1 >: ColumnarBatch](len: Int, elem: A1): Iterator[A1]
    Definition Classes
    Iterator
  71. def partition(p: (ColumnarBatch) ⇒ Boolean): (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  72. def patch[B >: ColumnarBatch](from: Int, patchElems: Iterator[B], replaced: Int): Iterator[B]
    Definition Classes
    Iterator
  73. def product[B >: ColumnarBatch](implicit num: Numeric[B]): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  74. def reduce[A1 >: ColumnarBatch](op: (A1, A1) ⇒ A1): A1
    Definition Classes
    TraversableOnce → GenTraversableOnce
  75. def reduceLeft[B >: ColumnarBatch](op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce
  76. def reduceLeftOption[B >: ColumnarBatch](op: (B, ColumnarBatch) ⇒ B): Option[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  77. def reduceOption[A1 >: ColumnarBatch](op: (A1, A1) ⇒ A1): Option[A1]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  78. def reduceRight[B >: ColumnarBatch](op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  79. def reduceRightOption[B >: ColumnarBatch](op: (ColumnarBatch, B) ⇒ B): Option[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  80. def reversed: List[ColumnarBatch]
    Attributes
    protected[this]
    Definition Classes
    TraversableOnce
  81. def sameElements(that: Iterator[_]): Boolean
    Definition Classes
    Iterator
  82. def scanLeft[B](z: B)(op: (B, ColumnarBatch) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  83. def scanRight[B](z: B)(op: (ColumnarBatch, B) ⇒ B): Iterator[B]
    Definition Classes
    Iterator
  84. def seq: Iterator[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  85. def size: Int
    Definition Classes
    TraversableOnce → GenTraversableOnce
  86. def sizeHintIfCheap: Int
    Attributes
    protected[collection]
    Definition Classes
    GenTraversableOnce
  87. def slice(from: Int, until: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  88. def sliceIterator(from: Int, until: Int): Iterator[ColumnarBatch]
    Attributes
    protected
    Definition Classes
    Iterator
  89. def sliding[B >: ColumnarBatch](size: Int, step: Int): GroupedIterator[B]
    Definition Classes
    Iterator
  90. val sortTime: GpuMetric
  91. val sorter: GpuSorter
  92. def span(p: (ColumnarBatch) ⇒ Boolean): (Iterator[ColumnarBatch], Iterator[ColumnarBatch])
    Definition Classes
    Iterator
  93. def sum[B >: ColumnarBatch](implicit num: Numeric[B]): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
  94. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  95. def take(n: Int): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  96. def takeWhile(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  97. val targetSize: Long
  98. def to[Col[_]](implicit cbf: CanBuildFrom[Nothing, ColumnarBatch, Col[ColumnarBatch]]): Col[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  99. def toArray[B >: ColumnarBatch](implicit arg0: ClassTag[B]): Array[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  100. def toBuffer[B >: ColumnarBatch]: Buffer[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  101. def toIndexedSeq: IndexedSeq[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  102. def toIterable: Iterable[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  103. def toIterator: Iterator[ColumnarBatch]
    Definition Classes
    Iterator → GenTraversableOnce
  104. def toList: List[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  105. def toMap[T, U](implicit ev: <:<[ColumnarBatch, (T, U)]): Map[T, U]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  106. def toSeq: Seq[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  107. def toSet[B >: ColumnarBatch]: Set[B]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  108. def toStream: Stream[ColumnarBatch]
    Definition Classes
    Iterator → GenTraversableOnce
  109. def toString(): String
    Definition Classes
    Iterator → AnyRef → Any
  110. def toTraversable: Traversable[ColumnarBatch]
    Definition Classes
    Iterator → TraversableOnce → GenTraversableOnce
  111. def toVector: Vector[ColumnarBatch]
    Definition Classes
    TraversableOnce → GenTraversableOnce
  112. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  113. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  114. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  115. def withFilter(p: (ColumnarBatch) ⇒ Boolean): Iterator[ColumnarBatch]
    Definition Classes
    Iterator
  116. def zip[B](that: Iterator[B]): Iterator[(ColumnarBatch, B)]
    Definition Classes
    Iterator
  117. def zipAll[B, A1 >: ColumnarBatch, B1 >: B](that: Iterator[B], thisElem: A1, thatElem: B1): Iterator[(A1, B1)]
    Definition Classes
    Iterator
  118. def zipWithIndex: Iterator[(ColumnarBatch, Int)]
    Definition Classes
    Iterator
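
The protected no-op callbacks listed above (`onConcatOutput`, `onFirstPassSplit`, `onMergeSortSplit`) follow a common test-hook pattern: production code calls an empty method at an interesting point, and a test subclass overrides it to observe that the point was reached. The sketch below shows that pattern only. `ChunkingIterator`, `CountingChunkingIterator` and `onSplit` are hypothetical stand-ins, and when the real callbacks fire is not specified here.

```scala
// Hypothetical stand-in; it is not GpuOutOfCoreSortIterator and only mirrors the hook pattern.
class ChunkingIterator(input: Iterator[Vector[Int]], chunkRows: Int)
    extends Iterator[Vector[Int]] {
  require(chunkRows > 0, "chunkRows must be positive")

  // Lazily splits each batch into pieces of at most chunkRows rows.
  private val chunks: Iterator[Vector[Int]] = input.flatMap { batch =>
    if (batch.size > chunkRows) onSplit() // notify only when a split actually happens
    batch.grouped(chunkRows)
  }

  // No-op by default, so the hook costs nothing outside of tests.
  protected def onSplit(): Unit = {}

  override def hasNext: Boolean = chunks.hasNext
  override def next(): Vector[Int] = chunks.next()
}

// Test-only subclass that counts how often the hook fired.
class CountingChunkingIterator(input: Iterator[Vector[Int]], chunkRows: Int)
    extends ChunkingIterator(input, chunkRows) {
  var splits = 0
  override protected def onSplit(): Unit = splits += 1
}
```

Keeping the override to a counter increment is in line with the note that such callbacks should stay lightweight.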

Deprecated Value Members

  1. def /:[B](z: B)(op: (B, ColumnarBatch) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
    Annotations
    @deprecated
    Deprecated

    (Since version 2.12.10) Use foldLeft instead of /:

  2. def :\[B](z: B)(op: (ColumnarBatch, B) ⇒ B): B
    Definition Classes
    TraversableOnce → GenTraversableOnce
    Annotations
    @deprecated
    Deprecated

    (Since version 2.12.10) Use foldRight instead of :\
