class SerializeConcatHostBuffersDeserializeBatch extends Serializable with Logging
Class that is used to broadcast results (a contiguous host batch) to executors.
This is instantiated in the driver, serialized to an output stream provided by Spark
to broadcast, and deserialized on the executor. Both the driver's and executor's copies
are cleaned via GC. Because Spark closes AutoCloseable broadcast results after spilling
to disk, this class does not subclass AutoCloseable. Instead we implement a closeInternal
method only to be triggered via GC.
- Annotations: @SerialVersionUID()
Inheritance
- SerializeConcatHostBuffersDeserializeBatch
- Logging
- Serializable
- AnyRef
- Any
Instance Constructors
- new SerializeConcatHostBuffersDeserializeBatch(data: HostConcatResult, output: Seq[Attribute], numRows: Int, dataLen: Long)
- data
  HostConcatResult populated for a broadcast that has columns; otherwise it is null. It is transient because we want the executor to deserialize its data from Spark's torrent-backed input stream.
- output
  used to find the schema for this broadcast batch
- numRows
  number of rows for this broadcast batch
- dataLen
  size in bytes for this broadcast batch
Value Members
- final def !=(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def ##(): Int
  Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  Definition Classes: Any
- def batch: SpillableColumnarBatch
- def clone(): AnyRef
  Attributes: protected[lang]
  Definition Classes: AnyRef
  Annotations: @throws( ... ) @native()
- def closeInternal(): Unit
  This method is meant to be called only from finalize; it is not a regular AutoCloseable.close because we do not want Spark to close batchInternal when it spills the broadcast block's host torrent data.
  Reference: https://github.com/NVIDIA/spark-rapids/issues/8602
  Public for tests.
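The closeInternal/finalize arrangement described above can be sketched as a small, self-contained pattern. This is illustrative only: `GcManagedResource` and `isClosed` are hypothetical names standing in for the plugin's buffer-holding class, and a boolean flag stands in for the actual GPU/host resources.

```scala
// Minimal sketch of GC-triggered cleanup: the class deliberately does NOT
// extend AutoCloseable, because Spark closes AutoCloseable broadcast values
// after spilling them to disk. Cleanup is reachable only via finalize.
class GcManagedResource {
  @volatile private var closed = false

  // Not a regular AutoCloseable.close; intended to be driven by finalize
  // (exposed publicly here, as in the real class, only for tests).
  def closeInternal(): Unit = synchronized {
    if (!closed) {
      closed = true
      // release buffers here
    }
  }

  def isClosed: Boolean = closed

  override def finalize(): Unit = {
    try closeInternal() finally super.finalize()
  }
}
```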
- var data: HostConcatResult
- var dataLen: Long
- def dataSize: Long
- def doReadObject(in: ObjectInputStream): Unit
  Deserializes a broadcast result in the host into data, numRows and dataLen.
  Public for unit tests.
- def doWriteObject(out: ObjectOutputStream): Unit
  doWriteObject is invoked both from the driver, when it writes a collected broadcast result to a stream in order to torrent-broadcast it to executors, and from the executor, when the MemoryStore evicts a "broadcast_[id]" block to make room in host memory.
  The driver will have data populated on construction, and the executor will deserialize the object and, as part of the deserialization, invoke doReadObject. This populates data before any task has had a chance to call .batch on this class. If batchInternal is defined we are in the executor and there is no work to be done: this broadcast has been materialized on the GPU/RapidsBufferCatalog and is completely managed by the plugin.
  Public for unit tests.
  - out
    the stream to write to
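doReadObject/doWriteObject follow Java's custom-serialization hook pattern: a transient field is written and rebuilt explicitly rather than by default serialization. A minimal sketch of that pattern, with hypothetical names (`BroadcastPayload` is not the plugin's class, and a byte array stands in for HostConcatResult):

```scala
import java.io._

// The @transient field is skipped by default serialization; the private
// writeObject/readObject hooks (found reflectively by Java serialization)
// stream it explicitly, mirroring how doWriteObject/doReadObject repopulate
// data, numRows and dataLen on the executor side.
@SerialVersionUID(1L)
class BroadcastPayload(@transient var data: Array[Byte], var numRows: Int)
    extends Serializable {

  private def writeObject(out: ObjectOutputStream): Unit = {
    out.writeInt(numRows)
    out.writeInt(data.length)
    out.write(data)
  }

  private def readObject(in: ObjectInputStream): Unit = {
    numRows = in.readInt()
    data = new Array[Byte](in.readInt())
    in.readFully(data)
  }
}
```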
- final def eq(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- def finalize(): Unit
  Definition Classes: SerializeConcatHostBuffersDeserializeBatch → AnyRef
  Annotations: @nowarn()
- final def getClass(): Class[_]
  Definition Classes: AnyRef → Any
  Annotations: @native()
- def hashCode(): Int
  Definition Classes: AnyRef → Any
  Annotations: @native()
- def hostBatch: ColumnarBatch
  Create host columnar batches from either serialized buffers or a device columnar batch. This method can be safely called on both the driver node and executor nodes. For now, it is used on the driver side for reusing GPU broadcast results on the CPU.
  NOTE: The caller is responsible for releasing these host columnar batches.
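The ownership contract in that NOTE (the caller releases the batch) can be sketched with a loan pattern. `FakeHostBatch` and `withHostBatch` are hypothetical stand-ins so the example is self-contained; the real caller would close the ColumnarBatch returned by hostBatch the same way.

```scala
// Stand-in for a host-side ColumnarBatch whose resources the caller owns.
final class FakeHostBatch(val numRows: Int) extends AutoCloseable {
  var closed = false
  override def close(): Unit = closed = true
}

// Loan pattern: run the body, then release the batch even if the body throws,
// satisfying the "caller is responsible for releasing" contract.
def withHostBatch[T](batch: FakeHostBatch)(body: FakeHostBatch => T): T =
  try body(batch) finally batch.close()
```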
- protected def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
  Definition Classes: Logging
- protected def initializeLogIfNecessary(isInterpreter: Boolean): Unit
  Definition Classes: Logging
- final def isInstanceOf[T0]: Boolean
  Definition Classes: Any
- protected def isTraceEnabled(): Boolean
  Definition Classes: Logging
- protected def log: Logger
  Definition Classes: Logging
- protected def logDebug(msg: ⇒ String, throwable: Throwable): Unit
  Definition Classes: Logging
- protected def logDebug(msg: ⇒ String): Unit
  Definition Classes: Logging
- protected def logError(msg: ⇒ String, throwable: Throwable): Unit
  Definition Classes: Logging
- protected def logError(msg: ⇒ String): Unit
  Definition Classes: Logging
- protected def logInfo(msg: ⇒ String, throwable: Throwable): Unit
  Definition Classes: Logging
- protected def logInfo(msg: ⇒ String): Unit
  Definition Classes: Logging
- protected def logName: String
  Definition Classes: Logging
- protected def logTrace(msg: ⇒ String, throwable: Throwable): Unit
  Definition Classes: Logging
- protected def logTrace(msg: ⇒ String): Unit
  Definition Classes: Logging
- protected def logWarning(msg: ⇒ String, throwable: Throwable): Unit
  Definition Classes: Logging
- protected def logWarning(msg: ⇒ String): Unit
  Definition Classes: Logging
- final def ne(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- final def notify(): Unit
  Definition Classes: AnyRef
  Annotations: @native()
- final def notifyAll(): Unit
  Definition Classes: AnyRef
  Annotations: @native()
- var numRows: Int
- final def synchronized[T0](arg0: ⇒ T0): T0
  Definition Classes: AnyRef
- def toString(): String
  Definition Classes: AnyRef → Any
- final def wait(): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... ) @native()