abstract class GpuArrowPythonWriter extends GpuArrowWriter
Linear Supertypes
- GpuArrowPythonWriter
- GpuArrowWriter
- AutoCloseable
- AnyRef
- Any
Abstract Value Members
-
abstract
def
writeUDFs(dataOut: DataOutputStream): Unit
- Attributes
- protected
Concrete Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
close(): Unit
- Definition Classes
- GpuArrowWriter → AutoCloseable
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
val
inputSchema: StructType
- Definition Classes
- GpuArrowPythonWriter → GpuArrowWriter
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
val
maxBatchSize: Long
- Definition Classes
- GpuArrowPythonWriter → GpuArrowWriter
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
reset(): Unit
This is designed to allow reusing the writer options.
- Definition Classes
- GpuArrowWriter
-
final
def
start(dataOut: DataOutputStream): Unit
Makes the writer ready to write data; should be called before writing any batch.
- Definition Classes
- GpuArrowWriter
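A minimal usage sketch of the writer lifecycle, assuming a concrete `GpuArrowWriter` subclass, an open `DataOutputStream`, and an iterator of `ColumnarBatch`es (`dataOut` and `batches` are stand-ins, not names from this API):

```scala
// Hedged sketch: `writer`, `dataOut`, and `batches` are assumed to exist.
// start() must be called once before writing; close() releases resources.
writer.start(dataOut)          // make the writer ready to write data
batches.foreach(writer.write)  // stream each ColumnarBatch out
writer.close()                 // finish the stream; reset() would allow
                               // reusing the writer options afterwards
```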
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
final
def
write(batch: ColumnarBatch): Unit
- Definition Classes
- GpuArrowWriter
-
final
def
writeAndClose(batch: ColumnarBatch): Unit
- Definition Classes
- GpuArrowWriter
-
def
writeCommand(dataOut: DataOutputStream, confs: Map[String, String]): Unit
-
final
def
writeEmptyIteratorOnCpu(dataOut: DataOutputStream, arrowSchema: Schema): Unit
This is for writing the empty partition. In this case the CPU will still send the schema to Python workers by calling the "start" API of the Java Arrow writer, but the GPU will send out nothing, leading to an IPC error. It is not easy to replicate what Spark does on the GPU, because the C++ Arrow writer used by the GPU only sends out the schema if there is some data, and it does not expose a "start" API to do this. So here we leverage the Java Arrow writer to do something similar to Spark. This is OK because sending out the schema has nothing to do with the GPU. (Most code is copied from Spark.)
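The schema-only handshake described above can be sketched with the Java Arrow writer; this is not the actual implementation, and `arrowSchema`, `allocator`, and `dataOut` stand in for the method's real inputs:

```scala
// Hedged sketch: send only the Arrow schema, with no record batches,
// so Python workers receive a well-formed (empty) IPC stream.
import java.nio.channels.Channels
import org.apache.arrow.vector.VectorSchemaRoot
import org.apache.arrow.vector.ipc.ArrowStreamWriter

val root = VectorSchemaRoot.create(arrowSchema, allocator)
val writer = new ArrowStreamWriter(root, null, Channels.newChannel(dataOut))
writer.start()  // the "start" API: writes the schema message to the stream
writer.end()    // ends the stream without writing any record batches
```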