object VCFSchemaInferrer
Infers the schema of a VCF file from its headers.
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##(): Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- val VCF_HEADER_COUNT_KEY: String
- val VCF_HEADER_DESCRIPTION_KEY: String
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- def finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def getInfoFieldStruct(headerLine: VCFInfoHeaderLine): StructField
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def headerLinesFromSchema(schema: StructType): Seq[VCFHeaderLine]
Returns the VCF header lines that correspond to a variant schema. Each flattened info field (those fields whose names start with "INFO_") will be converted to an info header line, and fields from the "genotype" struct will be converted to format header lines.
If the count type is available in the schema metadata (which is always the case if the original schema was generated by inferSchema), that will be the returned count type. If not, we provide a best guess count type according to the following schema possibilities:
- If it's a boolean field, return count = 0, as is the convention for flags
- If it's a non-array field, return count = 1
- If it's an array field, return count = UNBOUNDED
- schema
The schema of the variant DataFrame
- returns
VCF header lines that can be inferred from the input schema
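The count-type fallback described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `guessCount` and the placeholder key `"vcf_header_count"` are hypothetical names, and storing the count as a string in the field metadata is an assumption (the real key is `VCF_HEADER_COUNT_KEY`, whose value is not shown on this page). It needs `spark-sql` on the classpath.

```scala
import org.apache.spark.sql.types._

object CountGuessSketch {
  // Hypothetical helper, not part of VCFSchemaInferrer.
  // Prefers a count recorded in the field metadata, then falls back on the field's type.
  def guessCount(field: StructField, countKey: String): String = {
    if (field.metadata.contains(countKey)) {
      // Assumption: the count is stored as a string under countKey
      field.metadata.getString(countKey)
    } else {
      field.dataType match {
        case BooleanType => "0"          // flags carry no value
        case _: ArrayType => "UNBOUNDED" // arrays have no known length
        case _ => "1"                    // any other non-array field
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val key = "vcf_header_count" // placeholder key, assumed for illustration
    val withCount = StructField(
      "INFO_AF",
      ArrayType(DoubleType),
      metadata = new MetadataBuilder().putString(key, "A").build())

    println(guessCount(withCount, key))                                      // A
    println(guessCount(StructField("INFO_DB", BooleanType), key))            // 0
    println(guessCount(StructField("INFO_DP", IntegerType), key))            // 1
    println(guessCount(StructField("INFO_AC", ArrayType(IntegerType)), key)) // UNBOUNDED
  }
}
```

Checking the metadata first means a schema that came from inferSchema keeps its original count, while a hand-built schema still gets a usable header line.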
- def inferGenotypeSchema(includeSampleIds: Boolean, formatHeaders: Seq[VCFFormatHeaderLine]): StructType
- def inferSchema(includeSampleIds: Boolean, flattenInfoFields: Boolean, header: VCFHeader): StructType
- def inferSchema(includeSampleIds: Boolean, flattenInfoFields: Boolean, infoHeaders: Seq[VCFInfoHeaderLine], formatHeaders: Seq[VCFFormatHeaderLine]): StructType
- includeSampleIds
If true, a sampleId column will be added to the genotype fields
- flattenInfoFields
If true, each INFO field will be promoted to a column. If false, they will instead be stored in a string -> string map
- returns
A StructType describing the schema
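As a rough usage sketch, the two settings of flattenInfoFields can be compared by inferring a schema from a pair of hand-built header lines. The import path `io.projectglow.vcf` is an assumption (the package is not shown on this page), the header lines `DP` and `GQ` are made-up examples, and the object name `InferSchemaSketch` is hypothetical. It needs this library, htsjdk, and `spark-sql` on the classpath.

```scala
import htsjdk.variant.vcf.{VCFFormatHeaderLine, VCFHeaderLineType, VCFInfoHeaderLine}
import io.projectglow.vcf.VCFSchemaInferrer // assumed package; adjust to your build

object InferSchemaSketch {
  // Example header lines, invented for illustration
  val infoHeaders: Seq[VCFInfoHeaderLine] =
    Seq(new VCFInfoHeaderLine("DP", 1, VCFHeaderLineType.Integer, "Total depth"))
  val formatHeaders: Seq[VCFFormatHeaderLine] =
    Seq(new VCFFormatHeaderLine("GQ", 1, VCFHeaderLineType.Integer, "Genotype quality"))

  // flattenInfoFields = true: each INFO field is promoted to its own column
  val flat = VCFSchemaInferrer.inferSchema(true, true, infoHeaders, formatHeaders)

  // flattenInfoFields = false: INFO fields are kept in a string -> string map
  val nested = VCFSchemaInferrer.inferSchema(true, false, infoHeaders, formatHeaders)

  def main(args: Array[String]): Unit = {
    flat.printTreeString()
    nested.printTreeString()
  }
}
```

Printing both trees side by side is a quick way to see which columns a downstream query can rely on before reading any data.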
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- def typesForHeader(line: VCFCompoundHeaderLine): Seq[DataType]
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )