object functions
Functions provided by Glow. These functions can be used with Spark's DataFrame API.
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
add_struct_fields(struct: Column, fields: Column*): Column
Adds fields to a struct.
- struct
The struct to which fields will be added
- fields
The new fields to add. The arguments must alternate between string-typed literal field names and field values.
- returns
A struct consisting of the input struct and the added fields
- Since
0.3.0
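The alternating name/value arguments can be sketched as follows. This is a minimal illustration, not a definitive usage: the column name stats and the added field names source and nSamples are hypothetical, and the import path io.projectglow.functions is an assumption about where this object lives.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit}
import io.projectglow.functions.add_struct_fields // assumed package of the functions object

object AddStructFieldsSketch {
  // Hypothetical schema: `stats` is an existing struct column in df.
  def withExtraFields(df: DataFrame): DataFrame =
    df.withColumn(
      "stats",
      add_struct_fields(
        col("stats"),
        lit("source"), lit("glow"), // string literal field name, then its value
        lit("nSamples"), lit(3)     // second name/value pair
      )
    )
}
```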
-
def
aggregate_by_index(arr: Column, initialValue: Column, update: (Column, Column) ⇒ Column, merge: (Column, Column) ⇒ Column): Column
-
def
aggregate_by_index(arr: Column, initialValue: Column, update: (Column, Column) ⇒ Column, merge: (Column, Column) ⇒ Column, evaluate: (Column) ⇒ Column): Column
Computes custom per-sample aggregates.
- arr
array of values.
- initialValue
the initial value
- update
update function
- merge
merge function
- evaluate
evaluate function
- returns
An array of aggregated values. The number of elements in the array is equal to the number of samples.
- Since
0.3.0
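A per-sample sum is probably the simplest instance of the update/merge pattern. The sketch below assumes a hypothetical array<double> column named values with one element per sample, assumes the function is used as an aggregate inside agg, and assumes the import path io.projectglow.functions.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{col, lit}
import io.projectglow.functions.aggregate_by_index // assumed package of the functions object

object AggregateByIndexSketch {
  // Hypothetical schema: `values` is array<double>, with samples in the
  // same order on every row.
  def perSampleSum(df: DataFrame): DataFrame =
    df.agg(
      aggregate_by_index(
        col("values"),
        lit(0.0),                                          // initial buffer value per sample
        (acc: Column, x: Column) => acc + x,               // update: fold one element into the buffer
        (left: Column, right: Column) => left + right      // merge: combine two partial buffers
      ).as("perSampleSum")
    )
}
```

Because addition is commutative, this sketch does not depend on the order in which the buffer and the element are passed to the update function.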
-
def
array_summary_stats(arr: Column): Column
Computes the minimum, maximum, mean, and standard deviation for an array of numerics.
- arr
An array of any numeric type
- returns
A struct containing double mean, stdDev, min, and max fields
- Since
0.3.0
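One way to read the returned struct is to select its fields by name. In this sketch the column name values is hypothetical and the import path io.projectglow.functions is an assumption.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.array_summary_stats // assumed package of the functions object

object ArraySummaryStatsSketch {
  // Hypothetical schema: `values` is an array of numerics.
  def summarize(df: DataFrame): DataFrame =
    df.withColumn("stats", array_summary_stats(col("values")))
      // Pull the four documented fields out of the returned struct.
      .select(col("stats.mean"), col("stats.stdDev"), col("stats.min"), col("stats.max"))
}
```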
-
def
array_to_dense_vector(arr: Column): Column
Converts an array of numerics into a spark.ml DenseVector.
- arr
The array of numerics
- returns
A spark.ml DenseVector
- Since
0.3.0
-
def
array_to_sparse_vector(arr: Column): Column
Converts an array of numerics into a spark.ml SparseVector.
- arr
The array of numerics
- returns
A spark.ml SparseVector
- Since
0.3.0
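The dense and sparse conversions take the same input, so a single sketch can show both. The column name values is hypothetical and the import path io.projectglow.functions is an assumption.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.{array_to_dense_vector, array_to_sparse_vector} // assumed package

object ArrayToVectorSketch {
  // Hypothetical schema: `values` is an array of numerics.
  def withVectors(df: DataFrame): DataFrame =
    df.withColumn("dense", array_to_dense_vector(col("values")))
      .withColumn("sparse", array_to_sparse_vector(col("values")))
}
```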
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
call_summary_stats(genotypes: Column): Column
Computes call summary statistics for an array of genotype structs. See :ref:variant-qc for more details.
- genotypes
The array of genotype structs with a calls field
- returns
A struct containing callRate, nCalled, nUncalled, nHet, nHomozygous, nNonRef, nAllelesCalled, alleleCounts, and alleleFrequencies fields. See :ref:variant-qc.
- Since
0.3.0
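A small sketch of per-variant call statistics follows. It assumes a hypothetical genotypes column of type array<struct<calls: array<int>>> and the import path io.projectglow.functions; only a few of the documented output fields are selected.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.call_summary_stats // assumed package of the functions object

object CallSummaryStatsSketch {
  // Hypothetical schema: `genotypes` is array<struct<calls: array<int>>>.
  def withCallStats(df: DataFrame): DataFrame =
    df.withColumn("callStats", call_summary_stats(col("genotypes")))
      .select(
        col("callStats.callRate"),
        col("callStats.nCalled"),
        col("callStats.nUncalled"),
        col("callStats.nHet")
      )
}
```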
-
def
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )
-
def
dp_summary_stats(genotypes: Column): Column
Computes summary statistics for the depth field from an array of genotype structs. See :ref:variant-qc.
- genotypes
An array of genotype structs with a depth field
- returns
A struct containing mean, stdDev, min, and max of genotype depths
- Since
0.3.0
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
expand_struct(struct: Column): Column
Promotes fields of a nested struct to top-level columns similar to using struct.* from SQL, but can be used in more contexts.
- struct
The struct to expand
- returns
Columns corresponding to fields of the input struct
- Since
0.3.0
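As a rough illustration of promoting struct fields, the sketch below selects an ordinary column alongside the expanded struct. The column names id and stats are hypothetical, the import path io.projectglow.functions is an assumption, and the Spark session is assumed to have Glow registered so that the expansion can be resolved.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.expand_struct // assumed package of the functions object

object ExpandStructSketch {
  // Hypothetical schema: `id` is any column, `stats` is a struct column.
  def flatten(df: DataFrame): DataFrame =
    // Each field of `stats` becomes its own top-level column next to `id`.
    df.select(col("id"), expand_struct(col("stats")))
}
```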
-
def
explode_matrix(matrix: Column): Column
Explodes a spark.ml Matrix (sparse or dense) into multiple arrays, one per row of the matrix.
- matrix
The spark.ml Matrix to explode
- returns
An array column in which each row is a row of the input matrix
- Since
0.3.0
-
def
finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
genotype_states(genotypes: Column): Column
Gets the number of alternate alleles for an array of genotype structs. Returns -1 if there are any -1s (no-calls) in the calls array.
- genotypes
An array of genotype structs with a calls field
- returns
An array of integers containing the number of alternate alleles in each call array
- Since
0.3.0
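The mapping from call arrays to alternate allele counts can be sketched like this. The column name genotypes and its array<struct<calls: array<int>>> type are hypothetical, and the import path io.projectglow.functions is an assumption.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.genotype_states // assumed package of the functions object

object GenotypeStatesSketch {
  // Hypothetical schema: `genotypes` is array<struct<calls: array<int>>>.
  // For example, calls [0, 1] carry one alternate allele and [1, 1] carry two,
  // while a call array containing -1 is reported as -1.
  def withStates(df: DataFrame): DataFrame =
    df.withColumn("states", genotype_states(col("genotypes")))
}
```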
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
gq_summary_stats(genotypes: Column): Column
Computes summary statistics about the genotype quality field for an array of genotype structs. See :ref:variant-qc.
- genotypes
The array of genotype structs with a conditionalQuality field
- returns
A struct containing mean, stdDev, min, and max of genotype qualities
- Since
0.3.0
-
def
hard_calls(probabilities: Column, numAlts: Column, phased: Column): Column
-
def
hard_calls(probabilities: Column, numAlts: Column, phased: Column, threshold: Double): Column
Converts an array of probabilities to hard calls. The probabilities are assumed to be diploid. See :ref:variant-data-transformations for more details.
- probabilities
The array of probabilities to convert
- numAlts
The number of alternate alleles
- phased
Whether the probabilities are phased. If phased, we expect 2 * numAlts values in the probabilities array. If unphased, we expect one probability per possible genotype.
- threshold
The minimum probability to make a call. If no probability falls into the range of [0, 1 - threshold] or [threshold, 1], a no-call (represented by -1s) will be emitted. If not provided, this parameter defaults to 0.9.
- returns
An array of hard calls
- Since
0.3.0
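A minimal sketch of the unphased, biallelic case follows. It assumes a hypothetical probabilities column holding the three genotype probabilities of one diploid genotype (one per possible genotype when numAlts is 1), and it assumes the import path io.projectglow.functions.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit}
import io.projectglow.functions.hard_calls // assumed package of the functions object

object HardCallsSketch {
  // Hypothetical schema: `probabilities` is array<double> for one unphased
  // diploid genotype at a site with a single alternate allele.
  def withHardCalls(df: DataFrame, threshold: Double): DataFrame =
    df.withColumn(
      "calls",
      hard_calls(col("probabilities"), lit(1), lit(false), threshold)
    )
}
```

Passing the threshold explicitly, rather than relying on the three-argument overload, keeps the no-call cutoff visible at the call site.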
-
def
hardy_weinberg(genotypes: Column): Column
Computes statistics relating to the Hardy-Weinberg equilibrium. See :ref:variant-qc for more details.
- genotypes
The array of genotype structs with a calls field
- returns
A struct containing two fields, hetFreqHwe (the expected heterozygous frequency according to Hardy-Weinberg equilibrium) and pValueHwe (the associated p-value)
- Since
0.3.0
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
lift_over_coordinates(contigName: Column, start: Column, end: Column, chainFile: String): Column
-
def
lift_over_coordinates(contigName: Column, start: Column, end: Column, chainFile: String, minMatchRatio: Double): Column
Performs liftover for the coordinates of a variant. To perform liftover of alleles and add additional metadata, see :ref:liftover.
- contigName
The current contig name
- start
The current start
- end
The current end
- chainFile
Location of the chain file on each node in the cluster
- minMatchRatio
Minimum fraction of bases that must remap to do liftover successfully. If not provided, defaults to 0.95.
- returns
A struct containing contigName, start, and end fields after liftover
- Since
0.3.0
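The sketch below shows one way the five-argument overload might be applied and its result unpacked. The input column names contigName, start, and end are hypothetical, the chain file path is supplied by the caller and must exist on every node, and the import path io.projectglow.functions is an assumption.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.lift_over_coordinates // assumed package of the functions object

object LiftOverSketch {
  // Hypothetical schema: contigName (string), start and end (integral) columns.
  def lifted(df: DataFrame, chainFile: String, minMatchRatio: Double): DataFrame =
    df.withColumn(
      "lifted",
      lift_over_coordinates(col("contigName"), col("start"), col("end"), chainFile, minMatchRatio)
    )
    // Read the documented fields of the returned struct under new names.
    .select(
      col("lifted.contigName").as("liftedContigName"),
      col("lifted.start").as("liftedStart"),
      col("lifted.end").as("liftedEnd")
    )
}
```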
-
def
linear_regression_gwas(genotypes: Column, phenotypes: Column, covariates: Column): Column
Performs a linear regression association test optimized for performance in a GWAS setting. See :ref:linear-regression for details.
- genotypes
A numeric array of genotypes
- phenotypes
A numeric array of phenotypes
- covariates
A spark.ml Matrix of covariates
- returns
A struct containing beta, standardError, and pValue fields. See :ref:linear-regression.
- Since
0.3.0
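A per-variant association test might be wired up as in the sketch below. The column names genotypes, phenotypes, and covariates are hypothetical and are assumed to already hold a numeric array, a numeric array, and a spark.ml Matrix respectively. The import path io.projectglow.functions is also an assumption.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.linear_regression_gwas // assumed package of the functions object

object LinearGwasSketch {
  // Hypothetical schema: `genotypes` and `phenotypes` are numeric arrays with
  // one element per sample, and `covariates` is a spark.ml Matrix column.
  def associate(df: DataFrame): DataFrame =
    df.withColumn(
      "lr",
      linear_regression_gwas(col("genotypes"), col("phenotypes"), col("covariates"))
    )
    // Unpack the documented result fields.
    .select(col("lr.beta"), col("lr.standardError"), col("lr.pValue"))
}
```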
-
def
logistic_regression_gwas(genotypes: Column, phenotypes: Column, covariates: Column, test: String): Column
Performs a logistic regression association test optimized for performance in a GWAS setting. See :ref:logistic-regression for more details.
- genotypes
A numeric array of genotypes
- phenotypes
A double array of phenotype values
- covariates
A spark.ml Matrix of covariates
- test
Which logistic regression test to use. Can be LRT or Firth
- returns
A struct containing beta, oddsRatio, waldConfidenceInterval, and pValue fields. See :ref:logistic-regression.
- Since
0.3.0
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
normalize_variant(contigName: Column, start: Column, end: Column, refAllele: Column, altAlleles: Column, refGenomePathString: String): Column
Normalizes the variant with a behavior similar to vt normalize or bcftools norm. Creates a StructType column including the normalized start, end, referenceAllele and alternateAlleles fields (whether they are changed or unchanged as the result of normalization) as well as a StructType field called normalizationStatus that contains the following fields:
changed: A boolean field indicating whether the variant data was changed as a result of normalization
errorMessage: An error message in case the attempt at normalizing the row hit an error. In this case, the changed field will be set to false. If no errors occur, this field will be null.
In case of an error, the start, end, referenceAllele and alternateAlleles fields in the generated struct will be null.
- contigName
The current contig name
- start
The current start
- end
The current end
- refAllele
The current reference allele
- altAlleles
The current array of alternate alleles
- refGenomePathString
A path to the reference genome .fasta file. The .fasta file must be accompanied with a .fai index file in the same folder.
- returns
A struct as explained above
- Since
0.3.0
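The nested result can be unpacked along the lines of the sketch below. The input column names are hypothetical, the reference genome path is supplied by the caller, and the import path io.projectglow.functions is an assumption.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.normalize_variant // assumed package of the functions object

object NormalizeVariantSketch {
  // Hypothetical schema: contigName, start, end, referenceAllele and
  // alternateAlleles columns describing one variant per row.
  def normalized(df: DataFrame, refGenomePath: String): DataFrame =
    df.withColumn(
      "norm",
      normalize_variant(
        col("contigName"), col("start"), col("end"),
        col("referenceAllele"), col("alternateAlleles"),
        refGenomePath // .fasta file with a .fai index in the same folder
      )
    )
    .select(
      col("contigName"),
      col("norm.start"), col("norm.end"),
      col("norm.referenceAllele"), col("norm.alternateAlleles"),
      // Status fields: whether the row changed, and any error message.
      col("norm.normalizationStatus.changed"),
      col("norm.normalizationStatus.errorMessage")
    )
}
```

Rows whose errorMessage is not null have null normalized fields, so a caller would typically inspect or filter on that column before using the coordinates.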
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
sample_call_summary_stats(genotypes: Column, refAllele: Column, alternateAlleles: Column): Column
Computes per-sample call summary statistics. See :ref:sample-qc for more details.
- genotypes
An array of genotype structs with a calls field
- refAllele
The reference allele
- alternateAlleles
An array of alternate alleles
- returns
A struct containing sampleId, callRate, nCalled, nUncalled, nHomRef, nHet, nHomVar, nSnp, nInsertion, nDeletion, nTransition, nTransversion, nSpanningDeletion, rTiTv, rInsertionDeletion, and rHetHomVar fields. See :ref:sample-qc.
- Since
0.3.0
-
def
sample_dp_summary_stats(genotypes: Column): Column
Computes per-sample summary statistics about the depth field in an array of genotype structs.
- genotypes
An array of genotype structs with a depth field
- returns
An array of structs where each struct contains mean, stdDev, min, and max of the genotype depths for a sample. If sampleId is present in a genotype, it will be propagated to the resulting struct as an extra field.
- Since
0.3.0
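Since the result is one array covering all samples, a common follow-up is to explode it into one row per sample. The sketch below assumes the function is used as an aggregate inside agg, a hypothetical genotypes column of type array<struct<depth: int>>, and the import path io.projectglow.functions.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, explode}
import io.projectglow.functions.sample_dp_summary_stats // assumed package of the functions object

object SampleDpStatsSketch {
  // Hypothetical schema: `genotypes` is array<struct<depth: int>>, with
  // samples in the same order on every row.
  def perSampleDepth(df: DataFrame): DataFrame =
    df.agg(sample_dp_summary_stats(col("genotypes")).as("dpStats"))
      // One output row per sample, in array order.
      .select(explode(col("dpStats")).as("sample"))
      .select(col("sample.mean"), col("sample.min"), col("sample.max"))
}
```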
-
def
sample_gq_summary_stats(genotypes: Column): Column
Computes per-sample summary statistics about the genotype quality field in an array of genotype structs.
- genotypes
An array of genotype structs with a conditionalQuality field
- returns
An array of structs where each struct contains mean, stdDev, min, and max of the genotype qualities for a sample. If sampleId is present in a genotype, it will be propagated to the resulting struct as an extra field.
- Since
0.3.0
-
def
subset_struct(struct: Column, fields: String*): Column
Selects fields from a struct.
- struct
Struct from which to select fields
- fields
Fields to select
- returns
A struct containing only the indicated fields
- Since
0.3.0
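Note that, unlike add_struct_fields, the field names here are plain strings rather than columns. In this sketch the column name stats and the field names mean and max are hypothetical, and the import path io.projectglow.functions is an assumption.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.subset_struct // assumed package of the functions object

object SubsetStructSketch {
  // Hypothetical schema: `stats` is a struct with at least `mean` and `max` fields.
  def keepMeanAndMax(df: DataFrame): DataFrame =
    df.withColumn("stats", subset_struct(col("stats"), "mean", "max"))
}
```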
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
def
vector_to_array(vector: Column): Column
Converts a spark.ml Vector (sparse or dense) to an array of doubles.
- vector
Vector to convert
- returns
An array of doubles
- Since
0.3.0
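A short sketch of the conversion follows. The column name features is hypothetical and is assumed to hold spark.ml vectors, and the import path io.projectglow.functions is an assumption. The explicit import avoids ambiguity with any other function of the same name that may be in scope.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import io.projectglow.functions.vector_to_array // assumed package of the functions object

object VectorToArraySketch {
  // Hypothetical schema: `features` is a spark.ml Vector column (sparse or dense).
  def withFeatureArray(df: DataFrame): DataFrame =
    df.withColumn("featureArray", vector_to_array(col("features")))
}
```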
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )