org.bdgenomics.adam.rdd.read

AlignmentRecordRDDFunctions

class AlignmentRecordRDDFunctions extends ADAMSequenceDictionaryRDDAggregator[AlignmentRecord]

Linear Supertypes
ADAMSequenceDictionaryRDDAggregator[AlignmentRecord], Logging, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new AlignmentRecordRDDFunctions(rdd: RDD[AlignmentRecord])

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def adamAlignedRecordSave(args: SaveArgs): Boolean

  7. def adamBQSR(knownSnps: Broadcast[SnpTable], observationDumpFile: Option[String] = None): RDD[AlignmentRecord]

    Runs base quality score recalibration on a set of reads. Uses a table of known SNPs to mask true variation during the recalibration process.

    knownSnps

    A table of known SNPs to mask valid variants.

    observationDumpFile

    An optional local path to dump recalibration observations to.

    returns

    Returns an RDD of recalibrated reads.

  8. def adamCharacterizeTagValues(tag: String): Map[Any, Long]

    Calculates the set of unique attribute values that occur for the given tag, and the number of times each value occurs.

    tag

    The name of the optional field whose values are to be counted.

    returns

    A Map whose keys are the values of the tag, and whose values are the number of times each tag value occurs.
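
    The counting described above can be illustrated on plain Scala collections. This is a minimal sketch, not ADAM's implementation: `TaggedRead`, `TagValueCounts` and `countTagValues` are hypothetical names, and a read is reduced to a map of its optional attributes.

    ```scala
    // Hypothetical stand-in for a read: only its optional tag -> value attributes.
    final case class TaggedRead(attributes: Map[String, Any])

    object TagValueCounts {
      // Count how often each value of `tag` occurs; reads lacking the tag are skipped.
      def countTagValues(reads: Seq[TaggedRead], tag: String): Map[Any, Long] =
        reads
          .flatMap(_.attributes.get(tag))
          .groupBy(identity)
          .map { case (value, occurrences) => (value, occurrences.size.toLong) }
    }
    ```

    On an RDD the same shape would presumably be expressed as a distributed count followed by a collect to the driver, which is why the method returns a local Map.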

  9. def adamCharacterizeTags(): RDD[(String, Long)]

    Converts a set of records into an RDD that pairs each unique tag string found in the records with the number of records that carry that attribute.

    returns

    An RDD of attribute name / count pairs.

  10. def adamConvertToSAM(): (RDD[SAMRecordWritable], SAMFileHeader)

    Converts an RDD of ADAM read records into SAM records.

    returns

    Returns a SAM/BAM formatted RDD of reads, as well as the file header.

  11. def adamCountKmers(kmerLength: Int): RDD[(String, Long)]

    Cuts reads into _k_-mers, and then counts the number of occurrences of each _k_-mer.

    kmerLength

    The value of _k_ to use for cutting _k_-mers.

    returns

    Returns an RDD containing _k_-mer/count pairs.

    See also

    adamCountQmers
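
    To make the cutting step concrete, the following sketch counts _k_-mers over in-memory strings. It is an illustration under stated assumptions, not the ADAM code path: `KmerCounts` and `countKmers` are hypothetical names, reads are plain `String`s, and reads shorter than `kmerLength` are assumed to contribute nothing.

    ```scala
    object KmerCounts {
      // Cut each read into overlapping k-mers and tally occurrences across all reads.
      def countKmers(reads: Seq[String], kmerLength: Int): Map[String, Long] = {
        require(kmerLength > 0, "kmerLength must be positive")
        reads
          // `sliding` yields one short window for reads shorter than k; drop it.
          .flatMap(read => read.sliding(kmerLength).filter(_.length == kmerLength))
          .groupBy(identity)
          .map { case (kmer, hits) => (kmer, hits.size.toLong) }
      }
    }
    ```

    A read of length n yields n - k + 1 windows, so `"ACGT"` with k = 3 produces `ACG` and `CGT`.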

  12. def adamFilterRecordsWithTag(tagName: String): RDD[AlignmentRecord]

    Returns the subset of the AlignmentRecords which have an attribute with the given name.

    tagName

    The name of the attribute to filter on (should be two characters long).

    returns

    An RDD[AlignmentRecord] containing the subset of records with a tag that matches the given name.

  13. def adamFlagStat(): (FlagStatMetrics, FlagStatMetrics)

  14. def adamGetReadGroupDictionary(): RecordGroupDictionary

    Collects a dictionary summarizing the read groups in an RDD of AlignmentRecords.

    returns

    A dictionary describing the read groups in this RDD.

  15. def adamGetSequenceDictionary(): SequenceDictionary

    Aggregates a sequence dictionary from the individual reference sequences used in this dataset.

    returns

    A sequence dictionary describing the reference contigs in this dataset.

    Definition Classes
    ADAMSequenceDictionaryRDDAggregator
  16. def adamMarkDuplicates(): RDD[AlignmentRecord]

  17. def adamRePairReads(secondPairRdd: RDD[AlignmentRecord], validationStringency: ValidationStringency = ValidationStringency.LENIENT): RDD[AlignmentRecord]

    Reassembles read pairs from two sets of unpaired reads. The assumption is that the two sets were _originally_ paired together.

    secondPairRdd

    The RDD containing the second read from each pair.

    validationStringency

    How stringently to validate the reads.

    returns

    Returns an RDD with the pair information recomputed.

    Note

    The RDD that this is called on should be the RDD with the first read from the pair.

  18. def adamRealignIndels(consensusModel: ConsensusGenerator = new ConsensusGeneratorFromReads, isSorted: Boolean = false, maxIndelSize: Int = 500, maxConsensusNumber: Int = 30, lodThreshold: Double = 5.0, maxTargetSize: Int = 3000): RDD[AlignmentRecord]

    Realigns indels using a consensus-based heuristic.

    isSorted

    If the input data is sorted, setting this parameter to true avoids a second sort.

    maxIndelSize

    The size of the largest indel to use for realignment.

    maxConsensusNumber

    The maximum number of consensus sequences to realign against per target region.

    lodThreshold

    Log-odds threshold to use when realigning; realignments are only finalized if the log-odds threshold is exceeded.

    maxTargetSize

    The maximum width of a single target region for realignment.

    returns

    Returns an RDD of mapped reads which have been realigned.

    See also

    RealignIndels

  19. def adamSAMSave(filePath: String, asSam: Boolean = true): Unit

    Saves an RDD of ADAM read data into the SAM/BAM format.

    filePath

    Path to save files to.

    asSam

    Selects whether to save as SAM or BAM. The default value is true (save in SAM format).

  20. def adamSAMString: String

  21. def adamSave(args: ADAMSaveAnyArgs): Boolean

  22. def adamSaveAsFastq(fileName: String, fileName2Opt: Option[String] = None, sort: Boolean = false, validationStringency: ValidationStringency = ValidationStringency.LENIENT, persistLevel: Option[StorageLevel] = None): Unit

    Saves reads in FASTQ format.

    fileName

    Path to save files at.

    sort

    Whether to sort the FASTQ files by read name or not. Defaults to false. Sorting the output will recover pair order, if desired.

  23. def adamSaveAsPairedFastq(fileName1: String, fileName2: String, validationStringency: ValidationStringency = ValidationStringency.LENIENT, persistLevel: Option[StorageLevel] = None): Unit

    Saves these AlignmentRecords to two FASTQ files: one for the first mate in each pair, and the other for the second.

    fileName1

    Path at which to save a FASTQ file containing the first mate of each pair.

    fileName2

    Path at which to save a FASTQ file containing the second mate of each pair.

    validationStringency

    If and only if strict, throws an exception when any read in this RDD is not accompanied by its mate.
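
    The mate split and the strict check can be sketched without Spark. Everything here is an assumption made for illustration: `FastqRead`, `PairedFastq`, `toFastq` and `splitMates` are hypothetical names, strictness is modelled as a `Boolean` rather than a `ValidationStringency`, and the `/1` and `/2` name suffixes follow a common FASTQ convention that the source does not confirm.

    ```scala
    // Hypothetical, minimal view of a read for FASTQ output.
    final case class FastqRead(name: String, sequence: String, qualities: String, firstOfPair: Boolean)

    object PairedFastq {
      // One four-line FASTQ record: @name, bases, '+' separator, qualities.
      def toFastq(read: FastqRead): String = {
        val suffix = if (read.firstOfPair) "/1" else "/2"
        s"@${read.name}$suffix\n${read.sequence}\n+\n${read.qualities}"
      }

      // Returns (first-mate records, second-mate records).
      // When `strict` is true, fails if any read name lacks exactly one first and one second mate.
      def splitMates(reads: Seq[FastqRead], strict: Boolean): (Seq[String], Seq[String]) = {
        val orphans = reads.groupBy(_.name).collect {
          case (name, group) if group.size != 2 || group.count(_.firstOfPair) != 1 => name
        }
        if (strict && orphans.nonEmpty)
          throw new IllegalArgumentException(
            s"Reads without a mate: ${orphans.toSeq.sorted.mkString(", ")}")
        val (first, second) = reads.partition(_.firstOfPair)
        (first.map(toFastq), second.map(toFastq))
      }
    }
    ```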

  24. def adamSingleReadBuckets(): RDD[SingleReadBucket]

    Groups all reads by record group and read name.

    returns

    SingleReadBuckets with primary, secondary and unmapped reads.
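
    The grouping can be pictured with ordinary collections. This is a sketch under assumptions, not the actual `SingleReadBucket` type: `SimpleRead`, `Bucket`, `ReadBuckets` and their field names are hypothetical, and the split into primary, secondary and unmapped reads follows only the return description above.

    ```scala
    // Hypothetical, minimal view of a read.
    final case class SimpleRead(recordGroup: String, readName: String, mapped: Boolean, primary: Boolean)

    // Hypothetical bucket holding all reads that share a record group and read name.
    final case class Bucket(
      primaryMapped: Seq[SimpleRead],
      secondaryMapped: Seq[SimpleRead],
      unmapped: Seq[SimpleRead])

    object ReadBuckets {
      // Key each read by (record group, read name), then split each group by mapping status.
      def bucketReads(reads: Seq[SimpleRead]): Map[(String, String), Bucket] =
        reads.groupBy(r => (r.recordGroup, r.readName)).map { case (key, group) =>
          val (mapped, unmapped) = group.partition(_.mapped)
          val (primary, secondary) = mapped.partition(_.primary)
          key -> Bucket(primary, secondary, unmapped)
        }
    }
    ```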

  25. def adamSortReadsByReferencePosition(): RDD[AlignmentRecord]

  26. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  27. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  28. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  29. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  30. def filterByOverlappingRegion(query: ReferenceRegion): RDD[AlignmentRecord]

    Calculates the subset of the RDD whose AlignmentRecords overlap the corresponding query ReferenceRegion. Equality of the reference sequence (to which these are aligned) is tested by string equality of the names. AlignmentRecords whose 'getReadMapped' method returns 'false' are ignored.

    The end of the record against the reference sequence is calculated from the cigar string using the ADAMContext.referenceLengthFromCigar method.

    query

    The query region; only records that overlap this region are returned.

    returns

    The subset of AlignmentRecords (corresponding to either primary or secondary alignments) that overlap the query region.
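
    The overlap test can be sketched as follows. This is not the ADAMContext.referenceLengthFromCigar implementation: `Region`, `MappedRead` and `OverlapFilter` are hypothetical names, coordinates are assumed to be half-open `[start, end)`, and the set of reference-consuming CIGAR operators (M, D, N, = and X) is taken from the SAM specification.

    ```scala
    // Hypothetical query region, half-open: [start, end).
    final case class Region(referenceName: String, start: Long, end: Long)

    // Hypothetical, minimal view of an aligned read.
    final case class MappedRead(referenceName: String, start: Long, cigar: String, readMapped: Boolean)

    object OverlapFilter {
      private val CigarElement = "(\\d+)([MIDNSHP=X])".r

      // Sum the lengths of the CIGAR operators that advance along the reference.
      def referenceLength(cigar: String): Long =
        CigarElement.findAllMatchIn(cigar).collect {
          case m if "MDN=X".contains(m.group(2)) => m.group(1).toLong
        }.sum

      // Unmapped reads never overlap; reference names are compared as strings.
      def overlaps(read: MappedRead, query: Region): Boolean =
        read.readMapped &&
          read.referenceName == query.referenceName &&
          read.start < query.end &&
          query.start < read.start + referenceLength(read.cigar)

      def filterByOverlap(reads: Seq[MappedRead], query: Region): Seq[MappedRead] =
        reads.filter(overlaps(_, query))
    }
    ```

    Soft clips, hard clips and insertions do not move the alignment along the reference, so a read with CIGAR `5S10M2D3I4M` spans 16 reference bases.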

  31. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  32. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  33. def getSequenceRecordsFromElement(elem: AlignmentRecord): Set[SequenceRecord]

    For a single RDD element, returns zero or more sequence record elements.

    elem

    Element from which to extract sequence records.

    returns

    A set of sequence records.

    Definition Classes
    AlignmentRecordRDDFunctions → ADAMSequenceDictionaryRDDAggregator
  34. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  35. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  36. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  37. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  38. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  39. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  40. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  41. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  42. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  43. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  44. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  45. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  46. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  47. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  48. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  49. def maybeSaveBam(args: SaveArgs): Boolean

  50. def maybeSaveFastq(args: ADAMSaveAnyArgs): Boolean

  51. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  52. final def notify(): Unit

    Definition Classes
    AnyRef
  53. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  54. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  55. def toString(): String

    Definition Classes
    AnyRef → Any
  56. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  57. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  58. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
