package com.twitter.scalding

Type Members

  1. trait AbsoluteDuration extends Duration with Ordered[AbsoluteDuration]

  2. case class AbsoluteDurationList (parts: List[AbsoluteDuration]) extends AbstractDurationList[AbsoluteDuration] with AbsoluteDuration with Product

  3. class AbstractDurationList [T <: Duration] extends Duration

    attributes: abstract
  4. class AccessMode extends AnyRef

    attributes: sealed abstract
  5. class Args extends AnyRef

  6. class BaseGlobifier extends AnyRef

  7. class BufferOp [I, T, X] extends BaseOperation[Any] with Buffer[Any]

  8. trait CascadingLocal extends Mode

  9. trait CaseClassPackers extends LowPriorityTuplePackers

  10. class CoGroupBuilder extends GroupBuilder

    Builder classes used internally to implement coGroups (joins).

  11. class CoGrouped2 [K, V, W, Result] extends KeyedList[K, Result] with Serializable

    Represents a result of CoGroup operation on two Grouped pipes.

  12. case class DateRange (start: RichDate, end: RichDate) extends Product

    Represents a closed interval of time.

  13. class DateRangeSerializer extends Serializer[DateRange]

  14. case class DayGlob (pat: String, tz: TimeZone) extends BaseGlobifier with Product

  15. case class Days (cnt: Int, tz: TimeZone) extends Duration with Product

  16. trait DefaultDateRangeJob extends Job

    Sets up an implicit dateRange to use in your sources and an implicit timezone.

  17. trait DelimitedScheme extends Source

    Mix this in for delimited schemes such as TSV or one-separated values. By default, TSV is given.

  18. class Duration extends AnyRef

    attributes: abstract
  19. case class DurationList (parts: List[Duration]) extends AbstractDurationList[Duration] with Product

  20. class ExtremumAggregator extends BaseOperation[Tuple] with Aggregator[Tuple]

  21. class ExtremumBy extends AggregateBy

  22. class ExtremumFunctor extends Functor

  23. trait FieldConversions extends LowPriorityFieldConversions

  24. class FileSource extends Source

    This is a base class for file-based sources.

  25. class FilterFunction [T] extends BaseOperation[Any] with Filter[Any]

  26. class FixedPathSource extends FileSource

    attributes: abstract
  27. class FlatMapFunction [S, T] extends BaseOperation[Any] with Function[Any]

  28. class FoldAggregator [T, X] extends BaseOperation[X] with Aggregator[X]

  29. class FoldFunctor [X] extends Functor

    This handles the mapReduceMap work on the map-side of the operation.

  30. trait GeneratedConversions extends LowPriorityConversions

  31. case class Globifier (pat: String, tz: TimeZone) extends BaseGlobifier with Product

  32. class GroupBuilder extends Serializable

  33. class Grouped [K, T] extends KeyedList[K, T] with Serializable

    Represents a grouping which is the transition from map to reduce phase in hadoop.

  34. trait HadoopMode extends Mode

  35. case class HadoopTest (config: Configuration, buffers: Map[Source, Buffer[Tuple]]) extends Mode with HadoopMode with TestMode with Product

  36. case class Hdfs (strict: Boolean, config: Configuration) extends Mode with HadoopMode with Product

  37. case class HourGlob (pat: String, tz: TimeZone) extends BaseGlobifier with Product

  38. case class Hours (cnt: Int) extends Duration with AbsoluteDuration with Product

  39. class InnerCoGrouped2 [K, V, W] extends CoGrouped2[K, V, W, (V, W)]

  40. class IntegralComparator extends Comparator[AnyRef] with Hasher[AnyRef] with Serializable

  41. class InvalidJoinModeException extends Exception

  42. class InvalidSourceException extends RuntimeException

    Thrown when validateTaps fails.

  43. case class IterableSource [T] (iter: Iterable[T], inFields: Fields, set: TupleSetter[T]) extends Source with Mappable[T] with Product

    Allows an iterable object defined in the job (on the submitter) to be used within a Job as you would a Pipe/RichPipe.

  44. class Job extends TupleConversions with FieldConversions

  45. class JobTest extends TupleConversions

    This class is used to construct unit tests for scalding jobs.

  46. trait JoinAlgorithms extends AnyRef

  47. class JoinMode extends AnyRef

    attributes: sealed abstract
  48. trait KeyedList [K, T] extends AnyRef

    Represents sharded lists of items of type T

  49. class KryoHadoopSerialization extends KryoSerialization

  50. class LeftCoGrouped2 [K, V, W] extends CoGrouped2[K, V, W, (V, Option[W])]

  51. class ListSerializer [T <: List[_]] extends Serializer[T]

  52. case class Local (strict: Boolean) extends Mode with CascadingLocal with Product

  53. trait LowPriorityConversions extends AnyRef

  54. trait LowPriorityFieldConversions extends AnyRef

  55. trait LowPriorityTuplePackers extends TupleConversions

  56. trait LowPriorityTupleUnpackers extends TupleConversions

  57. class LtOrdering [T] extends Ordering[T] with Serializable

  58. class MRMAggregator [T, X, U] extends BaseOperation[Tuple] with Aggregator[Tuple]

  59. class MRMBy [T, X, U] extends AggregateBy

    MapReduceMapBy Class

  60. class MRMFunctor [T, X] extends FoldFunctor[X]

    This handles the mapReduceMap work on the map-side of the operation.

  61. class MapFunction [S, T] extends BaseOperation[Any] with Function[Any]

  62. class MapSerializer [T <: Map[_, _]] extends Serializer[T]

  63. trait Mappable [T] extends Source

    Usually as soon as we open a source, we read and do some mapping operation on a single column or set of columns.

  64. class MappedOrdering [B, T] extends Ordering[T] with Serializable

  65. class MemoryTap [In, Out] extends Tap[Properties, In, Out]

  66. class MemoryTupleEntryCollector extends TupleEntryCollector

  67. case class Millisecs (cnt: Int) extends Duration with AbsoluteDuration with Product

  68. case class Minutes (cnt: Int) extends Duration with AbsoluteDuration with Product

  69. class Mode extends AnyRef

    There are three ways to run jobs. sourceStrictness is set to true.

  70. case class MonthGlob (pat: String, tz: TimeZone) extends BaseGlobifier with Product

  71. case class Months (cnt: Int, tz: TimeZone) extends Duration with Product

  72. class MostRecentGoodSource extends TimePathedSource

    attributes: abstract
  73. class OrderedConstructorConverter [T] extends TupleConverter[T]

  74. class OrderedTuplePacker [T] extends TuplePacker[T]

    This just blindly uses the first public constructor with the same arity as the fields size

  75. case class Osv (p: String, f: Fields) extends FixedPathSource with DelimitedScheme with Product

    One separated value (commonly used by Pig)

  76. class OuterCoGrouped2 [K, V, W] extends CoGrouped2[K, V, W, (Option[V], Option[W])]

  77. class PipeTExtensions extends Serializable

  78. class ReflectionSetter [T] extends TupleSetter[T]

  79. class ReflectionTupleConverter [T] extends TupleConverter[T]

  80. class ReflectionTuplePacker [T] extends TuplePacker[T]

    Packs a tuple into any object with set methods.

  81. class ReflectionTupleUnpacker [T] extends TupleUnpacker[T]

  82. case class RichDate (value: Date) extends Ordered[RichDate] with Product

  83. class RichDateSerializer extends Serializer[RichDate]

    Below are some serializers for objects in the scalding project.

  84. class RichPipe extends Serializable with JoinAlgorithms

  85. class RightCoGrouped2 [K, V, W] extends CoGrouped2[K, V, W, (Option[V], W)]

  86. class ScaldingMultiSourceTap extends MultiSourceTap[Tap[JobConf, org.apache.hadoop.mapred.RecordReader[_, _], org.apache.hadoop.mapred.OutputCollector[_, _]], JobConf, org.apache.hadoop.mapred.RecordReader[_, _]]

  87. class ScanLeftIterator [T, U] extends Iterator[U] with Serializable

    Scala 2.

  88. class ScriptJob extends Job

  89. case class Seconds (cnt: Int) extends Duration with AbsoluteDuration with Product

  90. case class SequenceFile (p: String, f: Fields) extends FixedPathSource with SequenceFileScheme with Product

  91. trait SequenceFileScheme extends Source

  92. class SingletonSerializer [T] extends Serializer[T]

  93. class Source extends Serializable

    Every source must have a correct toString method.

  94. case class Test (buffers: Map[Source, Buffer[Tuple]]) extends Mode with TestMode with CascadingLocal with Product

    Memory only testing for unit tests

  95. trait TestMode extends Mode

  96. case class TextLine (p: String) extends FixedPathSource with TextLineScheme with Product

  97. trait TextLineScheme extends Source with Mappable[String]

    The fields here are ('offset, 'line)

  98. class TimePathedSource extends FileSource

    This will automatically produce a globbed version of the given path.

  99. class Tool extends Configured with Tool

  100. case class Tsv (p: String, f: Fields, sh: Boolean, wh: Boolean) extends FixedPathSource with DelimitedScheme with Product

    Tab separated value source

  101. trait TupleArity extends AnyRef

    Mixed in to both TupleConverter and TupleSetter to improve arity safety of cascading jobs before we run anything on Hadoop.

  102. trait TupleConversions extends GeneratedConversions

  103. class TupleConverter [T] extends Serializable with TupleArity

    attributes: abstract
  104. class TupleGetter [T] extends Serializable

    attributes: abstract
  105. class TuplePacker [T] extends Serializable

    attributes: abstract
  106. class TupleSetter [-T] extends Serializable with TupleArity

    attributes: abstract
  107. class TupleUnpacker [T] extends Serializable

    attributes: abstract
  108. class TupleUnpackerException extends Exception

  109. class TypedPipe [T] extends Serializable

    Represents a phase in a distributed computation on an input data source. Wraps a cascading Pipe object and holds the transformation done up until that point.

  110. trait UtcDateRangeJob extends Job with DefaultDateRangeJob

  111. class VectorSerializer [T] extends Serializer[Vector[T]]

  112. case class Weeks (cnt: Int, tz: TimeZone) extends Duration with Product

  113. case class Years (cnt: Int, tz: TimeZone) extends Duration with Product

Value Members

  1. object AbsoluteDuration extends AnyRef

  2. object Args extends AnyRef

    The Args class does simple command-line parsing.

  3. object CascadingUtils extends AnyRef

  4. object CommonReduceFunctions extends Serializable

  5. object DateOps extends AnyRef

    Holds some conversion functions for dealing with strings as RichDate objects.

  6. object Dsl extends FieldConversions with TupleConversions

    This object has all the implicit functions and values that are used to make the scalding DSL.

  7. object Duration extends AnyRef

    Represents millisecond-based duration (non-calendar based): seconds, minutes, hours. calField should be a java.

  8. object Grouped extends AnyRef

  9. object InnerJoinMode extends JoinMode with Product

  10. object Job extends AnyRef

  11. object JobTest extends AnyRef

  12. object Mode extends AnyRef

  13. object OuterJoinMode extends JoinMode with Product

  14. object Read extends AccessMode with Product

  15. object RichDate extends AnyRef

    RichDate adds some nice convenience functions to the Java date/calendar classes. We commonly do Date/Time work in analysis jobs, so having these operations convenient is very helpful.

  16. object RichPipe extends Serializable

  17. object TDsl extends Serializable

    Implicits for the type-safe DSL. import TDsl.

  18. object TimePathedSource extends AnyRef

  19. object Tool extends AnyRef

  20. object TuplePacker extends CaseClassPackers

    Base class for classes which pack a Tuple into a serializable object.

  21. object TupleUnpacker extends LowPriorityTupleUnpackers

    Base class for objects which unpack an object into a tuple.

  22. object TypedPipe extends Serializable

    Factory methods for TypedPipe.

  23. object Write extends AccessMode with Product

  24. package examples

  25. package mathematics
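The Args entry above is described as doing simple command-line parsing. As a rough illustration of what such parsing can look like, here is a plain-Scala sketch that assumes a `--key value ...` convention, with tokens that appear before any key stored under the empty string. That convention, and the names `ArgsSketch` and `parse`, are assumptions made for this example; the listing does not show the real Args implementation or its method signatures.

```scala
// Hypothetical sketch of "--key value" style parsing; not Scalding's Args class.
object ArgsSketch {

  // Groups each token under the most recent "--key".
  // Tokens seen before any key are stored under the empty-string key.
  // A key with no following values maps to an empty list.
  def parse(tokens: Seq[String]): Map[String, List[String]] = {
    val (parsed, _) = tokens.foldLeft((Map.empty[String, List[String]], "")) {
      case ((m, _), tok) if tok.startsWith("--") =>
        val key = tok.drop(2)
        // Register the key even if no values follow it.
        (m.updated(key, m.getOrElse(key, Nil)), key)
      case ((m, key), tok) =>
        // Append the value to the current key's list.
        (m.updated(key, m.getOrElse(key, Nil) :+ tok), key)
    }
    parsed
  }
}
```

A call such as `ArgsSketch.parse(Seq("--input", "in.tsv", "--output", "out.tsv"))` yields a map from `input` and `output` to single-element lists, which a job could then look up by key.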