Packages

org.apache.flinkx.api

KeyedStream

class KeyedStream[T, K] extends DataStream[T]

Annotations
@Public()
Linear Supertypes
DataStream[T], AnyRef, Any

Instance Constructors

  1. new KeyedStream(javaStream: flink.streaming.api.datastream.KeyedStream[T, K])

Type Members

  1. class IntervalJoin[IN1, IN2, KEY] extends AnyRef

    Perform a join over a time interval.

    IN1

    The type parameter of the elements in the first stream

    IN2

    The type parameter of the elements in the second stream

    Annotations
    @PublicEvolving()
  2. class IntervalJoined[IN1, IN2, KEY] extends AnyRef

    IntervalJoined is a container for two streams that have keys for both sides as well as the time boundaries over which elements should be joined.

    IN1

    Input type of elements from the first stream

    IN2

    Input type of elements from the second stream

    KEY

    The type of the key

    Annotations
    @PublicEvolving()

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def addSink(fun: (T) ⇒ Unit): DataStreamSink[T]

    Adds the given sink to this DataStream. Only streams with sinks added will be executed once the StreamExecutionEnvironment.execute(...) method is called.

    Definition Classes
    DataStream
  5. def addSink(sinkFunction: SinkFunction[T]): DataStreamSink[T]

    Adds the given sink to this DataStream. Only streams with sinks added will be executed once the StreamExecutionEnvironment.execute(...) method is called.

    Definition Classes
    DataStream
  6. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  7. def asQueryableState(queryableStateName: String, stateDescriptor: ReducingStateDescriptor[T]): QueryableStateStream[K, T]

    Publishes the keyed stream as a queryable ReducingState instance.

    queryableStateName

    Name under which to publish the queryable state instance

    stateDescriptor

    State descriptor to create state instance from

    returns

    Queryable state instance

    Annotations
    @PublicEvolving()
  8. def asQueryableState(queryableStateName: String, stateDescriptor: ValueStateDescriptor[T]): QueryableStateStream[K, T]

    Publishes the keyed stream as a queryable ValueState instance.

    queryableStateName

    Name under which to publish the queryable state instance

    stateDescriptor

    State descriptor to create state instance from

    returns

    Queryable state instance

    Annotations
    @PublicEvolving()
  9. def asQueryableState(queryableStateName: String): QueryableStateStream[K, T]

    Publishes the keyed stream as a queryable ValueState instance.

    queryableStateName

    Name under which to publish the queryable state instance

    returns

    Queryable state instance

    Annotations
    @PublicEvolving()
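
    For illustration, a minimal sketch of publishing a keyed stream as queryable state (the stream contents and the state name "latest-value" are arbitrary examples, not part of this API's contract):

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val keyed = env.fromElements(("a", 1), ("b", 2)).keyBy(_._1)

    // Publish the latest element per key as a queryable ValueState.
    // External clients can then query it by job ID, state name, and key.
    val queryable = keyed.asQueryableState("latest-value")
    ```

    Note that queryable state must be enabled on the cluster for external queries to succeed.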
  10. def assignAscendingTimestamps(extractor: (T) ⇒ Long): DataStream[T]

    Assigns timestamps to the elements in the data stream and periodically creates watermarks to signal event time progress.

    This method is a shortcut for data streams where the element timestamps are known to be monotonically ascending within each parallel stream. In that case, the system can generate watermarks automatically and perfectly by tracking the ascending timestamps.

    For cases where the timestamps are not monotonically increasing, use the more general methods assignTimestampsAndWatermarks(AssignerWithPeriodicWatermarks) and assignTimestampsAndWatermarks(AssignerWithPunctuatedWatermarks).

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
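
    As a sketch (the Reading type and its values are hypothetical), extracting ascending timestamps from a field of the events:

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    // Hypothetical event type whose ts field ascends within each partition.
    case class Reading(sensor: String, ts: Long, value: Double)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val readings = env.fromElements(
      Reading("s1", 1000L, 0.5),
      Reading("s1", 2000L, 0.7))

    // Use the event's own field as the timestamp; watermarks are derived
    // automatically from the ascending timestamps.
    val withTs = readings.assignAscendingTimestamps(_.ts)
    ```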
  11. def assignTimestampsAndWatermarks(watermarkStrategy: WatermarkStrategy[T]): DataStream[T]

    Assigns timestamps to the elements in the data stream and generates watermarks to signal event time progress.

    Assigns timestamps to the elements in the data stream and generates watermarks to signal event time progress. The given WatermarkStrategy is used to create a TimestampAssigner and an org.apache.flink.api.common.eventtime.WatermarkGenerator.

    For each event in the data stream, the TimestampAssigner#extractTimestamp(Object, long) method is called to assign an event timestamp, and the WatermarkGenerator#onEvent(Object, long, WatermarkOutput) method is called.

    Periodically (at the interval defined by ExecutionConfig#getAutoWatermarkInterval()), the WatermarkGenerator#onPeriodicEmit(WatermarkOutput) method is called.

    Common watermark generation patterns can be found as static methods in the org.apache.flink.api.common.eventtime.WatermarkStrategy class.

    Definition Classes
    DataStream
  12. def broadcast(broadcastStateDescriptors: MapStateDescriptor[_, _]*): BroadcastStream[T]

    Sets the partitioning of the DataStream so that the output elements are broadcast to every parallel instance of the next operation. In addition, it implicitly creates as many broadcast states as the specified descriptors, which can be used to store the elements of the stream.

    broadcastStateDescriptors

    the descriptors of the broadcast states to create.

    returns

    A BroadcastStream which can be used in the DataStream.connect(BroadcastStream) to create a BroadcastConnectedStream for further processing of the elements.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  13. def broadcast: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are broadcast to every parallel instance of the next component.

    Definition Classes
    DataStream
  14. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  15. def coGroup[T2](otherStream: DataStream[T2]): CoGroupedStreams[T, T2]

    Creates a co-group operation. See CoGroupedStreams for an example of how the keys and window can be specified.

    Definition Classes
    DataStream
  16. def connect[R](broadcastStream: BroadcastStream[R]): BroadcastConnectedStream[T, R]

    Creates a new BroadcastConnectedStream by connecting the current DataStream or KeyedStream with a BroadcastStream.

    The latter can be created using the broadcast(MapStateDescriptor[]) method.

    The resulting stream can be further processed using the broadcastConnectedStream.process(myFunction) method, where myFunction can be either a org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction or a org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction depending on the current stream being a KeyedStream or not.

    broadcastStream

    The broadcast stream with the broadcast state to be connected with this stream.

    returns

    The BroadcastConnectedStream.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  17. def connect[T2](dataStream: DataStream[T2]): ConnectedStreams[T, T2]

    Creates a new ConnectedStreams by connecting DataStream outputs of different type with each other. The DataStreams connected using this operator can be used with CoFunctions.

    Definition Classes
    DataStream
  18. def countWindow(size: Long): WindowedStream[T, K, GlobalWindow]

    Windows this KeyedStream into tumbling count windows.

    size

    The size of the windows in number of elements.

  19. def countWindow(size: Long, slide: Long): WindowedStream[T, K, GlobalWindow]

    Windows this KeyedStream into sliding count windows.

    size

    The size of the windows in number of elements.

    slide

    The slide interval in number of elements.
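
    As an illustrative sketch (the element values and window sizes are arbitrary), both count-window variants on a keyed stream:

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val counts = env.fromElements(("a", 1), ("b", 2), ("a", 3)).keyBy(_._1)

    // Tumbling: fire once 100 elements have arrived for a key.
    val tumbling = counts.countWindow(100).sum(1)

    // Sliding: evaluate the last 100 elements every 10 elements.
    val sliding = counts.countWindow(100, 10).sum(1)
    ```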

  20. def countWindowAll(size: Long): AllWindowedStream[T, GlobalWindow]

    Windows this DataStream into tumbling count windows.

    Note: This operation is inherently non-parallel since all elements have to pass through the same operator instance. (Only for special cases, such as aligned time windows, is it possible to perform this operation in parallel.)

    size

    The size of the windows in number of elements.

    Definition Classes
    DataStream
  21. def countWindowAll(size: Long, slide: Long): AllWindowedStream[T, GlobalWindow]

    Windows this DataStream into sliding count windows.

    Note: This operation is inherently non-parallel since all elements have to pass through the same operator instance. (Only for special cases, such as aligned time windows, is it possible to perform this operation in parallel.)

    size

    The size of the windows in number of elements.

    slide

    The slide interval in number of elements.

    Definition Classes
    DataStream
  22. def dataType: TypeInformation[T]

    Returns the TypeInformation for the elements of this DataStream.

    Definition Classes
    DataStream
  23. def disableChaining(): DataStream[T]

    Turns off chaining for this operator so thread co-location will not be used as an optimization. Chaining can also be turned off for the whole job via StreamExecutionEnvironment.disableOperatorChaining(); however, that is not advised for performance reasons.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  24. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  25. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  26. def executeAndCollect(jobExecutionName: String, limit: Int): List[T]

    Triggers the distributed execution of the streaming dataflow and returns an iterator over the elements of the given DataStream.

    The DataStream application is executed in the regular distributed manner on the target environment, and the events from the stream are polled back to this application process and thread through Flink's REST API.

    Definition Classes
    DataStream
  27. def executeAndCollect(limit: Int): List[T]

    Triggers the distributed execution of the streaming dataflow and returns an iterator over the elements of the given DataStream.

    The DataStream application is executed in the regular distributed manner on the target environment, and the events from the stream are polled back to this application process and thread through Flink's REST API.

    Definition Classes
    DataStream
  28. def executeAndCollect(jobExecutionName: String): CloseableIterator[T]

    Triggers the distributed execution of the streaming dataflow and returns an iterator over the elements of the given DataStream.

    The DataStream application is executed in the regular distributed manner on the target environment, and the events from the stream are polled back to this application process and thread through Flink's REST API.

    IMPORTANT The returned iterator must be closed to free all cluster resources.

    Definition Classes
    DataStream
  29. def executeAndCollect(): CloseableIterator[T]

    Triggers the distributed execution of the streaming dataflow and returns an iterator over the elements of the given DataStream.

    The DataStream application is executed in the regular distributed manner on the target environment, and the events from the stream are polled back to this application process and thread through Flink's REST API.

    IMPORTANT The returned iterator must be closed to free all cluster resources.

    Definition Classes
    DataStream
  30. def executionConfig: ExecutionConfig

    Returns the execution config.

    Definition Classes
    DataStream
  31. def executionEnvironment: StreamExecutionEnvironment

    Returns the StreamExecutionEnvironment associated with this data stream.

    Definition Classes
    DataStream
  32. def filter(fun: (T) ⇒ Boolean): DataStream[T]

    Creates a new DataStream that contains only the elements satisfying the given filter predicate.

    Definition Classes
    DataStream
  33. def filter(filter: FilterFunction[T]): DataStream[T]

    Creates a new DataStream that contains only the elements satisfying the given filter predicate.

    Definition Classes
    DataStream
  34. def filterWithState[S](fun: (T, Option[S]) ⇒ (Boolean, Option[S]))(implicit arg0: TypeInformation[S]): DataStream[T]

    Creates a new DataStream that contains only the elements satisfying the given stateful filter predicate. To use state partitioning, a key must be defined using .keyBy(..), in which case an independent state will be kept per key.

    Note that the user state object needs to be serializable.
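
    As a sketch of the stateful filter (the stream contents are arbitrary), keeping only the first occurrence of each key:

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val keyed = env.fromElements("a", "b", "a").keyBy(identity)

    // Keep only the first occurrence per key: the Boolean state marks
    // that the key has already been seen (the state is None initially).
    val firstPerKey = keyed.filterWithState[Boolean] { (_, seen) =>
      (seen.isEmpty, Some(true))
    }
    ```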

  35. def flatMap[R](fun: (T) ⇒ TraversableOnce[R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element and flattening the results.

    Definition Classes
    DataStream
  36. def flatMap[R](fun: (T, Collector[R]) ⇒ Unit)(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element and flattening the results.

    Definition Classes
    DataStream
  37. def flatMap[R](flatMapper: FlatMapFunction[T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element and flattening the results.

    Definition Classes
    DataStream
  38. def flatMapWithState[R, S](fun: (T, Option[S]) ⇒ (TraversableOnce[R], Option[S]))(implicit arg0: TypeInformation[R], arg1: TypeInformation[S]): DataStream[R]

    Creates a new DataStream by applying the given stateful function to every element and flattening the results. To use state partitioning, a key must be defined using .keyBy(..), in which case an independent state will be kept per key.

    Note that the user state object needs to be serializable.

  39. def forward: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are forwarded to the local subtask of the next component (whenever possible).

    Definition Classes
    DataStream
  40. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  41. def getKeyType: TypeInformation[K]

    Gets the type of the key by which this stream is keyed.

    Annotations
    @Internal()
  42. def getSideOutput[X](tag: OutputTag[X])(implicit arg0: TypeInformation[X]): DataStream[X]
    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  43. def global: DataStream[T]

    Sets the partitioning of the DataStream so that the output values all go to the first instance of the next processing operator. Use this setting with care since it might cause a serious performance bottleneck in the application.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  44. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  45. def intervalJoin[OTHER](otherStream: KeyedStream[OTHER, K]): IntervalJoin[T, OTHER, K]

    Join elements of this KeyedStream with elements of another KeyedStream over a time interval that can be specified with IntervalJoin.between.

    OTHER

    Type parameter of elements in the other stream

    otherStream

    The other keyed stream to join this keyed stream with

    returns

    An instance of IntervalJoin with this keyed stream and the other keyed stream

    Annotations
    @PublicEvolving()
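
    A hedged sketch of an interval join (the tuple values, bounds, and output type are arbitrary; the Time import shown assumes the flink-streaming windowing Time class and may differ across Flink versions):

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._
    import org.apache.flink.streaming.api.functions.co.ProcessJoinFunction
    import org.apache.flink.streaming.api.windowing.time.Time
    import org.apache.flink.util.Collector

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val left  = env.fromElements(("k", 1)).keyBy(_._1)
    val right = env.fromElements(("k", 2)).keyBy(_._1)

    // Join each left element with right elements whose timestamps fall in
    // [leftTs - 2 ms, leftTs + 1 ms].
    val joined = left
      .intervalJoin(right)
      .between(Time.milliseconds(-2), Time.milliseconds(1))
      .process(new ProcessJoinFunction[(String, Int), (String, Int), Int] {
        override def processElement(l: (String, Int), r: (String, Int),
            ctx: ProcessJoinFunction[(String, Int), (String, Int), Int]#Context,
            out: Collector[Int]): Unit =
          out.collect(l._2 + r._2)
      })
    ```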
  46. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  47. def iterate[R, F](stepFunction: (ConnectedStreams[T, F]) ⇒ (DataStream[F], DataStream[R]), maxWaitTimeMillis: Long)(implicit arg0: TypeInformation[F]): DataStream[R]

    Initiates an iterative part of the program that creates a loop by feeding back data streams. To create a streaming iteration the user needs to define a transformation that creates two DataStreams. The first one is the output that will be fed back to the start of the iteration and the second is the output stream of the iterative part.

    The input stream of the iterate operator and the feedback stream will be treated as a ConnectedStreams where the input is connected with the feedback stream.

    This allows the user to distinguish standard input from feedback inputs.

    stepfunction: initialStream => (feedback, output)

    The user must set the max waiting time for the iteration head. If no data is received within the set time, the stream terminates. If this parameter is set to 0, the iteration sources will run indefinitely, so the job must be killed manually to stop.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  48. def iterate[R](stepFunction: (DataStream[T]) ⇒ (DataStream[T], DataStream[R]), maxWaitTimeMillis: Long = 0): DataStream[R]

    Initiates an iterative part of the program that creates a loop by feeding back data streams. To create a streaming iteration the user needs to define a transformation that creates two DataStreams. The first one is the output that will be fed back to the start of the iteration and the second is the output stream of the iterative part.

    stepfunction: initialStream => (feedback, output)

    A common pattern is to use output splitting to create the feedback and output DataStreams; see the side-output support of ProcessFunction on DataStream.

    By default a DataStream with an iteration will never terminate, but the user can use the maxWaitTime parameter to set a maximum waiting time for the iteration head. If no data is received within that time, the stream terminates.

    The parallelism of the feedback stream must match the parallelism of the original stream. Please refer to the setParallelism method for parallelism modification.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  49. def javaStream: flink.streaming.api.datastream.DataStream[T]

    Gets the underlying java DataStream object.

    Definition Classes
    DataStream
  50. def join[T2](otherStream: DataStream[T2]): JoinedStreams[T, T2]

    Creates a join operation. See JoinedStreams for an example of how the keys and window can be specified.

    Definition Classes
    DataStream
  51. def keyBy[K](fun: KeySelector[T, K])(implicit arg0: TypeInformation[K]): KeyedStream[T, K]

    Groups the elements of a DataStream by the given K key to be used with grouped operators like grouped reduce or grouped aggregations.

    Definition Classes
    DataStream
  52. def keyBy[K](fun: (T) ⇒ K)(implicit arg0: TypeInformation[K]): KeyedStream[T, K]

    Groups the elements of a DataStream by the given K key to be used with grouped operators like grouped reduce or grouped aggregations.

    Definition Classes
    DataStream
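
    For illustration, a minimal sketch of keying a stream of pairs by a field (the data and the trailing sum aggregation are arbitrary examples):

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val words = env.fromElements(("apple", 1), ("pear", 1), ("apple", 1))

    // Key by the first tuple field; all elements with the same word are
    // routed to the same parallel instance and share keyed state.
    val keyed: KeyedStream[(String, Int), String] = words.keyBy(_._1)
    val totals = keyed.sum(1)
    ```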
  53. def map[R](mapper: MapFunction[T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element of this DataStream.

    Definition Classes
    DataStream
  54. def map[R](fun: (T) ⇒ R)(implicit arg0: TypeInformation[R]): DataStream[R]

    Creates a new DataStream by applying the given function to every element of this DataStream.

    Definition Classes
    DataStream
  55. def mapWithState[R, S](fun: (T, Option[S]) ⇒ (R, Option[S]))(implicit arg0: TypeInformation[R], arg1: TypeInformation[S]): DataStream[R]

    Creates a new DataStream by applying the given stateful function to every element of this DataStream. To use state partitioning, a key must be defined using .keyBy(..), in which case an independent state will be kept per key.

    Note that the user state object needs to be serializable.
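
    As a sketch of the stateful map (the stream contents are arbitrary), a per-key running sum where the state is None for the first element of each key:

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val keyed = env.fromElements(("a", 1L), ("a", 2L), ("b", 5L)).keyBy(_._1)

    // Emit (key, running total) and store the new total as the state.
    val runningSums = keyed.mapWithState[(String, Long), Long] {
      (in, state: Option[Long]) =>
        val total = state.getOrElse(0L) + in._2
        ((in._1, total), Some(total))
    }
    ```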

  56. def max(field: String): DataStream[T]

    Applies an aggregation that gives the current maximum of the data stream at the given field by the given key. An independent aggregate is kept per key.

    field

    In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).

  57. def max(position: Int): DataStream[T]

    Applies an aggregation that gives the current maximum of the data stream at the given position by the given key. An independent aggregate is kept per key.

    position

    The field position in the data points on which to perform the aggregation. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).

  58. def maxBy(field: String): DataStream[T]

    Applies an aggregation that gives the current maximum element of the data stream by the given field by the given key. An independent aggregate is kept per key. In case of a tie, the first element with the maximal value is returned.

    field

    In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).

  59. def maxBy(position: Int): DataStream[T]

    Applies an aggregation that gives the current maximum element of the data stream by the given position by the given key. An independent aggregate is kept per key. In case of a tie, the first element with the maximal value is returned.

    position

    The field position in the data points on which to perform the aggregation. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).

  60. def min(field: String): DataStream[T]

    Applies an aggregation that gives the current minimum of the data stream at the given field by the given key. An independent aggregate is kept per key.

    field

    In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).

  61. def min(position: Int): DataStream[T]

    Applies an aggregation that gives the current minimum of the data stream at the given position by the given key. An independent aggregate is kept per key.

    position

    The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).

  62. def minBy(field: String): DataStream[T]

    Applies an aggregation that gives the current minimum element of the data stream by the given field by the given key. An independent aggregate is kept per key. In case of a tie, the first element with the minimal value is returned.

    field

    In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).

  63. def minBy(position: Int): DataStream[T]

    Applies an aggregation that gives the current minimum element of the data stream by the given position by the given key. An independent aggregate is kept per key. In case of a tie, the first element with the minimal value is returned.

    position

    The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).

  64. def minResources: ResourceSpec

    Returns the minimum resources of this operation.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  65. def name(name: String): DataStream[T]

    Sets the name of the current data stream. This name is used by the visualization and logging during runtime.

    returns

    The named operator

    Definition Classes
    DataStream
  66. def name: String

    Gets the name of the current data stream. This name is used by the visualization and logging during runtime.

    returns

    Name of the stream.

    Definition Classes
    DataStream
  67. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  68. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  69. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  70. def parallelism: Int

    Returns the parallelism of this operation.

    Definition Classes
    DataStream
  71. def partitionCustom[K](partitioner: Partitioner[K], fun: (T) ⇒ K)(implicit arg0: TypeInformation[K]): DataStream[T]

    Partitions a DataStream on the key returned by the selector, using a custom partitioner. This method takes the key selector to get the key to partition on, and a partitioner that accepts the key type.

    Note: This method works only on single field keys, i.e. the selector cannot return tuples of fields.

    Definition Classes
    DataStream
  72. def preferredResources: ResourceSpec

    Returns the preferred resources of this operation.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  73. def print(sinkIdentifier: String): DataStreamSink[T]

    Writes a DataStream to the standard output stream (stdout). For each element of the DataStream the result of AnyRef.toString() is written.

    sinkIdentifier

    The string to prefix the output with.

    returns

    The closed DataStream.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  74. def print(): DataStreamSink[T]

    Writes a DataStream to the standard output stream (stdout). For each element of the DataStream the result of AnyRef.toString() is written.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  75. def printToErr(sinkIdentifier: String): DataStreamSink[T]

    Writes a DataStream to the standard error stream (stderr).

    For each element of the DataStream the result of AnyRef.toString() is written.

    sinkIdentifier

    The string to prefix the output with.

    returns

    The closed DataStream.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  76. def printToErr(): DataStreamSink[T]

    Writes a DataStream to the standard error stream (stderr).

    For each element of the DataStream the result of AnyRef.toString() is written.

    returns

    The closed DataStream.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  77. def process[R](keyedProcessFunction: KeyedProcessFunction[K, T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Applies the given KeyedProcessFunction on the input stream, thereby creating a transformed output stream.

    The function will be called for every element in the input stream and can produce zero or more output elements. In contrast to the DataStream#flatMap(FlatMapFunction) function, this function can also query the time and set timers. When reacting to the firing of a set timer, the function can directly emit elements and/or register further timers.

    keyedProcessFunction

    The KeyedProcessFunction that is called for each element in the stream.

    Annotations
    @PublicEvolving()
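
    For illustration, a minimal sketch of a KeyedProcessFunction that uses both element processing and timers (the function, key type, and values are hypothetical; it assumes the flink-scala-api implicit serializers):

    ```scala
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction
    import org.apache.flink.util.Collector
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    // Hypothetical function: emits a record for each element and registers a
    // processing-time timer one second later; when the timer fires, it emits
    // a marker for the current key.
    class DelayedMarker extends KeyedProcessFunction[String, (String, Int), String] {
      override def processElement(
          value: (String, Int),
          ctx: KeyedProcessFunction[String, (String, Int), String]#Context,
          out: Collector[String]): Unit = {
        out.collect(s"seen ${value._1}")
        ctx.timerService().registerProcessingTimeTimer(
          ctx.timerService().currentProcessingTime() + 1000L)
      }

      override def onTimer(
          timestamp: Long,
          ctx: KeyedProcessFunction[String, (String, Int), String]#OnTimerContext,
          out: Collector[String]): Unit =
        out.collect(s"timer fired for ${ctx.getCurrentKey}")
    }

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val marked: DataStream[String] =
      env.fromElements(("a", 1), ("b", 2)).keyBy(_._1).process(new DelayedMarker)
    ```

    Running the job requires a Flink execution environment; the snippet only assembles the dataflow graph.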
  78. def rebalance: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are distributed evenly to the next component.

    Definition Classes
    DataStream
  79. def reduce(fun: (T, T) ⇒ T): DataStream[T]

    Creates a new DataStream by reducing the elements of this DataStream using an associative reduce function.

    Creates a new DataStream by reducing the elements of this DataStream using an associative reduce function. An independent aggregate is kept per key.

  80. def reduce(reducer: ReduceFunction[T]): DataStream[T]

    Creates a new DataStream by reducing the elements of this DataStream using an associative reduce function.

    Creates a new DataStream by reducing the elements of this DataStream using an associative reduce function. An independent aggregate is kept per key.
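
    A minimal sketch of the per-key rolling reduce that both overloads describe (values are illustrative; it assumes the flink-scala-api implicit serializers):

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Per-key running sum: each incoming element updates its key's aggregate,
    // and the updated aggregate is emitted downstream.
    val runningSums: DataStream[(String, Int)] =
      env.fromElements(("a", 1), ("a", 2), ("b", 5))
        .keyBy(_._1)
        .reduce((x, y) => (x._1, x._2 + y._2))
    ```

    For key "a" this emits ("a", 1) and then ("a", 3); key "b" emits only ("b", 5).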

  81. def rescale: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are distributed evenly to a subset of instances of the downstream operation.

    The subset of downstream operations to which the upstream operation sends elements depends on the degree of parallelism of both the upstream and downstream operation. For example, if the upstream operation has parallelism 2 and the downstream operation has parallelism 4, then one upstream operation would distribute elements to two downstream operations while the other upstream operation would distribute to the other two downstream operations. If, on the other hand, the downstream operation has parallelism 2 while the upstream operation has parallelism 4 then two upstream operations will distribute to one downstream operation while the other two upstream operations will distribute to the other downstream operations.

    In cases where the different parallelisms are not multiples of each other, one or several downstream operations will have a differing number of inputs from upstream operations.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  82. def setBufferTimeout(timeoutMillis: Long): DataStream[T]

    Sets the maximum time frequency (ms) for the flushing of the output buffer.

    Sets the maximum time frequency (ms) for the flushing of the output buffer. By default the output buffers flush only when they are full.

    timeoutMillis

    The maximum time between two output flushes.

    returns

    The operator with buffer timeout set.

    Definition Classes
    DataStream
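
    For example (the timeout values are illustrative), the timeout can be set job-wide on the environment or overridden per operator:

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setBufferTimeout(100) // job-wide: flush at least every 100 ms

    val lowLatency = env
      .fromElements(1, 2, 3)
      .map(_ + 1)
      .setBufferTimeout(10) // override for this operator: 10 ms
    ```

    Lower timeouts reduce latency at the cost of throughput.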
  83. def setDescription(description: String): DataStream[T]

    Sets the description of this data stream.

    The description is used in the JSON plan and the web UI, but not in logging and metrics, where only the name is available. The description is expected to provide detailed information about this operation, while the name should be a short summary, so that logging messages and metric tags stay user-friendly without losing useful detail for debugging.

    returns

    The operator with new description

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  84. def setMaxParallelism(maxParallelism: Int): DataStream[T]
    Definition Classes
    DataStream
  85. def setParallelism(parallelism: Int): DataStream[T]

    Sets the parallelism of this operation.

    Sets the parallelism of this operation. This must be at least 1.

    Definition Classes
    DataStream
  86. def setUidHash(hash: String): DataStream[T]

    Sets a user-provided hash for this operator.

    Sets a user-provided hash for this operator. This will be used AS IS to create the JobVertexID.

    The user-provided hash is an alternative to the generated hashes, considered when identifying an operator through the default hash mechanics fails (e.g. because of changes between Flink versions).

    Important: this should be used as a workaround or for troubleshooting. The provided hash needs to be unique per transformation and job; otherwise, job submission will fail. Furthermore, you cannot assign a user-specified hash to intermediate nodes in an operator chain, and trying to do so will make your job fail.

    hash

    the user provided hash for this operator.

    returns

    The operator with the user provided hash.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  87. def shuffle: DataStream[T]

    Sets the partitioning of the DataStream so that the output tuples are shuffled to the next component.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  88. def sinkTo(sink: Sink[T]): DataStreamSink[T]

    Adds the given sink to this DataStream.

    Adds the given sink to this DataStream. Only streams with sinks added will be executed once the StreamExecutionEnvironment.execute(...) method is called.

    Definition Classes
    DataStream
  89. def sinkTo(sink: Sink[T, _, _, _]): DataStreamSink[T]

    Adds the given sink to this DataStream.

    Adds the given sink to this DataStream. Only streams with sinks added will be executed once the StreamExecutionEnvironment.execute(...) method is called.

    Definition Classes
    DataStream
  90. def slotSharingGroup(slotSharingGroup: SlotSharingGroup): DataStream[T]

    Sets the slot sharing group of this operation.

    Sets the slot sharing group of this operation. Parallel instances of operations that are in the same slot sharing group will be co-located in the same TaskManager slot, if possible.

    Operations inherit the slot sharing group of input operations if all input operations are in the same slot sharing group and no slot sharing group was explicitly specified.

    Initially an operation is in the default slot sharing group. An operation can be put into the default group explicitly by setting the slot sharing group to "default".

    slotSharingGroup

    Which contains name and its resource spec.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  91. def slotSharingGroup(slotSharingGroup: String): DataStream[T]

    Sets the slot sharing group of this operation.

    Sets the slot sharing group of this operation. Parallel instances of operations that are in the same slot sharing group will be co-located in the same TaskManager slot, if possible.

    Operations inherit the slot sharing group of input operations if all input operations are in the same slot sharing group and no slot sharing group was explicitly specified.

    Initially an operation is in the default slot sharing group. An operation can be put into the default group explicitly by setting the slot sharing group to "default".

    slotSharingGroup

    The slot sharing group name.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  92. def startNewChain(): DataStream[T]

    Starts a new task chain beginning at this operator.

    Starts a new task chain beginning at this operator. This operator will not be chained (thread co-located for increased performance) to any previous tasks even if possible.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  93. def sum(field: String): DataStream[T]

    Applies an aggregation that sums the data stream at the given field by the given key.

    Applies an aggregation that sums the data stream at the given field by the given key. An independent aggregate is kept per key.

    field

    In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).

  94. def sum(position: Int): DataStream[T]

    Applies an aggregation that sums the data stream at the given position by the given key.

    Applies an aggregation that sums the data stream at the given position by the given key. An independent aggregate is kept per key.

    position

    The field position in the data points on which to perform the aggregation. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).
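
    A short sketch of both forms (the Reading case class and values are hypothetical; it assumes the flink-scala-api implicit serializers):

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    case class Reading(sensor: String, value: Double)

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Running per-key sum over the "value" field; for tuple streams the
    // positional form, e.g. .sum(1), selects the field by index instead.
    val totals: DataStream[Reading] = env
      .fromElements(Reading("s1", 1.0), Reading("s1", 2.5), Reading("s2", 3.0))
      .keyBy(_.sensor)
      .sum("value")
    ```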

  95. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  96. def toString(): String
    Definition Classes
    AnyRef → Any
  97. def transform[R](operatorName: String, operator: OneInputStreamOperator[T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Transforms the DataStream by using a custom OneInputStreamOperator.

    R

    the type of elements emitted by the operator

    operatorName

    name of the operator, for logging purposes

    operator

    the object containing the transformation logic

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  98. def uid(uid: String): DataStream[T]

    Sets an ID for this operator.

    The specified ID is used to assign the same operator ID across job submissions (for example when starting a job from a savepoint).

    Important: this ID needs to be unique per transformation and job. Otherwise, job submission will fail.

    uid

    The unique user-specified ID of this transformation.

    returns

    The operator with the specified ID.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
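
    For example (the ID string is illustrative), a stable ID lets a savepoint taken from one job version be restored into another, as long as the IDs still match:

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // The uid travels with the operator across job submissions, so its
    // state can be matched up when restoring from a savepoint.
    val doubled = env
      .fromElements(1, 2, 3)
      .map(_ * 2)
      .uid("double-values")
    ```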
  99. def union(dataStreams: DataStream[T]*): DataStream[T]

    Creates a new DataStream by merging DataStream outputs of the same type with each other.

    Creates a new DataStream by merging DataStream outputs of the same type with each other. The DataStreams merged using this operator will be transformed simultaneously.

    Definition Classes
    DataStream
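
    A minimal sketch (values are illustrative; it assumes the flink-scala-api implicit serializers):

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val evens = env.fromElements(2, 4)
    val odds  = env.fromElements(1, 3)

    // One logical stream containing all elements of both inputs; the
    // relative order of elements across the inputs is not guaranteed.
    val merged: DataStream[Int] = evens.union(odds)
    ```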
  100. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  101. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  102. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  103. def window[W <: Window](assigner: WindowAssigner[_ >: T, W]): WindowedStream[T, K, W]

    Windows this data stream to a WindowedStream, which evaluates windows over a key grouped stream.

    Windows this data stream to a WindowedStream, which evaluates windows over a key grouped stream. Elements are put into windows by a WindowAssigner. The grouping of elements is done both by key and by window.

    A org.apache.flink.streaming.api.windowing.triggers.Trigger can be defined to specify when windows are evaluated. However, each WindowAssigner has a default Trigger that is used if no Trigger is specified.

    assigner

    The WindowAssigner that assigns elements to windows.

    returns

    The trigger windows data stream.

    Annotations
    @PublicEvolving()
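
    A sketch of keyed windowing with an explicit assigner (the window size and values are illustrative; it assumes the flink-scala-api implicit serializers):

    ```scala
    import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
    import org.apache.flink.streaming.api.windowing.time.Time
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Per-key sums over 5-second tumbling processing-time windows; the
    // assigner's default trigger fires when each window ends.
    val windowedSums = env
      .fromElements(("a", 1), ("a", 2), ("b", 3))
      .keyBy(_._1)
      .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
      .sum(1)
    ```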
  104. def windowAll[W <: Window](assigner: WindowAssigner[_ >: T, W]): AllWindowedStream[T, W]

    Windows this data stream to an AllWindowedStream, which evaluates windows over a non key grouped stream.

    Windows this data stream to an AllWindowedStream, which evaluates windows over a non key grouped stream. Elements are put into windows by a WindowAssigner. The grouping of elements is done by window.

    A org.apache.flink.streaming.api.windowing.triggers.Trigger can be defined to specify when windows are evaluated. However, each WindowAssigner has a default Trigger that is used if no Trigger is specified.

    Note: This operation is inherently non-parallel, since all elements have to pass through the same operator instance. (Only for special cases, such as aligned time windows, is it possible to perform this operation in parallel.)

    assigner

    The WindowAssigner that assigns elements to windows.

    returns

    The trigger windows data stream.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  105. def writeToSocket(hostname: String, port: Integer, schema: SerializationSchema[T]): DataStreamSink[T]

    Writes the DataStream to a socket as a byte array.

    Writes the DataStream to a socket as a byte array. The format of the output is specified by a SerializationSchema.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()
  106. def writeUsingOutputFormat(format: OutputFormat[T]): DataStreamSink[T]

    Writes a DataStream using the given OutputFormat.

    Definition Classes
    DataStream
    Annotations
    @PublicEvolving()

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated
    Deprecated
  2. def getExecutionConfig: ExecutionConfig

    Returns the execution config.

    Definition Classes
    DataStream
    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  3. def getExecutionEnvironment: StreamExecutionEnvironment

    Returns the StreamExecutionEnvironment associated with the current DataStream.

    returns

    associated execution environment

    Definition Classes
    DataStream
    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  4. def getName: String

    Gets the name of the current data stream.

    Gets the name of the current data stream. This name is used by the visualization and logging during runtime.

    returns

    Name of the stream.

    Definition Classes
    DataStream
    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  5. def getParallelism: Int

    Returns the parallelism of this operation.

    Definition Classes
    DataStream
    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  6. def getType(): TypeInformation[T]

    Returns the TypeInformation for the elements of this DataStream.

    Definition Classes
    DataStream
    Annotations
    @deprecated @PublicEvolving()
    Deprecated
  7. def keyBy(firstField: String, otherFields: String*): KeyedStream[T, Tuple]

    Groups the elements of a DataStream by the given field expressions to be used with grouped operators like grouped reduce or grouped aggregations.

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    use DataStream.keyBy(KeySelector) instead

  8. def keyBy(fields: Int*): KeyedStream[T, Tuple]

    Groups the elements of a DataStream by the given key positions (for tuple/array types) to be used with grouped operators like grouped reduce or grouped aggregations.

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    use DataStream.keyBy(KeySelector) instead
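
    The replacement keys by an explicit selector function, which works for any element type, not just tuples or field expressions (the Reading case class and values are hypothetical; it assumes the flink-scala-api implicit serializers):

    ```scala
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    case class Reading(sensor: String, value: Double)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val readings = env.fromElements(Reading("s1", 1.0), Reading("s2", 2.0))

    // Deprecated: readings.keyBy("sensor"), or positional keyBy(0) on tuples.
    // Preferred: a key selector with a concrete key type.
    val keyed: KeyedStream[Reading, String] = readings.keyBy(_.sensor)
    ```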

  9. def partitionCustom[K](partitioner: Partitioner[K], field: String)(implicit arg0: TypeInformation[K]): DataStream[T]

    Partitions a POJO DataStream on the specified key fields using a custom partitioner.

    Partitions a POJO DataStream on the specified key fields using a custom partitioner. This method takes the key expression to partition on, and a partitioner that accepts the key type.

    Note: This method works only on single field keys.

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    Use partitionCustom(Partitioner, Function1) instead

  10. def partitionCustom[K](partitioner: Partitioner[K], field: Int)(implicit arg0: TypeInformation[K]): DataStream[T]

    Partitions a tuple DataStream on the specified key fields using a custom partitioner.

    Partitions a tuple DataStream on the specified key fields using a custom partitioner. This method takes the key position to partition on, and a partitioner that accepts the key type.

    Note: This method works only on single field keys.

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    Use partitionCustom(Partitioner, Function1) instead

  11. def process[R](processFunction: ProcessFunction[T, R])(implicit arg0: TypeInformation[R]): DataStream[R]

    Applies the given ProcessFunction on the input stream, thereby creating a transformed output stream.

    The function will be called for every element in the input stream and can produce zero or more output elements. In contrast to the DataStream#flatMap(FlatMapFunction) function, this function can also query the time and set timers. When reacting to the firing of a set timer, the function can directly emit elements and/or register further timers.

    processFunction

    The ProcessFunction that is called for each element in the stream.

    Definition Classes
    KeyedStream → DataStream
    Annotations
    @deprecated @PublicEvolving()
    Deprecated

    will be removed in a future version

  12. def timeWindow(size: Time, slide: Time): WindowedStream[T, K, TimeWindow]

    Windows this KeyedStream into sliding time windows.

    This is a shortcut for either .window(SlidingEventTimeWindows.of(size, slide)) or .window(SlidingProcessingTimeWindows.of(size, slide)) depending on the time characteristic set using StreamExecutionEnvironment.setStreamTimeCharacteristic().

    size

    The size of the window.

    slide

    The slide interval of the window.

    Annotations
    @deprecated
    Deprecated
  13. def timeWindow(size: Time): WindowedStream[T, K, TimeWindow]

    Windows this KeyedStream into tumbling time windows.

    This is a shortcut for either .window(TumblingEventTimeWindows.of(size)) or .window(TumblingProcessingTimeWindows.of(size)) depending on the time characteristic set using StreamExecutionEnvironment.setStreamTimeCharacteristic()

    size

    The size of the window.

    Annotations
    @deprecated
    Deprecated
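
    Both deprecated timeWindow variants can be replaced by window with an explicit assigner (the window sizes and values are illustrative; it assumes the flink-scala-api implicit serializers):

    ```scala
    import org.apache.flink.streaming.api.windowing.assigners.{SlidingEventTimeWindows, TumblingEventTimeWindows}
    import org.apache.flink.streaming.api.windowing.time.Time
    import org.apache.flinkx.api._
    import org.apache.flinkx.api.serializers._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val keyed = env.fromElements(("a", 1), ("b", 2)).keyBy(_._1)

    // Instead of the deprecated keyed.timeWindow(Time.minutes(1)):
    val tumbling = keyed.window(TumblingEventTimeWindows.of(Time.minutes(1)))

    // Instead of the deprecated keyed.timeWindow(Time.minutes(1), Time.seconds(30)):
    val sliding = keyed.window(
      SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(30)))
    ```

    Using the event-time or processing-time assigner explicitly replaces the removed stream-time-characteristic switch.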

Inherited from DataStream[T]

Inherited from AnyRef

Inherited from Any

Ungrouped