groupByKeyAndWindow

fun <K, V> JavaDStream<Tuple2<K, V>>.groupByKeyAndWindow(
    windowDuration: Duration,
    slideDuration: Duration = dstream().slideDuration(),
    numPartitions: Int = dstream().ssc().sc().defaultParallelism()
): JavaDStream<Tuple2<K, Iterable<V>>>

Return a new DStream by applying groupByKey over a sliding window on this DStream; it works like DStream.groupByKey(), but the grouping is computed over each window rather than over a single batch. Hash partitioning is used to generate the RDDs with numPartitions partitions.

Parameters

windowDuration

width of the window; must be a multiple of this DStream's batching interval

slideDuration

sliding interval of the window (i.e., the interval after which the new DStream will generate RDDs); must be a multiple of this DStream's batching interval

numPartitions

number of partitions of each RDD in the new DStream; if not specified then Spark's default number of partitions will be used
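
A minimal usage sketch of this overload, assuming a hypothetical pair DStream named pairs built from a JavaStreamingContext with a 1-second batch interval; the import path of the extension function depends on your project setup.

import org.apache.spark.streaming.Durations
import org.apache.spark.streaming.api.java.JavaDStream
import scala.Tuple2

// Group all values seen per key over the last 60 seconds, recomputing every
// 10 seconds. Both durations are multiples of the assumed 1-second batch interval.
fun groupLastMinute(pairs: JavaDStream<Tuple2<String, Int>>): JavaDStream<Tuple2<String, Iterable<Int>>> =
    pairs.groupByKeyAndWindow(
        windowDuration = Durations.seconds(60),
        slideDuration = Durations.seconds(10)
        // numPartitions is omitted, so Spark's default parallelism is used,
        // as documented above.
    )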


fun <K, V> JavaDStream<Tuple2<K, V>>.groupByKeyAndWindow(
    windowDuration: Duration,
    slideDuration: Duration = dstream().slideDuration(),
    partitioner: Partitioner
): JavaDStream<Tuple2<K, Iterable<V>>>

Create a new DStream by applying groupByKey over a sliding window on this DStream; it works like DStream.groupByKey(), but the grouping is computed over each window rather than over a single batch, and the supplied partitioner controls how the resulting RDDs are partitioned.

Parameters

windowDuration

width of the window; must be a multiple of this DStream's batching interval

slideDuration

sliding interval of the window (i.e., the interval after which the new DStream will generate RDDs); must be a multiple of this DStream's batching interval

partitioner

partitioner for controlling the partitioning of each RDD in the new DStream
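
A sketch of the partitioner overload under the same assumptions as the example above; a HashPartitioner is used here, but any org.apache.spark.Partitioner can be supplied.

import org.apache.spark.HashPartitioner
import org.apache.spark.streaming.Durations
import org.apache.spark.streaming.api.java.JavaDStream
import scala.Tuple2

// Same 60-second window sliding every 10 seconds, but the partitioning of each
// windowed RDD is controlled explicitly; 8 partitions is an arbitrary choice.
fun groupWithExplicitPartitioner(pairs: JavaDStream<Tuple2<String, Int>>): JavaDStream<Tuple2<String, Iterable<Int>>> =
    pairs.groupByKeyAndWindow(
        windowDuration = Durations.seconds(60),
        slideDuration = Durations.seconds(10),
        partitioner = HashPartitioner(8)
    )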