reduceByKeyAndWindow
Return a new DStream by applying reduceByKey
over a sliding window. This is similar to DStream.reduceByKey()
but applies it over a sliding window. Hash partitioning is used to generate the RDDs with numPartitions
partitions.
Parameters
associative and commutative reduce function
width of the window; must be a multiple of this DStream's batching interval
sliding interval of the window (i.e., the interval after which the new DStream will generate RDDs); must be a multiple of this DStream's batching interval
number of partitions of each RDD in the new DStream.
Return a new DStream by applying reduceByKey
over a sliding window. Similar to DStream.reduceByKey()
, but applies it over a sliding window.
Parameters
associative and commutative reduce function
width of the window; must be a multiple of this DStream's batching interval
sliding interval of the window (i.e., the interval after which the new DStream will generate RDDs); must be a multiple of this DStream's batching interval
partitioner for controlling the partitioning of each RDD in the new DStream.
Return a new DStream by applying incremental reduceByKey
over a sliding window. The reduced value of over a new window is calculated using the old window's reduced value :
reduce the new values that entered the window (e.g., adding new counts)
"inverse reduce" the old values that left the window (e.g., subtracting old counts)
This is more efficient than reduceByKeyAndWindow without "inverse reduce" function. However, it is applicable to only "invertible reduce functions". Hash partitioning is used to generate the RDDs with Spark's default number of partitions.
Parameters
associative and commutative reduce function
inverse reduce function; such that for all y, invertible x: invReduceFunc(reduceFunc(x, y), x) = y
width of the window; must be a multiple of this DStream's batching interval
sliding interval of the window (i.e., the interval after which the new DStream will generate RDDs); must be a multiple of this DStream's batching interval
Optional function to filter expired key-value pairs; only pairs that satisfy the function are retained
Return a new DStream by applying incremental reduceByKey
over a sliding window. The reduced value of over a new window is calculated using the old window's reduced value :
reduce the new values that entered the window (e.g., adding new counts)
"inverse reduce" the old values that left the window (e.g., subtracting old counts) This is more efficient than reduceByKeyAndWindow without "inverse reduce" function. However, it is applicable to only "invertible reduce functions".
Parameters
associative and commutative reduce function
inverse reduce function
width of the window; must be a multiple of this DStream's batching interval
sliding interval of the window (i.e., the interval after which the new DStream will generate RDDs); must be a multiple of this DStream's batching interval
partitioner for controlling the partitioning of each RDD in the new DStream.
Optional function to filter expired key-value pairs; only pairs that satisfy the function are retained