reduceByKey

fun <K, V> JavaRDD<Tuple2<K, V>>.reduceByKey(partitioner: Partitioner, func: (V, V) -> V): JavaRDD<Tuple2<K, V>>

Merge the values for each key using an associative and commutative reduce function. This will also perform the merging locally on each mapper before sending results to a reducer, similarly to a "combiner" in MapReduce.
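A minimal usage sketch for this overload, assuming the reduceByKey extension documented above is in scope; the word-count pairs and the four-partition HashPartitioner are illustrative, not prescribed by the API:

import org.apache.spark.HashPartitioner
import org.apache.spark.api.java.JavaRDD
import scala.Tuple2

// Illustrative helper: sum the counts per word, letting an explicit
// HashPartitioner with four partitions control the layout of the result.
// Assumes the reduceByKey extension documented above is in scope.
fun sumPerWord(pairs: JavaRDD<Tuple2<String, Int>>): JavaRDD<Tuple2<String, Int>> =
    pairs.reduceByKey(HashPartitioner(4)) { a, b -> a + b }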


fun <K, V> JavaRDD<Tuple2<K, V>>.reduceByKey(numPartitions: Int, func: (V, V) -> V): JavaRDD<Tuple2<K, V>>

Merge the values for each key using an associative and commutative reduce function. This will also perform the merging locally on each mapper before sending results to a reducer, similarly to a "combiner" in MapReduce. Output will be hash-partitioned with numPartitions partitions.
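A sketch of the numPartitions overload under the same assumptions; the helper name is illustrative:

import org.apache.spark.api.java.JavaRDD
import scala.Tuple2

// Illustrative helper: same merge, but only the number of output partitions
// is given; the result is hash-partitioned into that many partitions.
// Assumes the reduceByKey extension documented above is in scope.
fun sumPerWordInto(pairs: JavaRDD<Tuple2<String, Int>>, numPartitions: Int): JavaRDD<Tuple2<String, Int>> =
    pairs.reduceByKey(numPartitions) { a, b -> a + b }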


fun <K, V> JavaRDD<Tuple2<K, V>>.reduceByKey(func: (V, V) -> V): JavaRDD<Tuple2<K, V>>

Merge the values for each key using an associative and commutative reduce function. This will also perform the merging locally on each mapper before sending results to a reducer, similarly to a "combiner" in MapReduce. Output will be hash-partitioned with the existing partitioner/parallelism level.
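An end-to-end sketch using the default-partitioning overload; the local master, app name, and sample data are illustrative:

import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaSparkContext
import scala.Tuple2

// Illustrative word-count: merge the 1s per word with the existing
// parallelism level. Assumes the reduceByKey extension documented above is in scope.
fun main() {
    val sc = JavaSparkContext(SparkConf().setAppName("reduceByKeyExample").setMaster("local[*]"))
    val pairs = sc.parallelize(listOf(Tuple2("a", 1), Tuple2("b", 1), Tuple2("a", 1)))
    val counts = pairs.reduceByKey { a, b -> a + b }
    counts.collect().forEach { println("${it._1()} -> ${it._2()}") }
    sc.stop()
}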


fun <K, V> JavaDStream<Tuple2<K, V>>.reduceByKey(numPartitions: Int = dstream().ssc().sc().defaultParallelism(), reduceFunc: (V, V) -> V): JavaDStream<Tuple2<K, V>>

Return a new DStream by applying reduceByKey to each RDD. The values for each key are merged using the supplied reduce function. Hash partitioning is used to generate the RDDs with numPartitions partitions.
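A streaming sketch under the same assumptions; the socket source, host/port, batch interval, and partition count are placeholders:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.Durations
import org.apache.spark.streaming.api.java.JavaStreamingContext
import scala.Tuple2

// Illustrative streaming word-count over a local socket source.
// Assumes the reduceByKey extension documented above is in scope.
fun main() {
    val jssc = JavaStreamingContext(
        SparkConf().setAppName("dstreamReduceByKey").setMaster("local[2]"),
        Durations.seconds(1)
    )
    val pairs = jssc.socketTextStream("localhost", 9999).map { word -> Tuple2(word, 1) }
    // Merge counts per key in every batch, producing two partitions per RDD.
    val counts = pairs.reduceByKey(2) { a, b -> a + b }
    counts.print()
    jssc.start()
    jssc.awaitTermination()
}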


fun <K, V> JavaDStream<Tuple2<K, V>>.reduceByKey(partitioner: Partitioner, reduceFunc: (V, V) -> V): JavaDStream<Tuple2<K, V>>

Return a new DStream by applying reduceByKey to each RDD. The values for each key are merged using the supplied reduce function. org.apache.spark.Partitioner is used to control the partitioning of each RDD.
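A sketch of the partitioner-based streaming overload; the helper name and the two-partition HashPartitioner are illustrative:

import org.apache.spark.HashPartitioner
import org.apache.spark.streaming.api.java.JavaDStream
import scala.Tuple2

// Illustrative helper: per-batch sum of counts with an explicit partitioner
// controlling the partitioning of each generated RDD.
// Assumes the reduceByKey extension documented above is in scope.
fun sumPerBatch(pairs: JavaDStream<Tuple2<String, Int>>): JavaDStream<Tuple2<String, Int>> =
    pairs.reduceByKey(HashPartitioner(2)) { a, b -> a + b }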