join

fun <K, V, W> JavaRDD<Tuple2<K, V>>.join(other: JavaRDD<Tuple2<K, W>>, partitioner: Partitioner): JavaRDD<Tuple2<K, Tuple2<V, W>>>

Return an RDD containing every pair of elements from this and other whose keys match (an inner join). Each pair is returned as a (k, (v1, v2)) tuple, where (k, v1) is in this and (k, v2) is in other. Keys that appear in only one of the two RDDs produce no output. The given Partitioner is used to partition the output RDD.
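The pairing rule can be illustrated without a cluster. The following is a minimal sketch of the join semantics on plain Kotlin lists; `innerJoin` is a hypothetical helper written for this illustration, it uses `Pair` in place of `Tuple2`, and it is not how the library implements the operation.

```kotlin
// Hypothetical local model of the join semantics; plain Kotlin collections, no Spark.
fun <K, V, W> innerJoin(
    left: List<Pair<K, V>>,
    right: List<Pair<K, W>>
): List<Pair<K, Pair<V, W>>> {
    // Group the right side by key so each left element can look up its matches.
    val rightByKey: Map<K, List<W>> = right.groupBy({ it.first }, { it.second })
    // Emit one (k, (v1, v2)) entry for every matching combination.
    return left.flatMap { (k, v1) ->
        rightByKey[k].orEmpty().map { v2 -> k to (v1 to v2) }
    }
}

fun main() {
    val left = listOf(1 to "a", 1 to "b", 2 to "c")
    val right = listOf(1 to "x", 3 to "y")
    // Keys 2 and 3 appear on one side only, so they are dropped.
    println(innerJoin(left, right)) // prints [(1, (a, x)), (1, (b, x))]
}
```

A key that occurs m times in this and n times in other contributes m × n output elements.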


fun <K, V, W> JavaRDD<Tuple2<K, V>>.join(other: JavaRDD<Tuple2<K, W>>): JavaRDD<Tuple2<K, Tuple2<V, W>>>
fun <K, V, W> JavaRDD<Tuple2<K, V>>.join(other: JavaRDD<Tuple2<K, W>>, numPartitions: Int): JavaRDD<Tuple2<K, Tuple2<V, W>>>

Return an RDD containing every pair of elements from this and other whose keys match (an inner join). Each pair is returned as a (k, (v1, v2)) tuple, where (k, v1) is in this and (k, v2) is in other. Performs a hash join across the cluster. When numPartitions is given, the output RDD is hash-partitioned into that many partitions.
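A hash join relies on both inputs being partitioned by the hash of the key, so that equal keys end up in the same partition. The sketch below models that assignment in plain Kotlin; `partitionFor` is a hypothetical helper that follows the rule used by org.apache.spark.HashPartitioner (hash code modulo the partition count, kept non-negative, with null keys in partition 0). It is an illustration only, not the library's code.

```kotlin
// Hypothetical helper modelling hash partitioning of keys.
fun partitionFor(key: Any?, numPartitions: Int): Int {
    require(numPartitions > 0) { "numPartitions must be positive" }
    // floorMod keeps the result in 0 until numPartitions, even for negative hash codes.
    return if (key == null) 0 else Math.floorMod(key.hashCode(), numPartitions)
}

fun main() {
    val left = listOf(1 to "a", 5 to "b", 6 to "c")
    val right = listOf(5 to "x", 6 to "y")
    // Equal keys map to the same partition index on both sides,
    // so matching elements can be paired within a single partition.
    println(left.groupBy { partitionFor(it.first, 4) })
    println(right.groupBy { partitionFor(it.first, 4) })
}
```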


fun <K, V, W> JavaDStream<Tuple2<K, V>>.join(other: JavaDStream<Tuple2<K, W>>, numPartitions: Int = dstream().ssc().sc().defaultParallelism()): JavaDStream<Tuple2<K, Tuple2<V, W>>>

Return a new DStream by applying 'join' between each RDD of this DStream and the corresponding RDD of the other DStream. The generated RDDs are hash-partitioned into numPartitions partitions.
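Because the join is applied between the RDDs of the two streams, elements are only paired within the same batch. The sketch below models each stream as a list of batches in plain Kotlin; `joinBatches` is a hypothetical helper for illustration, it uses `Pair` in place of `Tuple2`, and it assumes both streams produce batches at the same interval.

```kotlin
// Hypothetical local model: a stream is a list of batches, each batch a list of key/value pairs.
fun <K, V, W> joinBatches(
    left: List<List<Pair<K, V>>>,
    right: List<List<Pair<K, W>>>
): List<List<Pair<K, Pair<V, W>>>> =
    // Pair up the batches by position, then join the two batches of each pair.
    left.zip(right) { l, r ->
        l.flatMap { (k, v1) ->
            r.filter { it.first == k }.map { (_, v2) -> k to (v1 to v2) }
        }
    }

fun main() {
    val left = listOf(listOf(1 to "a"), listOf(2 to "b"))
    val right = listOf(listOf(1 to "x", 2 to "y"), listOf(1 to "z"))
    // Key 2 appears in batch 1 on the left but only in batch 0 on the right, so it never matches.
    println(joinBatches(left, right)) // prints [[(1, (a, x))], []]
}
```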


fun <K, V, W> JavaDStream<Tuple2<K, V>>.join(other: JavaDStream<Tuple2<K, W>>, partitioner: Partitioner): JavaDStream<Tuple2<K, Tuple2<V, W>>>

Return a new DStream by applying 'join' between each RDD of this DStream and the corresponding RDD of the other DStream. The supplied org.apache.spark.Partitioner controls the partitioning of each generated RDD.