countByKeyApprox
fun <K, V> JavaRDD<Tuple2<K, V>>.countByKeyApprox(timeout: Long, confidence: Double = 0.95): PartialResult<Map<K, BoundedDouble>>
Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
The confidence is the probability that the error bounds of the result will contain the true value. That is, if countByKeyApprox were called repeatedly with confidence 0.9, we would expect 90% of the results to contain the true count. The confidence must be in the range [0, 1] or an exception will be thrown.
Return
a potentially incomplete result, with error bounds
Parameters
timeout
maximum time to wait for the job, in milliseconds
confidence
the desired statistical confidence in the result, in the range [0, 1]; defaults to 0.95
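Samples
A minimal usage sketch, not taken from the library's own documentation. It assumes the extension is importable from `org.jetbrains.kotlinx.spark.api`, that Spark runs in local mode, and that the accessors `initialValue()`, `isInitialValueFinal()`, `mean()`, `low()` and `high()` are called on Spark's `PartialResult` and `BoundedDouble` classes. The app name, data and timeout are illustrative only.

```kotlin
import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaSparkContext
import org.jetbrains.kotlinx.spark.api.countByKeyApprox // assumed package of the extension
import scala.Tuple2

fun main() {
    // Local-mode context; master and app name are placeholders.
    val conf = SparkConf().setMaster("local[*]").setAppName("countByKeyApproxSketch")
    val sc = JavaSparkContext(conf)
    try {
        // A small RDD of key/value pairs; only the keys are counted.
        val pairs = sc.parallelize(
            listOf(Tuple2("a", 1), Tuple2("b", 2), Tuple2("a", 3))
        )

        // Wait at most one second and ask for 90% confidence bounds.
        val partial = pairs.countByKeyApprox(timeout = 1_000L, confidence = 0.9)

        // The value available once the timeout elapsed (or the job finished).
        val counts = partial.initialValue()
        for ((key, bound) in counts) {
            println("$key: mean=${bound.mean()} low=${bound.low()} high=${bound.high()}")
        }

        // True when the job completed within the timeout, so the counts are exact.
        println("final: ${partial.isInitialValueFinal()}")
    } finally {
        sc.stop()
    }
}
```

If the job finishes before the timeout, the lower and upper bounds for each key should coincide with the exact count; otherwise they describe an interval around the estimate.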