percentile

Null values in the input are ignored. If the input is empty (after filtering out null or NaN values), the operations throw an exception; the -orNull overloads return null instead.

All primitive numeric types are supported: Byte, Short, Int, Long, Float, and Double. Mixing different number types in one input is not supported. For these types, the return type is always Double?, and the result is interpolated using Quantile Estimation Method R8.

The operation is also available for self-comparable columns (columns of type T : Comparable<T>, such as DateTime or String). In this case, the return type remains T?. Because these values cannot be interpolated, the result is one of the input values, and its index is chosen by rounding according to Quantile Estimation Method R3.

All operations on Double/Float have the skipNaN option, which is false by default. With the default, a NaN in the input is propagated to the result. When skipNaN is set to true, NaN values are ignored.
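The null, NaN, and empty-input rules described above can be sketched in plain Kotlin. This is only an illustration of the documented behavior, not the library's implementation: `percentileOrNullSketch` and its `estimate` parameter are hypothetical names introduced here, and the estimator is passed in so that the example stays independent of any particular quantile method.

```kotlin
// Hypothetical helper, not part of the DataFrame API. It mirrors the
// documented rules of the -orNull overloads for a Double input.
fun percentileOrNullSketch(
    values: List<Double?>,
    skipNaN: Boolean = false,
    estimate: (sortedValues: List<Double>) -> Double,
): Double? {
    // Null values are always ignored.
    val nonNull = values.filterNotNull()
    // With skipNaN = false, a single NaN is propagated to the result.
    if (!skipNaN && nonNull.any { it.isNaN() }) return Double.NaN
    // With skipNaN = true, NaN values are dropped like nulls.
    val usable = nonNull.filterNot { it.isNaN() }
    // Empty input after filtering: the -orNull behavior is to return null.
    if (usable.isEmpty()) return null
    return estimate(usable.sorted())
}

fun main() {
    // Placeholder estimator (assumption): picks the middle sorted element.
    val middle: (List<Double>) -> Double = { it[it.size / 2] }
    val input = listOf(3.0, null, Double.NaN, 1.0, 2.0)

    println(percentileOrNullSketch(input, skipNaN = false, estimate = middle))
    println(percentileOrNullSketch(input, skipNaN = true, estimate = middle))
    println(percentileOrNullSketch(listOf(null, Double.NaN), skipNaN = true, estimate = middle))
}
```

The non-orNull variants would throw in the last case instead of returning null.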

Quantile Estimation Methods

For the percentile operation, DataFrame uses estimation method R3 when the given percentile must be selected from the values (as for self-comparable columns), and R8 when the given percentile can be interpolated from the values (of a numeric column). R8 is the method recommended by Hyndman and Fan, but other libraries, such as NumPy, default to R7, so slightly different results are to be expected.
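To make the difference between the two methods concrete, here is a minimal plain-Kotlin sketch of R8 and R3 as defined by Hyndman and Fan. It is not DataFrame's internal code: the function names `percentileR8` and `percentileR3` are hypothetical, and both functions assume the input is already sorted, non-empty, and free of null and NaN values.

```kotlin
import kotlin.math.floor

// R8 (interpolating): position h = (n + 1/3) * p + 1/3, 1-based,
// clamped to [1, n]; the result is interpolated between neighbours.
fun percentileR8(sorted: List<Double>, percentile: Double): Double {
    require(sorted.isNotEmpty()) { "input must not be empty" }
    require(percentile in 0.0..100.0) { "percentile must be in 0..100" }
    val n = sorted.size
    val p = percentile / 100.0
    val h = ((n + 1.0 / 3.0) * p + 1.0 / 3.0).coerceIn(1.0, n.toDouble())
    val lower = floor(h).toInt()
    val upper = minOf(lower + 1, n)
    val fraction = h - lower
    return sorted[lower - 1] + fraction * (sorted[upper - 1] - sorted[lower - 1])
}

// R3 (selecting): picks the order statistic nearest to n * p,
// resolving exact ties towards the even index. No interpolation,
// so it works for any Comparable type.
fun <T : Comparable<T>> percentileR3(sorted: List<T>, percentile: Double): T {
    require(sorted.isNotEmpty()) { "input must not be empty" }
    require(percentile in 0.0..100.0) { "percentile must be in 0..100" }
    val n = sorted.size
    val p = percentile / 100.0
    val h = n * p - 0.5
    val j = floor(h).toInt()
    val g = h - j
    // Exact floating-point comparison; fine for a sketch, but
    // rounding error in n * p can affect tie detection.
    val index = if (g == 0.0 && j % 2 == 0) j else j + 1
    return sorted[index.coerceIn(1, n) - 1] // convert 1-based to 0-based
}

fun main() {
    val numbers = listOf(1.0, 2.0, 3.0, 4.0, 5.0)
    val words = listOf("a", "b", "c", "d", "e")
    println(percentileR8(numbers, 25.0)) // interpolated between 1.0 and 2.0
    println(percentileR3(words, 25.0))   // always one of the input values
}
```

Note that R3 can return a value that differs from the conventional median: for five sorted values and the 50th percentile, the position n * p = 2.5 is an exact tie, which R3 resolves to the second value.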

df.percentile(25.0) // 25th percentile of the values in every comparable column
df.percentile(75.0) { age and weight } // 75th percentile of all values in `age` and `weight` combined
df.percentileFor(50.0, skipNaN = true) { age and weight } // 50th percentile of `age` and of `weight`, computed separately
df.percentileOf(75.0) { (weight ?: 0) / age } // 75th percentile of expression evaluated for every row
df.percentileBy(25.0) { age } // DataRow where the 25th percentile of `age` lies (index rounded using R3)

df.percentile(25.0)
df.age.percentile(75.0)
df.groupBy { city }.percentile(50.0)
df.pivot { city }.percentile(75.0)
df.pivot { city }.groupBy { name.lastName }.percentile(25.0)

Type Conversion

The following automatic type conversions are performed for the percentile operation. (Note that null only appears in the return type when using -orNull overloads).

Conversion	Result for Empty Input
T -> T where T : Comparable<T>	null
Int -> Double	null
Byte -> Double	null
Short -> Double	null
Long -> Double	null
Double -> Double	null
Float -> Double	null
Nothing -> Nothing	null
