DataFrame 1.0 Help

distinct

Removes duplicate rows. The rows in the resulting DataFrame are in the same order as they were in the original DataFrame.

Related operations: Filter rows

df.distinct()

If columns are specified, the resulting DataFrame contains only those columns, with one row for each distinct combination of their values.

See column selectors for how to select the columns for this operation.

df.distinct { age and name } // same as df.select { age and name }.distinct()
df.distinct("age", "name") // same as df.select("age", "name").distinct()
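The difference between the two forms can be sketched on a small frame. This is only an illustration: the sample data and the `sampleFrame` helper are hypothetical, and `dataFrameOf`, `rowsCount` and `columnNames` are assumed to be available from the Kotlin DataFrame API as in other parts of the library documentation.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; column names and values are made up for illustration.
fun sampleFrame() = dataFrameOf("name", "age", "city")(
    "Alice", 15, "London",
    "Bob", 20, "Paris",
    "Alice", 15, "London", // exact duplicate of the first row
    "Alice", 15, "Berlin", // same name and age, different city
)

fun main() {
    val df = sampleFrame()

    // Without arguments, whole rows are compared: only the exact duplicate is dropped.
    val unique = df.distinct()
    println(unique.rowsCount()) // 3

    // With columns, the result is restricted to those columns before deduplication.
    val pairs = df.distinct("name", "age")
    println(pairs.columnNames()) // [name, age]
    println(pairs.rowsCount())   // 2
}
```

Note that the "city" column is absent from the second result, because the selected columns define both what is compared and what is returned.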

distinctBy

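Keeps only the first row of each group of rows that share the same values in the selected columns. Unlike distinct with columns, the remaining columns of each kept row are preserved.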

See column selectors for how to select the columns for this operation.

df.distinctBy { age and name } // same as df.groupBy { age and name }.mapToRows { group.first() }
df.distinctBy("age", "name") // same as df.groupBy("age", "name").mapToRows { group.first() }
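A minimal sketch of how this behaves, under the same caveats: the sample data and the `sampleVisits` helper are hypothetical, and `dataFrameOf`, `rowsCount`, `columnNames` and `toList` are assumed from the Kotlin DataFrame API rather than described on this page.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; column names and values are made up for illustration.
fun sampleVisits() = dataFrameOf("name", "age", "city")(
    "Alice", 15, "London",
    "Bob", 20, "Paris",
    "Alice", 15, "Berlin", // same name and age as the first row
)

fun main() {
    val df = sampleVisits()

    // Rows are compared by "name" and "age" only; the first row of each group is kept.
    val firstPerPerson = df.distinctBy("name", "age")
    println(firstPerPerson.rowsCount())   // 2

    // All original columns survive, including the one not used for comparison.
    println(firstPerPerson.columnNames()) // [name, age, city]
    println(firstPerPerson["city"].toList()) // [London, Paris]
}
```

Here the second "Alice" row is dropped and the "city" value of the first one is retained, whereas `df.distinct("name", "age")` on the same frame would return only the "name" and "age" columns.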
22 August 2025