distinct
Removes duplicate rows. The rows in the resulting DataFrame
are in the same order as they were in the original DataFrame.
df.distinct()
If columns are specified, the resulting DataFrame
contains only the given columns, with each distinct combination of their values appearing once.
See column selectors for how to select the columns for this operation.
Properties
df.distinct { age and name }
// same as
df.select { age and name }.distinct()
Strings
df.distinct("age", "name")
// same as
df.select("age", "name").distinct()
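The difference between the two forms is easiest to see on a small frame. The following is a minimal sketch, assuming the Kotlin DataFrame library is on the classpath; the `people` frame, its column names, and its values are hypothetical and exist only for illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data, not taken from the documentation.
val people = dataFrameOf("name", "age", "city")(
    "Alice", 15, "London",
    "Bob", 20, "Paris",
    "Alice", 15, "London", // exact duplicate of the first row
    "Alice", 15, "Berlin", // same name and age, different city
)

fun main() {
    // Without arguments, whole rows are compared:
    // only the exact duplicate is removed.
    println(people.distinct().rowsCount()) // 3

    // With columns, the result keeps only those columns
    // and one row per distinct (name, age) pair.
    val pairs = people.distinct("name", "age")
    println(pairs.columnNames()) // [name, age]
    println(pairs.rowsCount())   // 2
}
```

Note that the `city` column is absent from `pairs`, because `distinct` with columns behaves like `select` followed by `distinct()`.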
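distinctBy
Keeps only the first row of every group of rows that share the same values in the given columns. Unlike distinct, all columns of the original DataFrame are retained.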
See column selectors for how to select the columns for this operation.
Properties
df.distinctBy { age and name }
// same as
df.groupBy { age and name }.mapToRows { group.first() }
Strings
df.distinctBy("age", "name")
// same as
df.groupBy("age", "name").mapToRows { group.first() }
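As a rough illustration of which rows survive `distinctBy`, here is a small sketch, again assuming the Kotlin DataFrame library is available. The `visits` frame and its contents are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data, not taken from the documentation.
val visits = dataFrameOf("name", "age", "city")(
    "Alice", 15, "London",
    "Bob", 20, "Paris",
    "Alice", 15, "Berlin", // same name and age as the first row
)

fun main() {
    // One row per distinct (name, age) pair; the first occurrence wins.
    val firstPerPerson = visits.distinctBy("name", "age")

    // All original columns are still present.
    println(firstPerPerson.columnNames())    // [name, age, city]
    println(firstPerPerson.rowsCount())      // 2

    // The Berlin row is dropped because an earlier row
    // already covers the ("Alice", 15) pair.
    println(firstPerPerson["city"].toList()) // [London, Paris]
}
```

The key columns decide which rows count as duplicates, while the remaining columns simply take their values from the first matching row.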