distinct
Removes duplicate rows. The rows in the resulting DataFrame keep the same order as in the original DataFrame.
df.distinct()
If columns are specified, the resulting DataFrame contains only those columns, with duplicate combinations of their values removed.
df.distinct { age and name }
// same as
df.select { age and name }.distinct()
val age by column<Int>()
val name by columnGroup()
df.distinct { age and name }
// same as
df.select { age and name }.distinct()
df.distinct("age", "name")
// same as
df.select("age", "name").distinct()
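The semantics above can be sketched with plain Kotlin collections. This is only an analogy built on the standard library, not the DataFrame API itself: the `Person` class, its `city` field, and the helper names are hypothetical, and `name` is simplified to a `String` rather than a column group.

```kotlin
// Hypothetical row type for illustration; not part of the DataFrame API.
data class Person(val name: String, val age: Int, val city: String)

// Analogue of df.distinct(): drop repeated rows, keeping first-occurrence order.
fun distinctRows(rows: List<Person>): List<Person> = rows.distinct()

// Analogue of df.distinct { age and name }: keep only the chosen "columns",
// then remove duplicate combinations.
fun distinctAgeAndName(rows: List<Person>): List<Pair<Int, String>> =
    rows.map { it.age to it.name }.distinct()

fun main() {
    val rows = listOf(
        Person("Alice", 15, "London"),
        Person("Bob", 20, "Paris"),
        Person("Alice", 15, "London"), // exact duplicate of the first row
        Person("Alice", 15, "Tokyo"),  // same age and name, different city
    )
    println(distinctRows(rows).size)  // 3
    println(distinctAgeAndName(rows)) // [(15, Alice), (20, Bob)]
}
```

Note that the column-restricted form discards `city` entirely, which is why the two `Alice` rows with different cities collapse into one entry.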
distinctBy
Keeps only the first row of each group of rows that share the same values in the specified columns. Unlike distinct, the kept rows retain all of their columns.
df.distinctBy { age and name }
// same as
df.groupBy { age and name }.mapToRows { group.first() }
val age by column<Int>()
val name by columnGroup()
val firstName by name.column<String>()
df.distinctBy { age and name }
// same as
df.groupBy { age and name }.mapToRows { group.first() }
df.distinctBy("age", "name")
// same as
df.groupBy("age", "name").mapToRows { group.first() }
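The equivalence between distinctBy and "group, then take the first row of each group" can be illustrated with plain Kotlin collections. As before, this is a standard-library analogy under stated assumptions rather than DataFrame code: `Person`, its `city` field, and both helper functions are hypothetical.

```kotlin
// Hypothetical row type for illustration; not part of the DataFrame API.
data class Person(val name: String, val age: Int, val city: String)

// Analogue of df.distinctBy { age and name }: keep the first whole row per key.
fun firstPerAgeAndName(rows: List<Person>): List<Person> =
    rows.distinctBy { it.age to it.name }

// Analogue of df.groupBy { age and name }.mapToRows { group.first() }.
// Kotlin's groupBy preserves the order in which keys first appear.
fun firstPerGroup(rows: List<Person>): List<Person> =
    rows.groupBy { it.age to it.name }.map { (_, group) -> group.first() }

fun main() {
    val rows = listOf(
        Person("Alice", 15, "London"),
        Person("Bob", 20, "Paris"),
        Person("Alice", 15, "Tokyo"), // same key as the first row, so it is dropped
    )
    println(firstPerAgeAndName(rows).map { it.city })         // [London, Paris]
    println(firstPerAgeAndName(rows) == firstPerGroup(rows))  // true
}
```

The surviving rows still carry `city`, which is the practical difference from the column-restricted form of distinct.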
Last modified: 18 July 2024