add
Returns a DataFrame that contains all columns of the original DataFrame, followed by the newly added columns. The original DataFrame is not modified.
Create a new column and add it to DataFrame
add(columnName: String) { rowExpression }
rowExpression: DataRow.(DataRow) -> Value
df.add("year of birth") { 2021 - age }
val age by column<Int>()
val yearOfBirth by column<Int>("year of birth")
df.add(yearOfBirth) { 2021 - age }
df.add("year of birth") { 2021 - "age"<Int>() }
See row expressions
You can use the newValue() function to access the value that was already calculated for the preceding row. This is helpful for recurrent computations:
df.add("fibonacci") {
    if (index() < 2) 1
    else prev()!!.newValue<Int>() + prev()!!.prev()!!.newValue<Int>()
}
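The recurrence above can be mimicked without the library to see what newValue() contributes. The following is a minimal plain-Kotlin sketch, not DataFrame code: fibonacciColumn is a hypothetical helper that fills a column from top to bottom, and reading values[index - 1] stands in for prev()!!.newValue<Int>().

```kotlin
// Plain-Kotlin model of a column whose values are computed row by row.
// fibonacciColumn is a hypothetical helper, not part of the DataFrame API.
fun fibonacciColumn(rowCount: Int): List<Int> {
    val values = mutableListOf<Int>()
    for (index in 0 until rowCount) {
        val value =
            if (index < 2) 1                            // the first two rows are seeded with 1
            else values[index - 1] + values[index - 2]  // values already computed for the preceding rows
        values.add(value)
    }
    return values
}

fun main() {
    println(fibonacciColumn(7)) // [1, 1, 2, 3, 5, 8, 13]
}
```

The point of the sketch is that each row reads values of the column being built, not values of an existing column, which is why the real expression needs newValue() instead of an ordinary column access.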
Create and add several columns to DataFrame
add {
    columnMapping
    columnMapping
    ...
}

columnMapping = column into columnName
    | columnName from column
    | columnName from { rowExpression }
    | columnGroupName {
        columnMapping
        columnMapping
        ...
    }
df.add {
    "year of birth" from 2021 - age
    age gt 18 into "is adult"
    "details" {
        name.lastName.length() into "last name length"
        "full name" from { name.firstName + " " + name.lastName }
    }
}
val yob = column<Int>("year of birth")
val lastNameLength = column<Int>("last name length")
val age by column<Int>()
val isAdult = column<Boolean>("is adult")
val fullName = column<String>("full name")
val name by columnGroup()
val details by columnGroup()
val firstName by name.column<String>()
val lastName by name.column<String>()
df.add {
    yob from 2021 - age
    age gt 18 into isAdult
    details from {
        lastName.length() into lastNameLength
        fullName from { firstName() + " " + lastName() }
    }
}
df.add {
    "year of birth" from 2021 - "age"<Int>()
    "age"<Int>() gt 18 into "is adult"
    "details" {
        "name"["lastName"]<String>().length() into "last name length"
        "full name" from { "name"["firstName"]<String>() + " " + "name"["lastName"]<String>() }
    }
}
Create columns using intermediate result
Consider this API:
class CityInfo(val city: String?, val population: Int, val location: String)
fun queryCityInfo(city: String?): CityInfo = CityInfo(city, city?.length ?: 0, "35.5 32.2")
Use the following approach to add multiple columns by calling the given API only once per row:
val personWithCityInfo = df.add {
    val cityInfo = city.map { queryCityInfo(it) }
    "cityInfo" {
        cityInfo.map { it.location } into CityInfo::location
        cityInfo.map { it.population } into "population"
    }
}
val city by column<String?>()
val personWithCityInfo = df.add {
    val cityInfo = city().map { queryCityInfo(it) }
    "cityInfo" {
        cityInfo.map { it.location } into CityInfo::location
        cityInfo.map { it.population } into "population"
    }
}
val personWithCityInfo = df.add {
    val cityInfo = "city"<String?>().map { queryCityInfo(it) }
    "cityInfo" {
        cityInfo.map { it.location } into CityInfo::location
        cityInfo.map { it.population } into "population"
    }
}
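To make the "only once per row" claim concrete, here is a minimal plain-Kotlin sketch of the same idea without the DataFrame library. CityInfo and queryCityInfo are taken from the text above; the queryCount counter and the cityColumns helper are hypothetical additions used only to show that one intermediate list feeds both derived columns.

```kotlin
// CityInfo and queryCityInfo follow the definitions in the text above.
// queryCount and cityColumns are hypothetical, added for illustration only.
class CityInfo(val city: String?, val population: Int, val location: String)

var queryCount = 0 // counts how many times the API is called

fun queryCityInfo(city: String?): CityInfo {
    queryCount++
    return CityInfo(city, city?.length ?: 0, "35.5 32.2")
}

// Derives two "columns" (location and population) from a single intermediate list,
// so queryCityInfo runs exactly once per row.
fun cityColumns(cities: List<String?>): Pair<List<String>, List<Int>> {
    val cityInfo = cities.map { queryCityInfo(it) } // intermediate result, one call per row
    val location = cityInfo.map { it.location }
    val population = cityInfo.map { it.population }
    return location to population
}

fun main() {
    val (location, population) = cityColumns(listOf("London", null, "Tokyo"))
    println(location)   // [35.5 32.2, 35.5 32.2, 35.5 32.2]
    println(population) // [6, 0, 5]
    println(queryCount) // 3
}
```

Calling queryCityInfo separately inside each derived column would double the number of calls; computing the intermediate list first keeps it at one call per row.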
Add existing column to DataFrame
val score by columnOf(4, 3, 5, 2, 1, 3, 5)
df.add(score)
df + score
Add all columns from another DataFrame
df.add(df1, df2)
addId
Adds a column with sequential values 0, 1, 2, ... The new column is inserted at the beginning of the column list, so it becomes the first column in the DataFrame.
addId(name: String = "id")
Parameters:
name: String = "id" - name of the new column.
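The behaviour described for addId can be modelled in plain Kotlin. This is a sketch under stated assumptions, not the library implementation: rows are represented as insertion-ordered maps from column name to value, and addIdModel is a hypothetical stand-in that mirrors the name: String = "id" default.

```kotlin
// A row is modelled as an insertion-ordered map from column name to value.
// addIdModel is a hypothetical stand-in for addId, not the library implementation.
// It assumes the rows do not already contain a column called `name`.
fun addIdModel(rows: List<Map<String, Any?>>, name: String = "id"): List<Map<String, Any?>> =
    rows.mapIndexed { index, row ->
        // LinkedHashMap keeps insertion order, so putting the id first makes it the first column.
        val result = LinkedHashMap<String, Any?>()
        result[name] = index
        result.putAll(row)
        result
    }

fun main() {
    val rows = listOf(mapOf("name" to "Alice"), mapOf("name" to "Bob"))
    println(addIdModel(rows))          // [{id=0, name=Alice}, {id=1, name=Bob}]
    println(addIdModel(rows, "rowId")) // [{rowId=0, name=Alice}, {rowId=1, name=Bob}]
    println(rows)                      // [{name=Alice}, {name=Bob}]
}
```

The last line shows that the input rows are left untouched, matching the rule that add operations return a new DataFrame instead of modifying the original.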
Last modified: 27 September 2024