DataRow
DataRow represents a single record, one piece of data within a DataFrame.
index(): Int - sequential row number in DataFrame, starts from 0
prev(): DataRow? - previous row (null for the first row)
next(): DataRow? - next row (null for the last row)
diff(T) { rowExpression }: T / diffOrNull { rowExpression }: T? - difference between the results of a row expression calculated for the current and previous rows
explode(columns): DataFrame<T> - spread lists and DataFrame objects vertically into new rows
values(): List<Any?> - list of all cell values from the current row
valuesOf<T>(): List<T> - list of values of the given type
columnsCount(): Int - number of columns
columnNames(): List<String> - list of all column names
columnTypes(): List<KType> - list of all column types
namedValues(): List<NameValuePair<Any?>> - list of name-value pairs where name is a column name and value is a cell value
namedValuesOf<T>(): List<NameValuePair<T>> - list of name-value pairs where value has the given type
transpose(): DataFrame<NameValuePair<*>> - DataFrame of two columns: name: String holds column names and value: Any? holds cell values
transposeTo<T>(): DataFrame<NameValuePair<T>> - DataFrame of two columns: name: String holds column names and value: T holds cell values
getRow(Int): DataRow - row from DataFrame by row index
getRows(Iterable<Int>): DataFrame - DataFrame with a subset of rows selected by absolute row index
relative(Iterable<Int>): DataFrame - DataFrame with a subset of rows selected by relative row index: relative(-1..1) will return the previous, current and next row. Requested indices will be coerced to the valid range and invalid indices will be skipped
getValue<T>(columnName) - cell value of type T by this row and the given columnName
getValueOrNull<T>(columnName) - cell value of type T? by this row and the given columnName, or null if there's no such column
get(column): T - cell value by this row and the given column
String.invoke<T>(): T - cell value of type T by this row and the given this column name
ColumnPath.invoke<T>(): T - cell value of type T by this row and the given this column path
ColumnReference.invoke(): T - cell value of type T by this row and the given this column
df() - DataFrame that the current row belongs to
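A few of the row functions listed above can be tried on a small frame. The following is a minimal sketch, assuming the Kotlin DataFrame library is on the classpath; the sample data and the names `people`, `name`, `age` and `height` are made up for illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data; the column names are illustrative only.
val people = dataFrameOf("name", "age")(
    "Alice", 15,
    "Bob", 45,
    "Charlie", 20,
)

fun main() {
    val row = people[1] // DataRow for "Bob"

    println(row.index())                           // sequential row number, counted from 0
    println(row.prev()?.getValue<String>("name"))  // cell from the previous row
    println(people[0].prev())                      // the first row has no previous row
    println(row.next()?.getValue<String>("name"))  // cell from the next row
    println(row.values())                          // all cell values of this row
    println(row.columnNames())                     // all column names
    println(row.getValue<Int>("age"))              // typed access by column name
    println(row.getValueOrNull<Int>("height"))     // "height" is not a column in this frame
    println(row.relative(-1..1).rowsCount())       // previous, current and next row
    println(row.df().rowsCount())                  // size of the owning DataFrame
}
```

The string-based accessors are used here only because they need no generated schema; typed column accessors would work in the same places.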
Row expressions provide a value for every row of DataFrame and are used in add, filter, forEach, update and other operations.
// Row expression computes values for a new column
df.add("fullName") { name.firstName + " " + name.lastName }
// Row expression computes updated values
df.update { weight }.at(1, 3, 4).with { prev()?.weight }
// Row expression computes cell content for values of pivoted column
df.pivot { city }.with { name.lastName.uppercase() }
Row expression signature: DataRow.(DataRow) -> T. Row values can be accessed with or without the it keyword. The implicit receiver and the explicit argument represent the same DataRow object.
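To illustrate that the receiver and the `it` argument are the same row, here is a small sketch that computes the same values both ways. It assumes the Kotlin DataFrame library is available and uses the string-based accessors; the frame `names` and its columns are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data.
val names = dataFrameOf("firstName", "lastName")(
    "Alice", "Cooper",
    "Bob", "Dylan",
)

// Implicit receiver: the row is `this`, so String.invoke<T>() reads a cell by column name.
val viaReceiver: List<String> =
    names.map { "firstName"<String>() + " " + "lastName"<String>() }

// Explicit argument: the same row is also available as `it`.
val viaIt: List<String> =
    names.map { it.getValue<String>("firstName") + " " + it.getValue<String>("lastName") }

fun main() {
    println(viaReceiver)
    println(viaIt)
    println(viaReceiver == viaIt) // both forms read the same DataRow
}
```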
Row condition is a special case of row expression that returns Boolean.
// Row condition is used to filter rows by index
df.filter { index() % 5 == 0 }
// Row condition is used to drop rows where `age` is the same as in the previous row
df.drop { diffOrNull { age } == 0 }
// Row condition is used to filter rows for value update
df.update { weight }.where { index() > 4 && city != "Paris" }.with { 50 }
Row condition signature: DataRow.(DataRow) -> Boolean
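The conditions above rely on typed column accessors. As a self-contained variant, the sketch below uses string-based access instead, assuming the Kotlin DataFrame library is on the classpath; the frame `visitors` and its data are invented for the example.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data.
val visitors = dataFrameOf("city", "age")(
    "Paris", 15,
    "London", 45,
    "Paris", 20,
    "Berlin", 30,
)

// Keep rows with an even index.
val evenRows = visitors.filter { index() % 2 == 0 }

// Combine several checks in one Boolean row expression.
val adultsOutsideParis = visitors.filter { "age"<Int>() >= 18 && "city"<String>() != "Paris" }

// The same kind of condition can also be used to drop rows.
val withoutParis = visitors.drop { "city"<String>() == "Paris" }

fun main() {
    println(evenRows.rowsCount())
    println(adultsOutsideParis.rowsCount())
    println(withoutParis.rowsCount())
}
```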
The following statistics are available for DataRow:
rowSum
rowMean
rowStd
These statistics will be applied only to values of appropriate types, and incompatible values will be ignored. For example, if a dataframe has columns of types String and Int, rowSum() will compute the sum of the Int values in the row and ignore String values.
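The String-and-Int case described above might look like the following sketch. It assumes the Kotlin DataFrame library is available; the frame `scores` and its columns are hypothetical, and the exact numeric type returned by `rowSum()` may differ between library versions, so the result is only printed.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data: one String column and two Int columns.
val scores = dataFrameOf("name", "math", "physics")(
    "Alice", 4, 5,
    "Bob", 3, 2,
)

fun main() {
    val row = scores[0]

    // Only the Int cells take part; the String cell "Alice" is ignored.
    println(row.rowSum())
    println(row.rowMean())
}
```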
To apply statistics only to values of a particular type, use the -Of versions:
rowSumOf<T>
rowMeanOf<T>
rowStdOf<T>
rowMinOf<T>
rowMaxOf<T>
rowMedianOf<T>
rowPercentileOf<T>
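As a rough illustration of the -Of versions, the sketch below restricts a statistic to one value type in a row that mixes Int and Double cells. It assumes the Kotlin DataFrame library is on the classpath; the frame `measurements` and its columns are made up, and return types may vary between library versions.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data mixing String, Int and Double columns.
val measurements = dataFrameOf("label", "count", "ratio")(
    "a", 2, 0.5,
    "b", 4, 1.5,
)

fun main() {
    val row = measurements[0]

    println(row.rowSumOf<Int>())     // considers only the Int cells of the row
    println(row.rowMeanOf<Double>()) // considers only the Double cells of the row
}
```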