Dataframe 0.14 Help

Interop with Collections

Kotlin DataFrame and Kotlin Collection represent two different approaches to data storage:

  • DataFrame stores data by fields/columns

  • Collection stores data by records/rows

Although DataFrame doesn't implement the Collection or Iterable interface, it offers many similar operations, such as filter, take, first, map, and groupBy.
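As a rough illustration of those collection-like operations, the sketch below builds a small DataFrame and calls each of them. It assumes the Kotlin DataFrame library (org.jetbrains.kotlinx:dataframe) is on the classpath; the sample data and the names people, adults, firstTwo, firstName, names and rowsPerAge are made up for this example. Columns are read by name with a cast, so no generated extension properties are needed.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data: two columns, three rows.
val people = dataFrameOf("name", "age")(
    "Alice", 15,
    "Bob", 45,
    "Charlie", 30,
)

// filter keeps the rows matching a predicate and returns a DataFrame.
val adults = people.filter { (it["age"] as Int) >= 18 }

// take returns a DataFrame with the first n rows; first returns one DataRow.
val firstTwo = people.take(2)
val firstName = people.first()["name"]

// map converts every row to a value and returns an ordinary List.
val names: List<String> = people.map { it["name"] as String }

// groupBy followed by count gives one row per distinct key.
val rowsPerAge = people.groupBy("age").count()

fun main() {
    println(adults.rowsCount())
    println(firstTwo.rowsCount())
    println(firstName)
    println(names)
    println(rowsPerAge.rowsCount())
}
```

Note that filter and take return a DataFrame, while map returns a plain List, so the result of map can be passed straight to ordinary collection code.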

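DataFrame has two-way compatibility with Map and List:

  • List<T> -> DataFrame<T>: toDataFrame

  • DataFrame<T> -> List<T>: toList

  • Map<String, List<*>> -> DataFrame<*>: toDataFrame

  • DataFrame<*> -> Map<String, List<*>>: toMap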
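A round trip in both directions might look like the following sketch. It assumes the Kotlin DataFrame library is on the classpath; the Point class and all variable names are hypothetical and exist only for this example.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical record type used only for this example.
data class Point(val x: Int, val y: Int)

// List -> DataFrame -> List: one row per object, one column per property.
val points = listOf(Point(1, 2), Point(3, 4))
val pointsDf = points.toDataFrame()
val pointsBack: List<Point> = pointsDf.toList()

// Map -> DataFrame -> Map: one column per map entry.
val columnMap = mapOf("x" to listOf(1, 3), "y" to listOf(2, 4))
val columnsDf = columnMap.toDataFrame()
val columnMapBack: Map<String, List<Any?>> = columnsDf.toMap()

fun main() {
    println(pointsBack == points)
    println(columnMapBack == columnMap)
}
```

The List form is row-oriented and the Map form is column-oriented, which mirrors the two storage approaches described at the top of this page.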

Columns, rows, and values of a DataFrame can be accessed as a List, Iterable, and Sequence, respectively:

    df.columns() // List<DataColumn>
    df.rows()    // Iterable<DataRow>
    df.values()  // Sequence<Any?>
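Because these accessors return standard Kotlin types, the usual standard-library functions apply to them directly. The sketch below is one way to use them, assuming the Kotlin DataFrame library is on the classpath; the sample frame and the names columnNames, rowSums and allValues are invented for illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical 2x2 frame: columns "a" and "b", rows (1, 2) and (3, 4).
val df = dataFrameOf("a", "b")(1, 2, 3, 4)

// columns() is a List, so List.map works on it.
val columnNames: List<String> = df.columns().map { it.name() }

// rows() is an Iterable of DataRow; cells are read by column name.
val rowSums: List<Int> = df.rows().map { row -> (row["a"] as Int) + (row["b"] as Int) }

// values() is a lazy Sequence over every cell; toList() materializes it.
val allValues: List<Any?> = df.values().toList()

fun main() {
    println(columnNames)
    println(rowSums)
    println(allValues.size)
}
```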

Interop with data classes

DataFrame can be used as an intermediate object for transformation from one data structure to another.

Assume you have a list of instances of some data class that you need to transform into some other format.

    data class Input(val a: Int, val b: Int)
    val list = listOf(Input(1, 2), Input(3, 4))

You can convert this list into a DataFrame using the toDataFrame() extension:

val df = list.toDataFrame()

Mark the original data class with the @DataSchema annotation to get extension properties and use them in data transformations.

    @DataSchema
    data class Input(val a: Int, val b: Int)
    val df2 = df.add("c") { a + b }

After your data is transformed, DataFrame instances can be exported to a List of another data class using the toList or toListOf extensions:

    data class Output(val a: Int, val b: Int, val c: Int)
    val result = df2.toListOf<Output>()
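The steps above can be combined into one standalone sketch. The extension properties a and b are only available where they are generated (for example in a notebook or with the library's build plugin), so this version is an assumption-labeled variant that reads the columns by name with a cast instead. It assumes the Kotlin DataFrame library is on the classpath.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

data class Input(val a: Int, val b: Int)
data class Output(val a: Int, val b: Int, val c: Int)

val list = listOf(Input(1, 2), Input(3, 4))

// List<Input> -> DataFrame, then add column "c" computed from each row.
// Columns are accessed by name here rather than via extension properties.
val df2 = list.toDataFrame().add("c") { (it["a"] as Int) + (it["b"] as Int) }

// DataFrame -> List<Output>: columns are matched to Output's properties by name.
val result: List<Output> = df2.toListOf<Output>()

fun main() {
    println(result)
}
```

The DataFrame exists only between the two conversions, which is the "intermediate object" role described in this section.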

Converting columns with object instances to ColumnGroup

unfold can be used as an analogue of toDataFrame() for specific columns inside an existing DataFrame.
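A minimal sketch of that idea follows, assuming the Kotlin DataFrame library is on the classpath and that unfold accepts column names as strings. The Point class, the shapes frame and the column names are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical object type stored inside a column.
data class Point(val x: Int, val y: Int)

// The "center" column holds Point instances as plain values.
val shapes = dataFrameOf("name", "center")(
    "circle", Point(1, 2),
    "square", Point(3, 4),
)

// unfold replaces the object column with a ColumnGroup
// whose nested columns are the object's properties.
val unfolded = shapes.unfold("center")

// The nested columns can now be read like any other column.
val xs: List<Any?> = unfolded.getColumnGroup("center")["x"].toList()

fun main() {
    println(unfolded.schema())
    println(xs)
}
```

Only the selected column changes; other columns, such as name here, are left as they were.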

Last modified: 27 September 2024