Interop with Collections
Kotlin DataFrame and Kotlin Collection represent two different approaches to data storage:
DataFrame stores data by fields/columns
Collection stores data by records/rows
Although DataFrame doesn't implement the Collection or Iterable interface, it offers many similar operations, such as filter, take, first, map, and groupBy.
DataFrame has two-way compatibility with Map and List:
List<T> -> DataFrame<T>: toDataFrame
DataFrame<T> -> List<T>: toList
Map<String, List<*>> -> DataFrame<*>: toDataFrame
DataFrame<*> -> Map<String, List<*>>: toMap
List<List<T>> -> DataFrame<*>: toDataFrame
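The List and Map conversions listed above might look roughly like this. It is a minimal sketch: the `Person` class, the helper function names, and the sample values are invented for illustration, and it assumes the Kotlin DataFrame library is on the classpath.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical record type, used only for this illustration.
data class Person(val name: String, val age: Int)

// List<T> -> DataFrame<T> -> List<T>
fun listRoundTrip(people: List<Person>): List<Person> =
    people.toDataFrame().toList()

// Map<String, List<*>> -> DataFrame<*> -> Map<String, List<*>>
fun mapRoundTrip(columns: Map<String, List<Any?>>): Map<String, List<Any?>> =
    columns.toDataFrame().toMap()

fun main() {
    println(listRoundTrip(listOf(Person("Alice", 15), Person("Bob", 20))))
    println(mapRoundTrip(mapOf("name" to listOf("Alice", "Bob"), "age" to listOf(15, 20))))
}
```

In the Map form, each key becomes a column name and each list becomes that column's values, so all lists are expected to have the same length.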
Columns, rows, and values of DataFrame can be accessed as List, Iterable, and Sequence respectively:
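A short sketch of the three accessors, using a small frame with invented sample data (`sampleFrame` is a hypothetical helper, not part of the library):

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Sample data invented for this illustration.
fun sampleFrame(): DataFrame<*> =
    dataFrameOf("name", "age")(
        "Alice", 15,
        "Bob", 20,
    )

fun main() {
    val df = sampleFrame()

    // columns() returns a List of columns
    val columnNames: List<String> = df.columns().map { it.name() }

    // rows() returns an Iterable of rows
    val ages: List<Any?> = df.rows().map { it["age"] }

    // values() returns a Sequence over the cell values
    val cellCount: Int = df.values().count()

    println(columnNames)
    println(ages)
    println(cellCount)
}
```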
Interop with data classes
DataFrame can be used as an intermediate object for transformation from one data structure to another.
Assume you have a list of instances of some data class that you need to transform into some other format.
You can convert this list into a DataFrame using the toDataFrame() extension:
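For instance, with a hypothetical `Input` data class (the class and its values are made up for this sketch):

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical source data class.
data class Input(val a: Int, val b: Int)

fun inputsToFrame(inputs: List<Input>): DataFrame<Input> = inputs.toDataFrame()

fun main() {
    // The resulting frame has columns "a" and "b" and one row per list element.
    val df = inputsToFrame(listOf(Input(1, 2), Input(3, 4)))
    println(df)
}
```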
Mark the original data class with the DataSchema annotation to get extension properties and perform data transformations.
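The sketch below adds a computed column to such a frame. Extension properties are generated by the library's tooling, so whether `a` and `b` are available directly depends on your setup. To stay independent of that, the example accesses columns by name through the String API; the `Input` class and the column name `c` are illustrative assumptions.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical source data class, marked as a data schema.
@DataSchema
data class Input(val a: Int, val b: Int)

fun withSum(inputs: List<Input>): DataFrame<*> {
    val df = inputs.toDataFrame()
    // With generated extension properties this could read: df.add("c") { a + b }
    return df.add("c") { "a"<Int>() + "b"<Int>() }
}

fun main() {
    println(withSum(listOf(Input(1, 2), Input(3, 4))))
}
```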
After your data is transformed, DataFrame instances can be exported eagerly into a List of another data class using the toList or toListOf extensions:
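A possible end-to-end version of this flow is sketched below. `Input`, `Output`, and `transform` are hypothetical names; the property names of the target class are assumed to match the column names of the frame.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical source and target data classes.
data class Input(val a: Int, val b: Int)
data class Output(val a: Int, val b: Int, val c: Int)

fun transform(inputs: List<Input>): List<Output> =
    inputs.toDataFrame()
        .add("c") { "a"<Int>() + "b"<Int>() } // computed column, accessed by name
        .toListOf<Output>()                   // eager export into the target class

fun main() {
    println(transform(listOf(Input(1, 2), Input(3, 4))))
}
```

toList applies when the frame's type parameter is already the target class; toListOf lets you name a different target class explicitly.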
Alternatively, one can create lazy Sequence objects. This avoids holding the entire list of objects in memory as objects are created on the fly as needed.
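A lazy variant might look like the following. This sketch assumes the `toSequenceOf` extension, which mirrors `toListOf` in recent library versions; check that it exists in the version you use. The data classes and the function name are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical source and target data classes.
data class Input(val a: Int, val b: Int)
data class Output(val a: Int, val b: Int, val c: Int)

// Assumption: toSequenceOf is available in the library version in use.
fun transformLazily(inputs: List<Input>): Sequence<Output> =
    inputs.toDataFrame()
        .add("c") { "a"<Int>() + "b"<Int>() }
        .toSequenceOf<Output>()

fun main() {
    // Output instances are created as the sequence is consumed.
    transformLazily(listOf(Input(1, 2), Input(3, 4))).forEach { println(it) }
}
```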
Converting columns with object instances to ColumnGroup
unfold can be used as an analogue of toDataFrame() for specific columns inside an existing DataFrame.
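As a rough illustration, the sketch below stores plain objects in a column and then unfolds that column into a group. The `Address` class, the column names, and the sample values are assumptions made for this example.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical value type stored as plain objects in a column.
data class Address(val city: String, val zip: String)

fun sampleFrame(): DataFrame<*> =
    dataFrameOf("id", "address")(
        1, Address("Paris", "75001"),
        2, Address("Berlin", "10115"),
    )

fun main() {
    val df = sampleFrame()
    // Replace the object column "address" with a column group built from its properties.
    val unfolded = df.unfold("address")
    println(unfolded.getColumnGroup("address").columnNames())
}
```

After unfolding, the nested properties can be addressed as regular columns inside the group, while the other columns remain unchanged.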