DataColumn
DataColumn represents a column of values. It can store objects of primitive or reference types, or other DataFrame objects.
Properties
name: String - name of the column; should be unique within the containing dataframe
path: ColumnPath - path to the column; depends on how the column was retrieved from the dataframe
type: KType - type of the elements in the column
hasNulls: Boolean - flag indicating whether the column contains null values
values: Iterable<T> - column data
size: Int - number of elements in the column
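To make the properties above concrete, here is a minimal sketch that reads them from a standalone column. It assumes the Kotlin DataFrame library (org.jetbrains.kotlinx:dataframe) is on the classpath and that these properties are exposed as accessor functions such as name() and size(), which may differ between library versions. The describe helper is hypothetical and exists only for this illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical helper (not part of the library): formats the basic column properties.
fun describe(column: DataColumn<*>): String =
    "${column.name()}: ${column.type()}, size=${column.size()}, hasNulls=${column.hasNulls()}"

fun main() {
    // Build a standalone column of nullable integers and give it a name.
    val age = columnOf(15, 20, null) named "age"

    // Name, element type, element count, and null flag in one line.
    println(describe(age))

    // The stored elements themselves.
    println(age.values())
}
```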
Column kinds
Every DataColumn instance is one of three subtypes: ValueColumn, ColumnGroup, or FrameColumn.
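As a rough illustration of telling the three kinds apart at runtime, the sketch below branches on the value returned by kind(). It assumes the library's ColumnKind enum has the entries Value, Group, and Frame; the kindLabel helper is hypothetical and not part of the library.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.columns.ColumnKind

// Hypothetical helper: maps a column to the name of its subtype.
fun kindLabel(column: DataColumn<*>): String = when (column.kind()) {
    ColumnKind.Value -> "ValueColumn"
    ColumnKind.Group -> "ColumnGroup"
    ColumnKind.Frame -> "FrameColumn"
}

fun main() {
    val df = dataFrameOf("name", "age")(
        "Alice", 15,
        "Bob", 20,
    )

    // Both columns hold plain values, so each should report as a ValueColumn.
    df.columns().forEach { println("${it.name()} -> ${kindLabel(it)}") }
}
```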
ValueColumn
Represents a sequence of values.
It can store values of primitive types (integers, strings, decimals, etc.) or reference types. Currently, it uses List as the underlying data storage.
ColumnGroup
Container for nested columns. Used to create a column hierarchy.
You can create column groups using the group operation or by splitting inward; see group and split for details.
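The group operation mentioned above might be used as in the following sketch, which nests two flat columns under a new column group. The column names are made up for the example, and the calls assume the String-based overloads of group and into from the Kotlin DataFrame API; the groupedExample function is hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical example data: two name columns and one plain value column.
fun groupedExample() =
    dataFrameOf("firstName", "lastName", "age")(
        "Alice", "Cooper", 15,
        "Bob", "Dylan", 20,
    )
        // Move the two name columns under a new column group called "name".
        .group("firstName", "lastName").into("name")

fun main() {
    val grouped = groupedExample()

    // Top-level columns after grouping.
    println(grouped.columnNames())

    // Columns nested inside the "name" group.
    println(grouped.getColumnGroup("name").columnNames())
}
```

After grouping, the nested columns are reached through the group rather than directly from the top level of the dataframe.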
FrameColumn
Special case of ValueColumn that stores other DataFrame objects as elements.
The DataFrame objects stored in a FrameColumn may have different schemas.
A FrameColumn may appear after reading from JSON or other hierarchical data structures, or after grouping operations such as groupBy or pivot.
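As one way a FrameColumn can arise, the sketch below groups rows with groupBy and converts the result back into a dataframe, where each row carries its group as a nested DataFrame. It assumes that groupBy accepts column names as strings and that the grouped result offers toDataFrame(); the sample data and the groupedByCity function are hypothetical.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical example data: three people living in two cities.
fun groupedByCity() =
    dataFrameOf("city", "name")(
        "London", "Alice",
        "London", "Bob",
        "Paris", "Charlie",
    )
        // Group rows by city, then materialize the groups as a column of dataframes.
        .groupBy("city").toDataFrame()

fun main() {
    val byCity = groupedByCity()

    // Print each column with its kind; the column holding the groups
    // should report the Frame kind.
    byCity.columns().forEach { println("${it.name()}: ${it.kind()}") }
}
```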