Dataframe 0.13 Help

DataSchema workflow in Jupyter

After executing the cell

val df = dataFrameOf("name", "age")(
    "Alice", 15,
    "Bob", null
)

the following actions take place:

  1. The columns in df are analyzed to extract the data schema.

  2. An empty interface with the DataSchema annotation is generated:

@DataSchema
interface DataFrameType

  3. Extension properties for this DataSchema are generated:

val ColumnsContainer<DataFrameType>.age: DataColumn<Int?>
    @JvmName("DataFrameType_age") get() = this["age"] as DataColumn<Int?>
val DataRow<DataFrameType>.age: Int?
    @JvmName("DataFrameType_age") get() = this["age"] as Int?
val ColumnsContainer<DataFrameType>.name: DataColumn<String>
    @JvmName("DataFrameType_name") get() = this["name"] as DataColumn<String>
val DataRow<DataFrameType>.name: String
    @JvmName("DataFrameType_name") get() = this["name"] as String

Every column produces two extension properties:

  • The property for ColumnsContainer<DataFrameType> returns the column.

  • The property for DataRow<DataFrameType> returns the cell value.

  4. The df variable is typed by the schema interface:

val temp = df
val df = temp.cast<DataFrameType>()
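
Once df has been retyped, the generated properties can be used in the cells that follow. The snippet below is a minimal sketch of that usage, not part of the generated code. It assumes a Kotlin notebook with the DataFrame integration loaded, and that it runs in a later cell than the one defining df. The variable names ages, firstName, and olderThanTen are illustrative only.

```kotlin
// Run in a later cell, after the cell that defines df has been executed.

// ColumnsContainer property: returns the whole column (DataColumn<Int?>).
val ages = df.age

// DataRow property: returns the cell value of a single row (String).
val firstName = df[0].name

// Inside a row expression the receiver is a DataRow, so `age` is an Int?.
// The null in Bob's row has to be handled explicitly.
val olderThanTen = df.filter { (age ?: 0) > 10 }
```

Because age is inferred as Int?, the compiler rejects a direct comparison such as age > 10, which is why the example supplies a default value first.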

To log all of this additionally executed code, use the cell magic:

%trackExecution -all
Last modified: 29 March 2024