DataSchema workflow in Jupyter
After the following cell is executed:
val df = dataFrameOf("name", "age")(
"Alice", 15,
"Bob", null,
)
the following actions take place:
Columns in df are analyzed to extract the data schema.
An empty interface with the DataSchema annotation is generated:
@DataSchema
interface DataFrameType
Extension properties for this DataSchema are generated:
val ColumnsContainer<DataFrameType>.age: DataColumn<Int?> @JvmName("DataFrameType_age") get() = this["age"] as DataColumn<Int?>
val DataRow<DataFrameType>.age: Int? @JvmName("DataFrameType_age") get() = this["age"] as Int?
val ColumnsContainer<DataFrameType>.name: DataColumn<String> @JvmName("DataFrameType_name") get() = this["name"] as DataColumn<String>
val DataRow<DataFrameType>.name: String @JvmName("DataFrameType_name") get() = this["name"] as String
Every column produces two extension properties:
The property for ColumnsContainer<DataFrameType> returns the column.
The property for DataRow<DataFrameType> returns the cell value.
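The two-properties-per-column pattern can be sketched in plain Kotlin without the library. In this minimal sketch, `MiniFrame`, `MiniRow`, and `PersonSchema` are hypothetical stand-ins invented for illustration; they are not the library's `ColumnsContainer`, `DataRow`, or the generated `DataFrameType`, and the real generated code may differ in detail.

```kotlin
// Hypothetical, simplified stand-ins for the library types (illustration only).
class MiniRow<T>(private val values: Map<String, Any?>) {
    operator fun get(column: String): Any? = values[column]
}

class MiniFrame<T>(private val columns: Map<String, List<Any?>>) {
    operator fun get(column: String): List<Any?> = columns.getValue(column)

    // Builds a row view by picking the value at `index` from every column.
    fun row(index: Int): MiniRow<T> = MiniRow(columns.mapValues { it.value[index] })
}

// Empty marker interface, playing the role of the generated schema interface.
interface PersonSchema

// Frame-level property: returns the whole column.
@Suppress("UNCHECKED_CAST")
val MiniFrame<PersonSchema>.age: List<Int?>
    @JvmName("PersonSchema_age") get() = this["age"] as List<Int?>

// Row-level property: returns a single cell value.
val MiniRow<PersonSchema>.age: Int?
    @JvmName("PersonSchema_age") get() = this["age"] as Int?

fun main() {
    val frame = MiniFrame<PersonSchema>(
        mapOf("name" to listOf("Alice", "Bob"), "age" to listOf(15, null))
    )
    println(frame.age)        // prints [15, null]
    println(frame.row(0).age) // prints 15
}
```

Because the properties are declared on `MiniFrame<PersonSchema>` and `MiniRow<PersonSchema>` rather than on the raw classes, they are only visible on values typed with that schema. This is the role the marker interface plays in the sketch.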
The df variable is typed by the schema interface:
val temp = df
val df = temp.cast<DataFrameType>()
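Once df is typed this way, a later cell can use the generated properties directly. The following is a small usage sketch, assuming the notebook has already loaded the library (for example with `%use dataframe`) and that the cell defining df has finished executing. The variable names `ageColumn`, `firstName`, and `withKnownAge` are illustrative only.

```kotlin
// Run in a separate cell, after the cell that defines `df`.
// The generated properties do not exist until that cell has been executed.

// Frame-level property: DataColumn<Int?> according to the generated declaration.
val ageColumn = df.age

// Row-level property: String according to the generated declaration.
val firstName = df[0].name

// Inside the lambda the receiver is a row, so `age` is the nullable cell value.
val withKnownAge = df.filter { age != null }
```

With the sample data from the first cell, the filter keeps only the rows whose age is not null.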
To log all of this additionally executed code, use the cell magic:
%trackExecution -all
Last modified: 09 December 2024