Working with Data Schemas
The Kotlin DataFrame library provides typed data access by generating extension properties for the type DataFrame<T>, where T is a marker class that represents the DataSchema of the DataFrame.
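To make the idea concrete, here is a minimal sketch of what such extension properties could look like if written by hand. It assumes the Kotlin DataFrame core library is on the classpath and that no code generation (Gradle plugin, compiler plugin, or Jupyter) is active for this schema. The Person schema and its columns are hypothetical, and the generated code in a real project may differ in detail.

```kotlin
import org.jetbrains.kotlinx.dataframe.ColumnsContainer
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.DataRow
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.api.filter

// Hypothetical marker interface: it only describes column names and types.
@DataSchema
interface Person {
    val name: String
    val age: Int
}

// Hand-written stand-ins for the properties the library would normally generate.
// Only `age` is shown; generated code would cover every property of the schema.
@Suppress("UNCHECKED_CAST")
val ColumnsContainer<Person>.age: DataColumn<Int>
    get() = this["age"] as DataColumn<Int>

val DataRow<Person>.age: Int
    get() = this["age"] as Int

// An untyped frame is given the Person marker with cast<Person>().
val people = dataFrameOf("name", "age")(
    "Alice", 15,
    "Bob", 20,
).cast<Person>()

fun main() {
    // Row-level access: `age` resolves to DataRow<Person>.age inside the lambda.
    val adults = people.filter { age > 18 }
    println(adults)

    // Column-level access: `age` resolves to ColumnsContainer<Person>.age.
    println(people.age.toList())
}
```

Because the properties are declared on DataFrame-related types parameterized by the marker, a typo in a column name becomes a compile-time error rather than a runtime lookup failure.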
The schema of a DataFrame is a mapping from column names to column types. It ignores the order of columns in the DataFrame but tracks the column hierarchy.
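The two properties of a schema described here (order does not matter, nesting does) can be illustrated with a small standalone model. This is only a sketch in plain Kotlin: ColumnKind, ValueColumn, and ColumnGroup are hypothetical names invented for the example and are not the library's own schema classes.

```kotlin
// Illustrative model only; these types are NOT part of the Kotlin DataFrame API.
sealed interface ColumnKind

// A leaf column, identified by the name of its value type.
data class ValueColumn(val type: String) : ColumnKind

// A group of named columns; Map equality does not depend on entry order.
data class ColumnGroup(val columns: Map<String, ColumnKind>) : ColumnKind

val schemaA = ColumnGroup(
    mapOf(
        "name" to ColumnGroup(
            mapOf("first" to ValueColumn("String"), "last" to ValueColumn("String"))
        ),
        "age" to ValueColumn("Int"),
    )
)

// Same columns and nesting as schemaA, listed in a different order.
val schemaB = ColumnGroup(
    mapOf(
        "age" to ValueColumn("Int"),
        "name" to ColumnGroup(
            mapOf("last" to ValueColumn("String"), "first" to ValueColumn("String"))
        ),
    )
)

// Same leaf columns, but without the "name" group: a different hierarchy.
val flatSchema = ColumnGroup(
    mapOf(
        "first" to ValueColumn("String"),
        "last" to ValueColumn("String"),
        "age" to ValueColumn("Int"),
    )
)

fun main() {
    println(schemaA == schemaB)    // true: column order is ignored
    println(schemaA == flatSchema) // false: the hierarchy differs
}
```

Modeling a group as a map from names to columns gives order-insensitive equality for free, while nested groups keep the hierarchy part of the comparison.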
In a Jupyter environment, the compile-time DataFrame schema is synchronized with the real-time data after every cell execution.
In IDEA projects, you can use the Gradle plugin to extract a schema from a dataset and generate extension properties.
Popular use cases with Data Schemas
Here's a list of the most popular use cases with Data Schemas.
Data Schemas in Gradle projects
If you are developing a server application and building it with Gradle.

DataSchema workflow in Jupyter
If you prefer Notebooks.

Schema inheritance
It's worth knowing how to reuse Data Schemas generated earlier.

Custom Data Schemas
Sometimes it is necessary to create your own schema.

Use external Data Schemas in Jupyter
Sometimes it is convenient to extract reusable code from a Jupyter Notebook into a Kotlin JVM library. Schema interfaces should also be extracted if this code uses Custom Data Schemas.

Schema Definitions from SQL Databases in Gradle Project
When you need to take data from an SQL database.

Import OpenAPI Schemas in Gradle Project
When you need to take data from an endpoint with an OpenAPI Schema.

Import Data Schemas, e.g. from OpenAPI, in Jupyter
Similar to importing OpenAPI Data Schemas in Gradle projects, you can also do this in Jupyter Notebooks.