Frequently Asked Questions
Here's a list of frequently asked questions about Kotlin DataFrame.
If you haven’t found an answer to yours, feel free to ask it in the #datascience channel in Kotlin Slack (request an invite).
Kotlin DataFrame is an official open-source Kotlin framework written in pure Kotlin for working with tabular data.
Its goal is to reconcile Kotlin’s static typing with the dynamic nature of data,
providing a flexible and convenient idiomatic DSL for working with data in Kotlin.
Kotlin DataFrame does not support Kotlin Multiplatform yet; it currently supports only the JVM target.
We’re actively exploring multiplatform support.
To stay updated on progress, subscribe to the corresponding issue.
Yes — Kotlin DataFrame can be used in Android projects.
There is no dedicated Android artifact yet, but you can include the standard JVM artifact
by setting up a custom Gradle configuration.
If you're new to Kotlin DataFrame, the Quickstart guide is the perfect place to begin —
it gives a brief yet comprehensive introduction to the basics of working with DataFrame.
You can also check out other guides and examples
to explore various use cases and deepen your understanding of Kotlin DataFrame.
Kotlin DataFrame is most effective in an interactive environment.
Kotlin Notebook is ideal for exploring Kotlin DataFrame.
Everything works out of the box — interactivity, rich rendering of DataFrames and plots.
You can instantly see the results of each operation, view the contents of your DataFrames after every transformation,
inspect individual rows and columns, and explore data step-by-step in a live and interactive way.
See the Quickstart Guide to get started quickly.
The Kotlin DataFrame Compiler Plugin for IDEA projects enhances your usual IntelliJ IDEA Kotlin projects by enabling compile-time generation of extension properties.
This allows you to work with DataFrames in a name- and type-safe manner,
integrating seamlessly with the IDE.
DataFrame is a completely immutable structure.
Kotlin DataFrame follows the functional style of Kotlin: each operation that modifies the data returns a new, updated DataFrame instance.
This means original data is never changed in-place, which improves code safety.
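A minimal sketch of this behaviour, assuming the Kotlin DataFrame library is on the classpath. The sample data and the function names samplePeople and adultsOf are made up for illustration; the filter call uses the String-based column access rather than generated extension properties, so no plugin is required.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data, invented for this example.
fun samplePeople(): AnyFrame = dataFrameOf("name", "age")(
    "Alice", 15,
    "Bob", 20,
    "Charlie", 100,
)

// filter does not modify its receiver; it returns a new DataFrame.
fun adultsOf(people: AnyFrame): AnyFrame =
    people.filter { "age"<Int>() >= 18 }

fun main() {
    val people = samplePeople()
    val adults = adultsOf(people)

    println(people.rowsCount()) // the original still has all 3 rows
    println(adults.rowsCount()) // the new frame has the 2 matching rows
}
```

Because the original frame is untouched, it can safely be reused in later steps or shared between functions.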
DataFrame integrates seamlessly with Kotlin collections.
You can:
Create a DataFrame from a Map using toDataFrame().
Convert a DataFrame back to a Map using toMap().
Create a DataColumn from a List using toColumn().
Convert a DataColumn to a List of values.
Convert a DataFrame<T> into a List<T> of data class instances corresponding to each row using toList().
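The conversions listed above can be sketched as follows. This is an illustrative example, not an official one: the Person class, the sample values, and the helper function names are assumptions introduced here.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical data class used only in this example.
data class Person(val name: String, val age: Int)

// Map of column name to column values -> DataFrame.
fun frameFromMap(): AnyFrame = mapOf(
    "name" to listOf("Alice", "Bob"),
    "age" to listOf(15, 20),
).toDataFrame()

// List -> DataColumn with the given name.
fun ageColumn(): DataColumn<Int> = listOf(15, 20).toColumn("age")

// List of data class instances -> typed DataFrame.
fun typedFrame(): DataFrame<Person> =
    listOf(Person("Alice", 15), Person("Bob", 20)).toDataFrame()

fun main() {
    val df = frameFromMap()

    // DataFrame -> Map of column name to list of values.
    val asMap: Map<String, List<Any?>> = df.toMap()
    println(asMap.keys)

    // DataColumn -> List of values.
    val ages: List<Int> = ageColumn().toList()
    println(ages)

    // DataFrame<Person> -> one Person instance per row.
    val people: List<Person> = typedFrame().toList()
    println(people)
}
```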
You can store values of any Kotlin or Java type inside a DataFrame and work with them in a type-safe manner using extension properties across various operations.
For some commonly used types, such as Kotlin basic types and Kotlin date-time types, there is built-in support for automatic conversion and parsing.
Kotlin DataFrame supports all popular data sources — CSV, JSON, Excel, Apache Arrow, SQL databases, and more!
See the Data Sources section for a complete list of supported formats
and instructions on how to integrate them into your workflow.
Some sources — such as Apache Spark, Exposed,
and Multik — are not supported directly (yet),
but you can find official integration examples here.
If the data source you need isn't supported yet,
feel free to open an issue
and describe your use case — we’d love to hear from you!
Extension properties are one of the key features of Kotlin DataFrame.
They correspond to the columns of a DataFrame, allowing you to access and select them in a type-safe and name-safe way.
They are generated automatically when working with Kotlin DataFrame in:
Kotlin Notebook, where extension properties are generated after each cell execution.
A Kotlin project in IntelliJ IDEA with the Kotlin DataFrame Compiler Plugin enabled, where the properties are generated at compile time.
The KProperty API was a useful access mechanism in earlier versions.
However, with the introduction of extension properties and the Kotlin DataFrame compiler plugin, you now have a more flexible and powerful alternative.
Annotate your Kotlin class with @DataSchema, and the plugin will automatically generate type-safe extension properties for your DataFrame.
Alternatively, call toDataFrame() on a list of Kotlin or Java objects, and the resulting DataFrame will have a schema derived from their properties or getters.
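As a rough sketch of both options, the example below annotates a class with @DataSchema and builds a DataFrame from a list of its instances. The Order class and its values are hypothetical. The code itself does not rely on generated extension properties, so it should compile without the compiler plugin; with the plugin enabled, columns could additionally be accessed as properties.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical schema class, invented for this example.
@DataSchema
data class Order(val id: Int, val customer: String, val total: Double)

// The column names and types are derived from the properties of Order.
fun ordersFrame(): DataFrame<Order> = listOf(
    Order(1, "Alice", 19.99),
    Order(2, "Bob", 5.50),
).toDataFrame()

fun main() {
    val orders = ordersFrame()

    // Print the inferred schema: one column per property of Order.
    println(orders.schema())
    println(orders.columnNames())
}
```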
Kandy is a Kotlin plotting library
designed to integrate seamlessly with Kotlin DataFrame.
It provides a convenient and idiomatic Kotlin DSL for building charts,
leveraging all Kotlin DataFrame features — including extension properties.
See the Kandy Quick Start Guide and explore the Examples Gallery.
Yes, Kotlin DataFrame is designed to work with hierarchical data.
You can read JSON or any other nested format into a DataFrame with hierarchical structure, using FrameColumn (a column of data frames) and ColumnGroup (a column with nested subcolumns).
Both dataframe schemas and extension properties fully support nested data structures, allowing type-safe access and transformations at any depth.
See Hierarchical data structures for more information.
Also, you can transform your data into grouped structures using groupBy or pivot.
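The following sketch reads a small nested JSON string and then groups the result. The JSON content and the function names are made up for illustration, and the example assumes that JSON support (the readJsonStr function) is available in your DataFrame setup.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.readJsonStr

// Hypothetical nested data: "address" is an object, "orders" is an array of objects.
val customersJson = """
    [
      {"name": "Alice", "country": "UK", "address": {"city": "London"}, "orders": [{"id": 1}, {"id": 2}]},
      {"name": "Bob", "country": "FR", "address": {"city": "Paris"}, "orders": [{"id": 3}]},
      {"name": "Carol", "country": "UK", "address": {"city": "Leeds"}, "orders": [{"id": 4}]}
    ]
""".trimIndent()

fun readCustomers(): AnyFrame = DataFrame.readJsonStr(customersJson)

// Number of rows per country, as a regular DataFrame.
fun countByCountry(df: AnyFrame): AnyFrame = df.groupBy("country").count()

fun main() {
    val df = readCustomers()

    // "address" is read as a ColumnGroup and "orders" as a FrameColumn.
    println(df.schema())

    println(countByCountry(df))

    // A GroupBy can also be viewed as a DataFrame whose groups are nested data frames.
    println(df.groupBy("country").toDataFrame().schema())
}
```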
Yes, the experimental dataframe-openapi module adds support for OpenAPI JSON schemas.
You can use it to parse and work with OpenAPI-defined structures directly in Kotlin DataFrame.
See the OpenAPI Guide for details and examples.
Yes, the experimental dataframe-geo module provides functionality for working with geospatial data,
including support for reading and writing GeoJSON and Shapefile formats, as well as tools for manipulating geometry types.
See the GeoDataFrame Guide for details and examples with beautiful Kandy geo visualizations.
Warning:
The current Gradle plugin is under consideration for deprecation and may be officially marked as deprecated in future releases.
The KSP plugin is not compatible with KSP2 and may not work properly with Kotlin 2.1 or newer.
At the moment, data schema generation is handled via dedicated methods instead of relying on the plugins. See Migration from Gradle/KSP Plugin.
All these plugins relate to working with dataframe schemas, but they serve different purposes:
Gradle Plugin and KSP Plugin are used to generate data schemas from external sources as part of the Gradle build process.
Gradle Plugin: You declare the data source in your build.gradle.kts file using the dataframes { ... } block.
KSP Plugin: You annotate your Kotlin file with the @ImportDataSchema file annotation, and the schema will be generated via Kotlin Symbol Processing.
See Data Schemas in Gradle Projects for more.
Compiler Plugin provides on-the-fly generation of extension properties based on an existing schema during compilation, and updates the DataFrame schema seamlessly after operations. However, when reading data from files or external sources (like SQL), the initial DataFrame schema cannot be inferred automatically; you need to specify it manually or generate it using the generate..() methods.
We’re always happy to receive contributions!
If you’d like to contribute, please refer to our
contributing guidelines.
To report bugs or suggest improvements, open an issue on the
DataFrame GitHub repository.
You’re also welcome to ask questions or discuss anything related to Kotlin DataFrame in the
#datascience channel on Kotlin Slack.
If you’re not yet a member, you can request an invitation.