Access APIs

Last modified: 24 July 2025

By nature, data frames are dynamic objects: column labels depend on the input source, and columns can be added or deleted while wrangling. Kotlin, in contrast, is a statically typed language where all types are defined and verified ahead of execution.

That's why creating a flexible, handy, and, at the same time, safe API to a data frame is tricky.

In the Kotlin DataFrame library, we provide two different ways to access columns.

List of Access APIs

Here's a list of all APIs in order of increasing safety.

String API
Columns are accessed by a string representing their name. Both type-checking and name-checking happen at runtime.
Extension Properties API
Extension access properties are generated based on the dataframe schema. The name and type of properties are inferred from the name and type of the corresponding columns.

Example

Here's an example of how the same operations can be performed via different Access APIs:

Note: In most of the code snippets in this documentation, a tab selector lets you switch between Access APIs.

String API

DataFrame.read("titanic.csv")
    .add("lastName") { "name"<String>().split(",").last() }
    .dropNulls("age")
    .filter {
        "survived"<Boolean>() &&
            "home"<String>().endsWith("NY") &&
            "age"<Int>() in 10..20
    }

Extension Properties API

val df /* : AnyFrame */ = DataFrame.read("titanic.csv")

df.add("lastName") { name.split(",").last() }
    .dropNulls { age }
    .filter { survived && home.endsWith("NY") && age in 10..20 }

The titanic.csv file can be found here.

The String API is the simplest and the least safe of the two. Its main advantage is that it can be used at any time, including when accessing newly added columns in chained calls. So we can write something like:

df.add("weight") { ... } // add a new column `weight`, calculated by some expression
    .sortBy("weight") // sorting dataframe rows by its value
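
To make the chained call above concrete, here is a minimal sketch on a small in-memory frame. The sample data and the `item`, `unitWeight`, and `count` column names are made up for illustration, and the helper functions `sortedItems` and `missingColumnFails` are hypothetical. It assumes the Kotlin DataFrame library is on the classpath.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data: three columns, three rows, filled row by row.
val items = dataFrameOf("item", "unitWeight", "count")(
    "bolt", 5, 10,
    "nut", 2, 40,
    "washer", 1, 30,
)

// "weight" does not exist until add() runs, yet the next call in the chain
// can refer to it right away because the name is only resolved at runtime.
fun sortedItems(): List<Any?> {
    val sorted = items
        .add("weight") { "unitWeight"<Int>() * "count"<Int>() }
        .sortBy("weight")
    return sorted["item"].toList()
}

// The flip side: a misspelled name compiles fine and is expected to fail
// only when the lookup actually runs.
fun missingColumnFails(): Boolean =
    runCatching { items["wieght"] }.isFailure

fun main() {
    println(sortedItems())
    println(missingColumnFails())
}
```

The same flexibility that lets `sortBy("weight")` follow `add("weight")` directly is what defers every mistake in a name or type to execution time.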

In contrast, generated extension properties form the most convenient and the safest API. Using them, you can always be sure that you work with correct data and types. However, there's a bottleneck at the moment of generation. To get new extension properties, you have to run a cell in a notebook, which could lead to unnecessary variable declarations. Currently, we are working on a compiler plugin that generates these properties on the fly while typing!

API	Type-checking	Column names checking	Column existence checking
String API	Runtime	Runtime	Runtime
Extension Properties API	Generation-time	Generation-time	Generation-time
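
The difference between the two rows of the table can be illustrated with a toy model in plain Kotlin. This sketch does not use the DataFrame library and is not its actual implementation: `ToyRow`, its `get` function, and the hand-written `name` and `age` extension properties are hypothetical stand-ins for what the library derives from a data frame schema.

```kotlin
// Toy stand-in for a data frame row: values keyed by column name.
class ToyRow(val values: Map<String, Any?>) {
    // String-style access: the column name and the requested type
    // are both checked only when this call executes.
    inline fun <reified T> get(name: String): T {
        require(name in values) { "Column not found: $name" }
        return values[name] as T
    }
}

// Written by hand here; in the real library such properties are generated
// from the schema, so their names and types always match the columns.
val ToyRow.name: String get() = get<String>("name")
val ToyRow.age: Int get() = get<Int>("age")

fun main() {
    val row = ToyRow(mapOf("name" to "Doe, Jane", "age" to 15))

    // Both styles read the same value.
    println(row.get<Int>("age") in 10..20)
    println(row.age in 10..20)

    // String-style mistakes compile and surface only at runtime.
    println(runCatching { row.get<Int>("agee") }.isFailure)  // wrong name
    println(runCatching { row.get<Int>("name") }.isFailure)  // wrong type

    // Property-style mistakes are rejected before the program runs:
    // row.agee           // does not compile: unresolved reference
    // row.name in 10..20 // does not compile: String is not an Int
}
```

In this model, the string-based calls move all checks to runtime, while the properties turn the same mistakes into compile errors, which mirrors the trade-off summarized in the table.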
