DataFrame 1.0 Help

OpenAPI

Kotlin DataFrame provides support for reading and writing JSON data that conforms to OpenAPI 3.0 specifications. This feature is useful when working with APIs that expose structured data defined via OpenAPI schemas.

openapi modules

About the dataframe-openapi/dataframe-openapi-generator modules:

  • You only need the dataframe-openapi-generator module to generate DataSchema interfaces and helper functions for reading JSON. It does not need to be included in your published artifact.

  • You do need to include the dataframe-openapi module in your artifact, because it contains small helper functions that the generated DataSchema interfaces call when reading JSON.

These are not included in the general dataframe artifact due to their experimental status.
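As a rough sketch, the two modules might be declared like this in a Gradle Kotlin DSL build. The version string is a placeholder, and putting the generator on testImplementation is an assumption (that generation is run from test or tooling code), chosen only to keep it out of the published runtime dependencies.

```kotlin
// build.gradle.kts (sketch; version and configuration names are assumptions)
val dataframeVersion = "1.0.0" // placeholder: use the version you actually depend on

dependencies {
    // Runtime helpers called by the generated DataSchema interfaces; ships with your artifact.
    implementation("org.jetbrains.kotlinx:dataframe-openapi:$dataframeVersion")

    // Code generator; assumed here to be needed only by test/tooling code,
    // so it stays off the published runtime classpath.
    testImplementation("org.jetbrains.kotlinx:dataframe-openapi-generator:$dataframeVersion")
}
```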

HTTP API calls

DataFrame uses only the types defined in the OpenAPI spec, not the API paths. If you want to use everything OpenAPI has to offer, check out the OpenAPI generator support in IntelliJ IDEA, which can use a multitude of libraries, such as Ktor, to handle the API calls.

If you get raw API results in JSON format, you can convert the results to DataFrame-generated data schema types like:

MyName.SomeOpenApiType.readJsonStr(text = rawJsonArrayString): DataFrame<SomeOpenApiType>

See Examples for how to generate these.

If you receive maps or lists of objects, those can easily be converted to a dataframe too.
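A minimal sketch of that conversion, assuming your HTTP client has already deserialized the response. TagDto is a hypothetical class invented for this illustration; toDataFrame() comes from the core DataFrame API.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.print
import org.jetbrains.kotlinx.dataframe.api.toDataFrame

// Hypothetical response type, standing in for whatever your HTTP client returns.
data class TagDto(val id: Long?, val name: String?)

fun main() {
    // A list of objects: each property becomes a column, each object a row.
    val fromObjects = listOf(TagDto(1L, "friendly"), TagDto(2L, null)).toDataFrame()
    fromObjects.print()

    // A map of column name to column values.
    val fromMap = mapOf(
        "id" to listOf(1L, 2L),
        "name" to listOf("friendly", null),
    ).toDataFrame()
    fromMap.print()
}
```

Both results are plain dataframes; converting them to a generated OpenAPI schema type would be a separate step.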

You can still perform simple GET calls with DataFrame functions, like MyName.SomeOpenApiType.readJson("some/api/url"), see Examples.

Examples

Here is an example showing how to turn an OpenAPI spec into usable data schemas:

val url = "https://petstore3.swagger.io/api/v3/openapi.json"
val code = OpenApi().readCodeForGeneration(
    stream = URI(url).toURL().openStream(),
    name = "PetStore",
    extensionProperties = false, // optional, only needed without compiler plugin
    generateHelperCompanionObject = false, // optional, used inside notebooks
)
println(code)

This uses the generic DataFrame SupportedCodeGenerationFormat interface, which is also used by importDataSchema() in notebooks.

To provide some Swagger-parser-specific arguments, like auth and options, use:

val url = "https://petstore3.swagger.io/api/v3/openapi.json"
val code = readOpenApi(
    uri = url,
    name = "PetStore",
    extensionProperties = false, // only needed without compiler plugin
    generateHelperCompanionObject = false, // optional, used inside notebooks
    auth = null, // optional, if authentication is needed to access the url
    options = null, // optional, Swagger parse options
)
println(code)

or, if you've already read your OpenAPI file as String:

val openApiAsString = URI("https://petstore3.swagger.io/api/v3/openapi.json").toURL().readText()
val code = readOpenApiAsString(
    openApiAsString = openApiAsString,
    name = "PetStore",
    extensionProperties = false, // only needed without compiler plugin
    generateHelperCompanionObject = false, // optional, used inside notebooks
    auth = null, // optional, if authentication is needed to access the url
    options = null, // optional, Swagger parse options
)
println(code)

Generated code:

interface PetStore {
    @DataSchema(isOpen = false)
    interface Order {
        val id: Long?
        val petId: Long?
        val quantity: Int?
        val shipDate: LocalDateTime?
        val status: Status?
        val complete: Boolean?

        public companion object {
            public val keyValuePaths: List<JsonPath>
                get() = listOf()

            public fun DataFrame<*>.convertToOrder(convertTo: ConvertSchemaDsl<Order>.() -> Unit = {}): DataFrame<Order> =
                convertTo<Order> {
                    convertDataRowsWithOpenApi()
                    convertTo()
                }

            public fun readJson(url: URL): DataFrame<Order> =
                DataFrame
                    .readJson(url, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToOrder()

            public fun readJson(path: String): DataFrame<Order> =
                DataFrame
                    .readJson(path, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToOrder()

            public fun readJson(stream: InputStream): DataFrame<Order> =
                DataFrame
                    .readJson(stream, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToOrder()

            public fun readJsonStr(text: String): DataFrame<Order> =
                DataFrame
                    .readJsonStr(text, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToOrder()
        }
    }

    enum class Status(override val value: String) : DataSchemaEnum {
        PLACED("placed"),
        APPROVED("approved"),
        DELIVERED("delivered");
    }

    @DataSchema(isOpen = false)
    interface Category {
        val id: Long?
        val name: String?

        public companion object {
            public val keyValuePaths: List<JsonPath>
                get() = listOf()

            public fun DataFrame<*>.convertToCategory(convertTo: ConvertSchemaDsl<Category>.() -> Unit = {}): DataFrame<Category> =
                convertTo<Category> {
                    convertDataRowsWithOpenApi()
                    convertTo()
                }

            public fun readJson(url: URL): DataFrame<Category> =
                DataFrame
                    .readJson(url, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToCategory()

            public fun readJson(path: String): DataFrame<Category> =
                DataFrame
                    .readJson(path, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToCategory()

            public fun readJson(stream: InputStream): DataFrame<Category> =
                DataFrame
                    .readJson(stream, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToCategory()

            public fun readJsonStr(text: String): DataFrame<Category> =
                DataFrame
                    .readJsonStr(text, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToCategory()
        }
    }

    @DataSchema(isOpen = false)
    interface User {
        val id: Long?
        val username: String?
        val firstName: String?
        val lastName: String?
        val email: String?
        val password: String?
        val phone: String?
        val userStatus: Int?

        public companion object {
            public val keyValuePaths: List<JsonPath>
                get() = listOf()

            public fun DataFrame<*>.convertToUser(convertTo: ConvertSchemaDsl<User>.() -> Unit = {}): DataFrame<User> =
                convertTo<User> {
                    convertDataRowsWithOpenApi()
                    convertTo()
                }

            public fun readJson(url: URL): DataFrame<User> =
                DataFrame
                    .readJson(url, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToUser()

            public fun readJson(path: String): DataFrame<User> =
                DataFrame
                    .readJson(path, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToUser()

            public fun readJson(stream: InputStream): DataFrame<User> =
                DataFrame
                    .readJson(stream, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToUser()

            public fun readJsonStr(text: String): DataFrame<User> =
                DataFrame
                    .readJsonStr(text, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToUser()
        }
    }

    @DataSchema(isOpen = false)
    interface Tag {
        val id: Long?
        val name: String?

        public companion object {
            public val keyValuePaths: List<JsonPath>
                get() = listOf()

            public fun DataFrame<*>.convertToTag(convertTo: ConvertSchemaDsl<Tag>.() -> Unit = {}): DataFrame<Tag> =
                convertTo<Tag> {
                    convertDataRowsWithOpenApi()
                    convertTo()
                }

            public fun readJson(url: URL): DataFrame<Tag> =
                DataFrame
                    .readJson(url, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToTag()

            public fun readJson(path: String): DataFrame<Tag> =
                DataFrame
                    .readJson(path, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToTag()

            public fun readJson(stream: InputStream): DataFrame<Tag> =
                DataFrame
                    .readJson(stream, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToTag()

            public fun readJsonStr(text: String): DataFrame<Tag> =
                DataFrame
                    .readJsonStr(text, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToTag()
        }
    }

    @DataSchema(isOpen = false)
    interface Pet {
        val id: Long?
        val name: String
        val category: Category?
        val photoUrls: List<String>
        val tags: DataFrame<Tag?>
        val status: Status1?

        public companion object {
            public val keyValuePaths: List<JsonPath>
                get() = listOf()

            public fun DataFrame<*>.convertToPet(convertTo: ConvertSchemaDsl<Pet>.() -> Unit = {}): DataFrame<Pet> =
                convertTo<Pet> {
                    convertDataRowsWithOpenApi()
                    convertTo()
                }

            public fun readJson(url: URL): DataFrame<Pet> =
                DataFrame
                    .readJson(url, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToPet()

            public fun readJson(path: String): DataFrame<Pet> =
                DataFrame
                    .readJson(path, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToPet()

            public fun readJson(stream: InputStream): DataFrame<Pet> =
                DataFrame
                    .readJson(stream, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToPet()

            public fun readJsonStr(text: String): DataFrame<Pet> =
                DataFrame
                    .readJsonStr(text, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToPet()
        }
    }

    enum class Status1(override val value: String) : DataSchemaEnum {
        AVAILABLE("available"),
        PENDING("pending"),
        SOLD("sold");
    }

    @DataSchema(isOpen = false)
    interface ApiResponse {
        val code: Int?
        val type: String?
        val message: String?

        public companion object {
            public val keyValuePaths: List<JsonPath>
                get() = listOf()

            public fun DataFrame<*>.convertToApiResponse(convertTo: ConvertSchemaDsl<ApiResponse>.() -> Unit = {}): DataFrame<ApiResponse> =
                convertTo<ApiResponse> {
                    convertDataRowsWithOpenApi()
                    convertTo()
                }

            public fun readJson(url: URL): DataFrame<ApiResponse> =
                DataFrame
                    .readJson(url, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToApiResponse()

            public fun readJson(path: String): DataFrame<ApiResponse> =
                DataFrame
                    .readJson(path, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToApiResponse()

            public fun readJson(stream: InputStream): DataFrame<ApiResponse> =
                DataFrame
                    .readJson(stream, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToApiResponse()

            public fun readJsonStr(text: String): DataFrame<ApiResponse> =
                DataFrame
                    .readJsonStr(text, typeClashTactic = ANY_COLUMNS, keyValuePaths = keyValuePaths)
                    .convertToApiResponse()
        }
    }
}

Using the generated code:

val df: DataFrame<PetStore.Pet> = PetStore.Pet.readJson("$baseUrl/pet/10")

See our json-openapi example project for how to use OpenAPI in combination with DataFrame inside a Gradle project.

Notebooks

To enable OpenAPI support in Kotlin Notebook, use:

%use dataframe(enableExperimentalOpenApi = true)

See Import Data Schemas, e.g. from OpenAPI, in Kotlin Notebook for details on how to work with OpenAPI-based data in notebooks, as well as the OpenAPI guide notebook.

Some Background around OpenAPI Support in DataFrame

DataFrame does not (yet) have a go-to system for reading just types to generate data schemas with. Usually, you read a sample of your data into a dataframe, generate the data schema from it, and then use it in your pipeline.
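The sample-based workflow can be sketched as follows. The JSON sample is made up for illustration; readJsonStr and schema() are the regular DataFrame functions for reading JSON text and inspecting the inferred column types.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.schema
import org.jetbrains.kotlinx.dataframe.io.readJsonStr

fun main() {
    // A made-up sample; a real one would come from your data source.
    val sample = """[{"id": 1, "name": "doggie"}, {"id": 2, "name": "kitty"}]"""

    // Column names and types are inferred from the sample rows only.
    val df = DataFrame.readJsonStr(sample)

    // Print the inferred schema, which serves as the basis for a data schema.
    println(df.schema())
}
```

Because the types are inferred from whatever rows happen to be in the sample, anything the sample does not exhibit (a missing value, for example) is not reflected in the resulting schema.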

However, what if your data source already provides its own types, like OpenAPI does for JSON? Surely that will be safer than using a sample.

This is, for instance, why dataframe-jdbc has DataFrameSchema.readSqlTable(): DataFrameSchema, which returns a schema you can print, copy into your project, and use to safely cast your data.

However, for OpenAPI, we found that generating DataSchema interfaces alone was not enough to properly read JSON with types defined in an OpenAPI schema. OpenAPI schemas support inheritance, enums, and other complex features that require specialized handling. This is why there is a separate '-generator' module, which generates the DataSchema interfaces, the helper functions for reading JSON, and any type aliases and enums needed to read the JSON correctly according to the OpenAPI schema.

13 May 2026