Write

Writing to CSV

You can write DataFrame in CSV format to file, to String or to Appendable (i.e. to Writer).

df.writeCsv(file)

val csvStr = df.toCsvStr(delimiter = ';', recordSeparator = System.lineSeparator())

Writing to JSON

You can write your DataFrame in JSON format to file, to string or to Appendable (i.e. to Writer).

df.writeJson(file)

val jsonStr = df.toJson(prettyPrint = true)

Write to Excel spreadsheet

implementation("org.jetbrains.kotlinx:dataframe-excel:$dataframe_version")

You can write your DataFrame in XLS, XLSX format to a file, OutputStream or Workbook object.

df.writeExcel(file)

/**
 * Do something with generated sheets. Here we set bold style for headers and italic style for first data column
 */
fun setStyles(sheet: Sheet) {
    val headerFont = sheet.workbook.createFont()
    headerFont.bold = true
    val headerStyle = sheet.workbook.createCellStyle()
    headerStyle.setFont(headerFont)

    val indexFont = sheet.workbook.createFont()
    indexFont.italic = true
    val indexStyle = sheet.workbook.createCellStyle()
    indexStyle.setFont(indexFont)

    sheet.forEachIndexed { index, row ->
        if (index == 0) {
            for (cell in row) {
                cell.cellStyle = headerStyle
            }
        } else {
            row.first().cellStyle = indexStyle
        }
    }
}

// Create a workbook (or use existing)
val wb = WorkbookFactory.create(true)

// Create different sheets from different data frames in the workbook
val allPersonsSheet = df.writeExcel(wb, sheetName = "allPersons")
val happyPersonsSheet =
    df.filter { person -> person.isHappy }.remove("isHappy").writeExcel(wb, sheetName = "happyPersons")
val unhappyPersonsSheet =
    df.filter { person -> !person.isHappy }.remove("isHappy").writeExcel(wb, sheetName = "unhappyPersons")

// Do anything you want by POI
listOf(happyPersonsSheet, unhappyPersonsSheet).forEach { setStyles(it) }

// Save the result
file.outputStream().use { wb.write(it) }
wb.close()

// Create a new Excel workbook with a single sheet called "allPersons", replacing the file if it already exists -> Current sheets: allPersons
df.writeExcel(file, sheetName = "allPersons")
// Add a new sheet to the previous file without replacing it, by setting keepFile = true -> Current sheets: allPersons, happyPersons
df.filter { person -> person.isHappy }.remove("isHappy")
    .writeExcel(file, sheetName = "happyPersons", keepFile = true)
// Add a new sheet to the previous file without replacing it, by setting keepFile = true -> Current sheets: allPersons, happyPersons, unhappyPersons
df.filter { person -> !person.isHappy }.remove("isHappy")
    .writeExcel(file, sheetName = "unhappyPersons", keepFile = true)

Writing to Apache Arrow formats

implementation("org.jetbrains.kotlinx:dataframe-arrow:$dataframe_version")

df.writeArrowIPC(file)
// or
df.writeArrowFeather(file)

val ipcByteArray: ByteArray = df.saveArrowIPCToByteArray()
// or
val featherByteArray: ByteArray = df.saveArrowFeatherToByteArray()

The second approach is a bit more tricky. You have to specify schema itself and casting behavior mode as ArrowWriter parameters. Behavior Mode has four independent switchers: restrictWidening, restrictNarrowing, strictType, strictNullable. You can use Mode.STRICT (this is default), Mode.LOYAL or any combination you want. The ArrowWriter object should be closed after using because Arrow uses random access buffers not managed by Java GC. Finally, you can specify a callback to be invoked if some data is lost or can not be saved according to your schema.

// Get schema from anywhere you want. It can be deserialized from JSON, generated from another dataset
// (including the DataFrame.columns().toArrowSchema() method), created manually, and so on.
val schema = Schema.fromJSON(schemaJson)

df.arrowWriter(

    // Specify your schema
    targetSchema = schema,

    // Specify desired behavior mode
    mode = ArrowWriter.Mode(
        restrictWidening = true,
        restrictNarrowing = true,
        strictType = true,
        strictNullable = false,
    ),

    // Specify mismatch subscriber
    mismatchSubscriber = writeMismatchMessage,

    ).use { writer: ArrowWriter ->

    // Save to any format and sink, like in the previous example
    writer.writeArrowFeather(file)
}

Write﻿

Writing to CSV﻿

Writing to JSON﻿

Write to Excel spreadsheet﻿

Writing to Apache Arrow formats﻿

warning

tip

tip

Writing to CSV

Writing to JSON

Write to Excel spreadsheet

Writing to Apache Arrow formats