Quick Start Guide

Basics

Usage

To integrate Kandy and DataFrame into an interactive notebook, use the following commands:

Note: Without specifying %useLatestDescriptors, the version included in the Kotlin Jupyter kernel will be used.

// Fetches the latest versions
%useLatestDescriptors
// Adds the dataframe library with the latest version
%use dataframe
// Adds the kandy library with the latest version
%use kandy

Tip: Kotlin Notebook offers unique features with the dataframe library.

Data

In Kandy, the primary data model for plotting is a "named data" or "dataframe". This model comprises a set of named columns with values of equal length. The input data should be structured as Map<String, List<*>>.

val simpleDataset = mapOf(
    "time, ms" to listOf(12, 87, 130, 149, 200, 221, 250),
    "relativeHumidity" to listOf(0.45, 0.3, 0.21, 0.15, 0.22, 0.36, 0.8),
    "flowOn" to listOf(true, true, false, false, true, false, false),
)

// Column pointers reference dataset columns in mappings; they can be created in three ways.
// 1. Using the `column()` function with the specified column type and name:
val timeMs = column<Int>("time, ms")
// 2. Utilizing the String API, similar to the method above, but using String invocation:
val humidity = "relativeHumidity"<Double>()
// 3. Delegating an unnamed column - its name will be derived from the variable name:
val flowOn by column<Boolean>()
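
The equal-length requirement described above can be checked before a dataset is passed to a plot. The following is a minimal sketch in plain Kotlin that does not use Kandy at all; the helper names columnLengths and isRectangular are hypothetical and are not part of the library.

```kotlin
// Hypothetical helper, not part of Kandy: reports the length of every named column.
fun columnLengths(data: Map<String, List<*>>): Map<String, Int> =
    data.mapValues { (_, values) -> values.size }

// True when all columns hold the same number of values (an empty map counts as valid).
fun isRectangular(data: Map<String, List<*>>): Boolean =
    columnLengths(data).values.toSet().size <= 1

fun main() {
    val dataset = mapOf(
        "time, ms" to listOf(12, 87, 130, 149, 200, 221, 250),
        "relativeHumidity" to listOf(0.45, 0.3, 0.21, 0.15, 0.22, 0.36, 0.8),
        "flowOn" to listOf(true, true, false, false, true, false, false),
    )
    println(columnLengths(dataset)) // {time, ms=7, relativeHumidity=7, flowOn=7}
    println(isRectangular(dataset)) // true
}
```

A check like this is only a convenience: it reports a ragged dataset early, with the offending column lengths visible, instead of leaving the problem to surface at plotting time.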

Plot Creation

In Kandy, to create a plot, you start by calling the plot() function and passing your dataset as an argument. This function establishes a context within which you can add various layers to your plot.

plot(simpleDataset) {
    points {
        // Maps values from the "time, ms" column to the X-axis
        x(timeMs)
        // Maps values from the "relativeHumidity" column to the Y-axis
        y(humidity)
        // Sets the size of the points to 4.5
        size = 4.5
        // Maps values from the "flowOn" column to the color attribute
        color(flowOn)
    }
}

Layers, aesthetics, mappings, and scales

Each plot layer is defined by its geometrical entity (or geom), which determines the layer's visual representation. Geoms have associated aesthetic attributes (or aesthetics/aes), which can be either positional (like x, y, yMin, yMax, middle) or non-positional (such as color, size, width). Non-positional aesthetics have specific types (e.g., size is associated with Double, color with Color).

// Using `.constant()` for positional aes:
x.constant(12.0f)
yMin.constant(10.0)

// Assignment for non-positional aes:
size = 5.0
color = Color.RED

// With `ColumnReference`:
x(timeMs)
size(humidity)
color(flowOn)

// Using raw `String`:
x("time, ms")
size("relativeHumidity")

// Directly providing values, e.g., with `Iterable`:
x(listOf(12, 87, 130, 149, 200, 221, 250))
// Optional source ID can be set:
color(listOf(true, true, false, false, true, false, false), "flow on")

plot(simpleDataset) {
    points {
        x(listOf(12, 87, 130, 149, 200, 221, 250)) {
            scale = continuous(min = 0, max = 270) // Using `min`/`max`
        }
        y(humidity) {
            scale = continuous(0.0..1.0) // Using `Range`
        }
        size(humidity) {
            scale = continuous(range = 5.0..20.0)
        }
        color(flowOn) {
            scale = categorical(true to Color.RED, false to Color.BLUE)
        }
        symbol(flowOn) {
            scale = categorical() // Default scale
        }
        alpha = 0.9
    }
}

val xReversedScale = Scale.continuousPos<Int>(transform = Transformation.REVERSE)
val yScale = Scale.continuousPos(0.0..0.9)
val colorScale = Scale.categorical<Color, Boolean>(
    range = listOf(Color.RED, Color.GREEN)
)

plot(simpleDataset) {
    points {
        x(listOf(12, 87, 130, 149, 200, 221, 250)) {
            scale = xReversedScale
        }
        y(humidity) {
            scale = yScale
        }
        color(flowOn) {
            scale = colorScale
        }
        symbol = Symbol.ASTERIX
        size = 6.6
    }
}

Scale parameters: axis and legend

// Creating a plot with customized axis and legend
plot(simpleDataset) {
    points {
        // Configuring the x-axis for time
        x(timeMs) {
            axis.name = "Time from start of counting,\n milliseconds"
        }
        // Configuring the y-axis for humidity
        y(humidity) {
            scale = continuous(0.0..1.0) // Setting scale for humidity
            axis {
                name = "Relative humidity" // Axis label
                breaksLabeled(0.0 to "0%", 0.3 to "30%", 0.6 to "60%", 0.9 to "90%") // Custom axis breaks
            }
        }
        size = 12.0 // Set size of points
        // Configuring the legend for humidity
        color(humidity) {
            scale = continuous()
            legend {
                name = "rel. humidity" // Legend label
                type = LegendType.ColorBar(40.0, 190.0, 15) // Legend type and dimensions
                breaks(format = "e") // Legend breaks format
            }
        }
    }
}

Global X-axis and Y-axis mappings and scales

In Kandy, you can use global mappings for X and Y across multiple layers, simplifying the process when these mappings are common. Each layer inherits the global mapping unless overridden locally.

// Using global mappings for X and Y
plot(simpleDataset) {
    x(timeMs) { scale = continuous(max = 275) }
    y(humidity)
    points {
        // Inherits X and Y from the global context
        size = 4.5
        color(flowOn)
    }
    line {
        // Inherits X, overrides Y
        y(listOf(0.49, 0.39, 0.1, 0.4, 0.8, 0.8, 0.9))
        width = 3.0
        color = Color.RED
    }
}

// Configuring axis scales independently
plot(simpleDataset) {
    points {
        x(listOf(10, 20, 30, 40, 50, 60, 70))
        y(humidity)
        size = 4.5
        color(flowOn)
    }
    x.axis.name = "time, ms"
    y {
        scale = continuous(min = 0.0)
        axis.breaks(format = ".2f")
    }
}

plot(simpleDataset) {
    points {
        x(listOf(10, 20, 30, 40, 50, 60, 70))
        y(humidity)
        size = 4.5
        color(flowOn)
    }
    // Alternate brief notation if we want to set the axis limits (continuous scale)
    x.axis.limits = 0..80
    y.axis.limits = 0.0..1.0
}

Example of using free scale in a boxplot:

// Demonstrating free scale in a boxplot
plot(
    mapOf(
        "x" to listOf("a", "b", "c"),
        // Boxplot sub-positional aesthetics
        "min" to listOf(0.8, 0.4, 0.6),
        "lower" to listOf(0.9, 1.4, 0.8),
        "middle" to listOf(1.5, 2.4, 1.6),
        "upper" to listOf(1.9, 3.4, 1.7),
        "max" to listOf(3.1, 4.4, 2.6),
        "width" to listOf(0.1, 1.4, 14.0)
    )
) {
    boxes {
        x("x"<String>())
        // Sub-y aesthetics with a free scale
        yMin("min"<Double>())
        lower("lower"<Double>())
        middle("middle"<Double>())
        upper("upper"<Double>())
        yMax("max"<Double>())
        fatten = 4.5
        alpha = 0.2
        borderLine.color = Color.hex(0x9A2A2A)
    }
    y.axis {
        name = "weight"
        limits = 0.0..5.0
    }
}

Raw source mapping

Kandy allows for direct data source mapping in plots, providing an alternative to using dataset column pointers. This method supports mapping from an Iterable source and offers the option to name these sources for clearer visualization.

// Sample data with months, number of days, and seasons
val month = listOf(
    "January", "February",
    "March", "April", "May",
    "June", "July", "August",
    "September", "October", "November",
    "December"
)
val numberOfDays = listOf(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
val season = listOf(
    "winter", "winter",
    "spring", "spring", "spring",
    "summer", "summer", "summer",
    "autumn", "autumn", "autumn",
    "winter"
)

// Plotting using raw source mapping
plot {
    bars {
        // Mapping 'month' directly with a name
        x(month, "month") { scale = categorical() }
        // Mapping 'numberOfDays' directly
        y(numberOfDays, "number of days")
        // Mapping 'season' with color, named source, and categorical scale
        fillColor(season, "season") {
            scale = categorical(
                listOf(Color.BLUE, Color.GREEN, Color.RED, Color.ORANGE),
                listOf("winter", "spring", "summer", "autumn"),
            )
        }
    }
}

Kotlin DataFrame API

Kandy integrates seamlessly with the Kotlin DataFrame library, allowing the use of DataFrame as a data source and DataColumn for mappings. This integration simplifies the process of creating visualizations, as you can utilize auto-generated property columns without manually creating column pointers.

// Reading a CSV file into a DataFrame
val mpgDF =
    DataFrame.readCSV("https://raw.githubusercontent.com/JetBrains/lets-plot-kotlin/master/docs/examples/data/mpg.csv")
// Display the first five rows of the DataFrame
mpgDF.head()

untitled	manufacturer	model	displ	year	cyl	trans	drv	cty	hwy	fl	class
1	audi	a4	1.8	1999	4	auto(l5)	f	18	29	p	compact
2	audi	a4	1.8	1999	4	manual(m5)	f	21	29	p	compact
3	audi	a4	2.0	2008	4	manual(m6)	f	20	31	p	compact
4	audi	a4	2.0	2008	4	auto(av)	f	21	30	p	compact
5	audi	a4	2.8	1999	6	auto(l5)	f	16	26	p	compact

// Show the schema of the DataFrame
mpgDF.schema()

untitled: Int
manufacturer: String
model: String
displ: Double
year: Int
cyl: Int
trans: String
drv: String
cty: Int
hwy: Int
fl: String
class: String

// Create a plot using the DataFrame
val mpgInfoPlot = mpgDF.plot {
    points {
        x(displ) // Auto-generated DataFrame columns
        y(cty) {
            scale = continuous(8..34)
        }
        symbol = Symbol.CIRCLE_FILLED
        color = Color.GREY
        alpha = 0.7
        fillColor(drv)
        size(hwy) {
            scale = continuous(5.0..15.0)
            legend.breaks(listOf(15, 30, 40), format = "d")
        }
    }
}
mpgInfoPlot

// Creating a plot with manual column pointers
val mpgCountPlot = mpgDF.groupBy { drv }.aggregate {
    count() into "count"
}.plot {
    bars {
        x(drv)
        y("count"<Int>())
        alpha = 0.75
    }
}
mpgCountPlot

Grouping

The grouping feature in Kandy is particularly useful for plotting collective geometries, where multiple data units are represented by a single geometric object, such as a line. Grouping can be performed either by providing a grouped dataframe (GroupBy) as a dataset or directly within the plot DSL.

// Dataset with a grouping column
val lineDataset = mapOf(
    "timeG" to listOf(1.0, 2.2, 3.4, 6.6, 2.1, 4.4, 6.0, 1.5, 4.7, 6.7),
    "value" to listOf(112.0, 147.3, 111.1, 200.6, 90.8, 110.2, 130.4, 100.1, 90.0, 121.8),
    "c-type" to listOf("A", "A", "A", "A", "B", "B", "B", "C", "C", "C")
)

// Using grouping in the plot
plot(lineDataset) {
    groupBy("c-type") {
        line {
            x("timeG")
            y("value")
        }
    }
}

// Convert the map to a DataFrame
val lineDF = lineDataset.toDataFrame()

// Apply grouping on the DataFrame
lineDF.groupBy {
    `c-type`
}.plot {
    line {
        x(timeG)
        y(value)
    }
}

// Grouping with additional color mapping
lineDF.plot {
    groupBy(`c-type`) {
        line {
            x(timeG)
            y(value)
            color(`c-type`)
            width = 4.0
        }
    }
}

Implicit grouping

// Plotting with implicit grouping
lineDF.plot {
    line {
        x(timeG)
        y(value)
        // Implicit grouping based on `c-type`
        color(`c-type`)
        width = 4.0
    }
}

In this example, the color aesthetic is mapped to the c-type column, which is a categorical variable. Kandy implicitly groups the data based on the unique values in the c-type column and assigns different colors to each group. This results in a visually distinct representation for each category, making it easier to differentiate between them in the plot.
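
To make the effect of grouping concrete, the sketch below splits the same sample values into one series per c-type value using only the Kotlin standard library. It is a conceptual illustration rather than Kandy's implementation; the Row class and the splitByType function are hypothetical names introduced here.

```kotlin
// One row of the long-format dataset used in the grouping examples.
data class Row(val timeG: Double, val value: Double, val cType: String)

// Splits rows into one series per distinct key; a grouped line layer draws one line per series.
fun splitByType(rows: List<Row>): Map<String, List<Row>> = rows.groupBy { it.cType }

fun main() {
    val timeG = listOf(1.0, 2.2, 3.4, 6.6, 2.1, 4.4, 6.0, 1.5, 4.7, 6.7)
    val value = listOf(112.0, 147.3, 111.1, 200.6, 90.8, 110.2, 130.4, 100.1, 90.0, 121.8)
    val cType = listOf("A", "A", "A", "A", "B", "B", "B", "C", "C", "C")

    val rows = timeG.indices.map { i -> Row(timeG[i], value[i], cType[i]) }
    for ((key, series) in splitByType(rows)) {
        println("$key: ${series.size} points")
    }
    // A: 4 points
    // B: 3 points
    // C: 3 points
}
```

Without such a split, all ten points would belong to a single series and be connected as one line, which is why collective geometries depend on grouping.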

Position

The position parameter plays a crucial role in adjusting the spatial arrangement of objects within a single layer, especially when dealing with grouped data. It controls how objects from different groups are positioned relative to each other.

val meanCylDf = mpgDF.groupBy { cyl and drv }.mean { cty }
// Display the first five rows of the grouped DataFrame
meanCylDf.head(5)

cyl	drv	cty
4	f	22.068966
6	f	17.186047
4	4	18.347826
6	4	14.781250
8	4	12.104167

val meanCylPlot = meanCylDf.groupBy { cyl }.plot {
    bars {
        x(drv)
        y(cty)
        fillColor(cyl) { legend.breaks(format = "d") }

        position = Position.dodge()
    }
}
meanCylPlot

meanCylDf.groupBy { cyl }.plot {
    bars {
        x(drv)
        y(cty)
        fillColor(cyl) { legend.breaks(format = "d") }
        alpha = 0.4

        position = Position.identity()
    }
}

meanCylDf.groupBy { cyl }.plot {
    bars {
        x(drv)
        y(cty)
        fillColor(cyl) { legend.breaks(format = "d") }

        position = Position.stack()
    }
}
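
The three examples above differ only in how bars that share an x value are arranged: dodge places them side by side, identity leaves them overlapping at the same baseline, and stack piles them on top of each other. The sketch below illustrates the stacking idea in plain Kotlin, under the assumption that each bar starts where the previous bar at the same x ended. It is not Kandy's implementation, the order in which the library stacks groups may differ, and the Bar, StackedBar, and stack names are hypothetical.

```kotlin
// A bar before positioning: x category, group label, and height.
data class Bar(val x: String, val group: Int, val height: Double)

// A bar after stacking: it spans from yMin to yMax.
data class StackedBar(val x: String, val group: Int, val yMin: Double, val yMax: Double)

// Conceptual stacking: within each x, every bar starts where the previous one ended.
fun stack(bars: List<Bar>): List<StackedBar> =
    bars.groupBy { it.x }.flatMap { (_, sameX) ->
        var base = 0.0
        sameX.map { bar ->
            StackedBar(bar.x, bar.group, base, base + bar.height).also { base = it.yMax }
        }
    }

fun main() {
    // Heights are rounded from the grouped means shown above.
    val bars = listOf(
        Bar("f", 4, 22.0), Bar("f", 6, 17.0),
        Bar("4", 4, 18.0), Bar("4", 6, 15.0), Bar("4", 8, 12.0),
    )
    stack(bars).forEach(::println)
}
```

With identity positioning, every bar in this sketch would keep yMin at 0.0, which is why the identity example above lowers alpha to keep overlapping bars visible.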

Experimental

Multiplot

There are three ways to create a multiplot: plotGrid, plotBunch and faceting.

Plot Grid

The plotGrid function allows for arranging multiple plots in a structured grid format, offering a cohesive view of different data visualizations. Example:

plotGrid(listOf(mpgInfoPlot, mpgCountPlot, meanCylPlot), nCol = 2)

Plot Bunch

plotBunch provides a more flexible approach to multiplot creation, enabling custom placement and sizing for each plot. This is ideal for tailored data presentations. Example:

plotBunch {
    add(mpgCountPlot, 0, 0, 400, 200)
    add(meanCylPlot, 400, 0, 300, 200)
    add(mpgInfoPlot, 0, 200, 700, 300)
}

Faceting

mpgDF.plot {
    points {
        x(displ)
        y(cty) {
            scale = continuous(8..34)
        }
        symbol = Symbol.CIRCLE_FILLED
        color = Color.GREY
        alpha = 0.7
        fillColor(drv)
        size(hwy) {
            scale = continuous(2.0..10.0)
        }
    }

    layout.size = 750 to 450

    facetGridX(drv, scalesSharing = ScalesSharing.FREE_X)
}

mpgDF.plot {
    points {
        x(displ)
        y(cty) {
            scale = continuous(8..34)
        }
        symbol = Symbol.CIRCLE_FILLED
        color = Color.GREY
        alpha = 0.7
        fillColor(drv)
        size(hwy) {
            scale = continuous(2.0..10.0)
        }
    }

    layout.size = 750 to 450

    facetGridY(cyl, format = "d")
}

mpgDF.plot {
    points {
        x(displ)
        y(cty) {
            scale = continuous(8..34)
        }

        symbol = Symbol.CIRCLE_FILLED
        color = Color.GREY
        fillColor(drv)
        size(hwy) {
            scale = continuous(2.0..10.0)
        }
    }

    layout.size = 750 to 450

    facetGrid(drv, cyl, yFormat = "d")
}

mpgDF.plot {
    points {
        x(displ)
        y(cty) {
            scale = continuous(8..34)
        }

        symbol = Symbol.CIRCLE_FILLED
        color = Color.GREY
        fillColor(drv)
        size(hwy) {
            scale = continuous(2.0..10.0)
        }
    }

    layout.size = 750 to 450

    facetWrap(nCol = 3, scalesSharing = ScalesSharing.FREE) {
        facet(drv)
        facet(cyl, order = OrderDirection.DESCENDING, format = "d")
    }
}

Statistics

Kandy offers a robust API for computing statistics directly within its DSL. This functionality is achieved through the stat family of functions, which transform raw data into a new dataset with calculated statistics. These statistics become accessible as columns and can be mapped to various plot aesthetics, scaled, or incorporated into tooltips.

val random = kotlin.random.Random(42)

val observation = List(1000) { random.nextDouble() }
val observationDataset = dataFrameOf(
    "obs" to observation
)

plot(observationDataset) {
    statBin(obs) {
        bars {
            // Simple mapping
            x(Stat.x)
            // Mapping with the scale
            y(Stat.count) {
                scale = continuous(0..100, transform = Transformation.REVERSE)
            }

            alpha = 0.5

            // Formatting of the stat value
            tooltips {
                // Line with the name of stat (i.e. "count") on the left side and its value on the right side
                line(Stat.count, format = "d")
            }
        }

        line {
            x(Stat.x)
            y(Stat.count)

            width = 2.5
            color = Color.RED
        }
    }
}

val histPlot = plot(observationDataset) {
    histogram(obs)

    layout.title = "`histogram`"
}
histPlot

val binBarPlot = plot(observationDataset) {
    statBin(obs) {
        bars {
            x(Stat.x)
            y(Stat.count)
        }
    }
    layout.title = "`statBin` + `bars`"
}

plotGrid(listOf(histPlot, binBarPlot), 2)
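
As a rough picture of what a bin statistic produces, the sketch below computes equal-width bins by hand and returns, for each bin, a centre, a count, and a density, mirroring the Stat.x, Stat.count, and Stat.density columns used in this section. It is a simplified illustration in plain Kotlin, not Kandy's algorithm: the default bin count, alignment, and boundary handling of the library are not modelled, and Bin and binEqualWidth are hypothetical names.

```kotlin
// One histogram bin: its centre, how many values fell into it, and the density.
data class Bin(val x: Double, val count: Int, val density: Double)

// Equal-width binning over [min, max]; density = count / (total * binWidth).
fun binEqualWidth(values: List<Double>, bins: Int): List<Bin> {
    require(bins > 0 && values.isNotEmpty())
    val min = values.minOf { it }
    val max = values.maxOf { it }
    // Fall back to a unit width when all values are identical.
    val width = if (max > min) (max - min) / bins else 1.0
    val counts = IntArray(bins)
    for (v in values) {
        // The maximum value is placed in the last bin rather than in a bin of its own.
        val index = ((v - min) / width).toInt().coerceIn(0, bins - 1)
        counts[index]++
    }
    return counts.mapIndexed { i, c ->
        Bin(min + (i + 0.5) * width, c, c / (values.size * width))
    }
}

fun main() {
    val random = kotlin.random.Random(42)
    val observation = List(1000) { random.nextDouble() }
    val result = binEqualWidth(observation, bins = 10)
    println(result.sumOf { it.count }) // 1000
}
```

Because the densities are scaled by the bin width, the bar areas sum to one, which is what distinguishes mapping y to a density from mapping it to a raw count.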

The histogram in Kandy is flexible, allowing for custom aesthetic bindings and the use of "stat-bin" statistics for mapping:

plot(observationDataset) {
    histogram(obs, binsOption = BinsOption.byWidth(0.05), binsAlign = BinsAlign.boundary(0.1)) {
        y(Stat.density)

        fillColor(Stat.count) {
            scale = continuous(Color.GREEN..Color.RED)
        }

        borderLine {
            color = Color.BLACK
            width = 0.3
        }

        tooltips(title = "${value(Stat.x)} ± 0.025") {
            line(Stat.density)
        }
    }
}

This statistical API is also compatible with Iterable data types, expanding its applicability across various data structures:

plotGrid(
    listOf(
    plot {
        statBin(observation) {
            points {
                x(Stat.density)
                y(Stat.x)
            }
        }
    },
    plot {
        histogram(observation)
    }
), 2)

Layout

Title, subtitle, caption, and size

You can personalize your plot's layout by configuring the title, subtitle, caption, and overall size. The layout block provides direct access to these properties, making it easy to create visually compelling charts.

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
    }

    // Layout parameters are accessible in this context
    layout {
        subtitle = "plot subtitle"
        caption = "plot caption \n important info"
        size = 800 to 600
    }
    // If you only need to set a title
    layout.title = "PLOT TITLE"
}

Styles

To apply a style:

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
    }
    layout.style(Style.Classic)
}

fun plotWithStyle(style: Style? = null, title: String? = null): org.jetbrains.kotlinx.kandy.ir.Plot {
    return mpgDF.plot {
        points {
            x(cty)
            y(hwy)
        }
        layout {
            style?.let {
                style(it)
            }
            this.title = title
        }
    }
}

plotGrid(
    listOf(
        plotWithStyle(Style.Classic, "\"Classic\" style"),
        plotWithStyle(Style.Grey, "\"Grey\" style"),
        plotWithStyle(Style.Light, "\"Light\" style"),
        plotWithStyle(Style.Minimal, "\"Minimal\" style"),
        plotWithStyle(Style.Minimal2, "\"Minimal2\" style (by default)"),
        plotWithStyle(Style.None, "\"None\" style"),
    ), 2
)

Custom Styles

val redLine = LayoutParameters.line(Color.RED)

val simpleCustomStyle = Style.createCustom {
    // use previously created parameters
    xAxis.line(redLine)
    // set up parameters
    yAxis.line {
        color = Color.RED
        width = 0.3
    }
    // remove ticks on both axes
    axis.ticks {
        blank = true
    }
}

plotWithStyle(simpleCustomStyle)

val blankAxesStyle = Style.createCustom {
    blankAxes()
}
plotWithStyle(blankAxesStyle)

Custom scales

Kandy extends the versatility of scales beyond standard options, providing custom color scales like hue, grey, brewer, gradient2, and gradientN.

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        size = 5.0
        color(drv) {
            scale = categoricalColorHue()
        }
    }
}

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        size = 5.0
        color(cty) {
            scale = continuousColorGrey()
        }
    }
}

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        size = 5.0
        color(hwy) {
            scale = continuousColorGradient2(Color.BLUE, Color.ORANGE, Color.RED, 30.0)
        }
    }
}

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        size = 5.0
        color(cty) {
            scale = continuousColorGradientN(
                gradientColors = listOf(
                    Color.RED, Color.hex("#fa89c7"),
                    Color.rgb(200, 89, 12), Color.LIGHT_GREEN
                )
            )
        }
    }
}

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        size = 5.0
        color(drv) {
            scale = categoricalColorBrewer(BrewerPalette.Qualitative.Set1)
        }
    }
}

Tooltips

Tooltips in Kandy allow for an interactive exploration of data by displaying additional information about visual objects. These tooltips are established within each layer's context using the tooltips() method.

Tooltips are enabled by default; Kandy lets you customize them or turn them off entirely through the arguments of tooltips(). The first example below disables them with a flag:

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        color(drv)
        size = 3.5

        tooltips(enable = false)
    }
}

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        color(drv)
        size = 3.5

        tooltips(drv, cyl, displ, formats = mapOf(displ to ".1g"))
    }
}

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        color(drv)
        size = 3.5

        tooltips(
            cyl, displ,
            title = "Car info",
            anchor = Anchor.TOP_RIGHT,
            minWidth = 80.0,
        )
    }
}

mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        color(drv)
        size = 3.5

        tooltips(
            // use column values in the title
            title = "${value(manufacturer)} ${value(model)}",
        ) {
            // standard line with column name and value
            line(displ)
            // solid line
            line(trans.tooltipValue())
            // two-sided line
            line("cty/hwy [mpg]", "${cty.tooltipValue()}/${hwy.tooltipValue()}")
        }
    }
}

Export

Kandy can export plots to files in several formats, including .jpg/.jpeg, .png, .html, and .svg, using the save() extension method.

val plotForExport = mpgDF.plot {
    points {
        x(cty)
        y(hwy)
        color(drv)
        size = 3.5
    }
    layout {
        title = "Plot for export"
        size = 600 to 400
    }
}
plotForExport

Once your plot is ready, you can export it as an image file. The save() method allows you to specify the file format, scale, and dpi, and the path where the image will be saved:

val pathPNG = plotForExport.save("plot.png", scale = 2.5, dpi = 9000, path = "./saved_plots")

javax.imageio.ImageIO.read(java.io.File(pathPNG))

In addition to saving as a file, you can also export the plot as a BufferedImage. This is particularly useful for further manipulation or display within a Kotlin application:

plotForExport.toBufferedImage(scale = 2.5, dpi = 9000)
