Quick Start Guide
Basics
Usage
To use Kandy and Kotlin DataFrame in an interactive notebook, use the following commands:
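The commands themselves are missing from this copy of the guide. Kotlin notebooks usually load libraries with `%use` line magics, so a cell along the following lines is a reasonable assumption. It is a notebook cell, not a regular Kotlin source file.

```kotlin
%use dataframe
%use kandy
```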
To include Kandy in your Gradle project, add the following to your dependencies:
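A minimal sketch of the dependency block for a `build.gradle.kts` script, assuming the `org.jetbrains.kotlinx:kandy-lets-plot` artifact. The version is left as a placeholder on purpose; check the Kandy releases for the current one.

```kotlin
// build.gradle.kts
dependencies {
    // Replace <kandy-version> with the release you want to use.
    implementation("org.jetbrains.kotlinx:kandy-lets-plot:<kandy-version>")
}
```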
Additionally, to use static plots, include the following in your build script:
Data
In Kandy, the primary data model for plotting is "named data", or a "dataframe": a set of named columns whose value lists all have the same length. Input data should be structured as Map<String, List<*>>.
Example of a simple dataset in Kandy:
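The original snippet is not reproduced in this copy, so here is a minimal sketch of such a dataset. The column names and values are illustrative assumptions, and `hasEqualColumnLengths` is a hypothetical helper (not a Kandy function) that checks the equal-length requirement described above.

```kotlin
// Illustrative data only: the column names and values are assumptions.
val simpleDataset: Map<String, List<*>> = mapOf(
    "time, ms" to listOf(12, 87, 130, 149, 200, 221, 250),
    "relativeHumidity" to listOf(0.45, 0.3, 0.21, 0.15, 0.22, 0.36, 0.8),
    "flowOn" to listOf(true, true, false, false, true, false, false),
)

// Hypothetical helper: true when every column holds the same number of values.
fun hasEqualColumnLengths(data: Map<String, List<*>>): Boolean =
    data.values.map { it.size }.distinct().size <= 1

fun main() {
    println(hasEqualColumnLengths(simpleDataset)) // prints "true"
}
```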
To reference dataset columns in your plots, create pointers for each column. There are several ways to do this:
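As a sketch, two of these ways can be written with the `column` function from the Kotlin DataFrame API. The column names are illustrative assumptions, and this accessor API may be deprecated or unavailable in some DataFrame versions, where plain string names are the alternative.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*

// Explicit form: the column type and name are both given.
val timeMs = column<Int>("time, ms")
val humidity = column<Double>("relativeHumidity")

// Delegate form: the column name is taken from the property name ("flowOn").
val flowOn by column<Boolean>()

fun main() {
    println(timeMs.name())
    println(flowOn.name())
}
```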
Plot Creation
In Kandy, to create a plot, you start by calling the plot() function and passing your dataset as an argument. This function establishes a context within which you can add various layers to your plot.
A layer is essentially a collection of mappings from your data to the visual attributes of the plot, known as aesthetic attributes. These attributes define how your data is represented visually, such as through points, lines, or bars.
Here's an example demonstrating the creation of a plot with a single layer:
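The original example is not included in this copy. The sketch below shows what a single-layer plot might look like, assuming the `points` layer and string-based column references. The dataset is illustrative, and the wildcard imports are used because exact import paths can differ between Kandy versions.

```kotlin
import org.jetbrains.kotlinx.kandy.dsl.*
import org.jetbrains.kotlinx.kandy.letsplot.*
import org.jetbrains.kotlinx.kandy.letsplot.layers.*

// Illustrative data: names and values are assumptions.
val simpleDataset = mapOf(
    "time, ms" to listOf(12, 87, 130, 149, 200, 221, 250),
    "relativeHumidity" to listOf(0.45, 0.3, 0.21, 0.15, 0.22, 0.36, 0.8),
    "flowOn" to listOf(true, true, false, false, true, false, false),
)

val humidityPlot = plot(simpleDataset) {
    // One layer: each row becomes a point.
    points {
        x("time, ms")           // mapping: column -> x position
        y("relativeHumidity")   // mapping: column -> y position
        size = 4.5              // setting: constant point size
        color("flowOn")         // mapping: column -> point color
    }
}
```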
Layers, aesthetics, mappings, and scales
Each plot layer is defined by its geometrical entity (or geom), which determines the layer's visual representation. Geoms have associated aesthetic attributes (or aesthetics/aes), which can be either positional (like x, y, yMin, yMax, middle) or non-positional (such as color, size, width). Non-positional aesthetics have specific types (e.g., size is associated with Double, color with Color).
Aesthetic values can be assigned in two ways: setting and mapping.
Setting involves assigning a constant value directly:
Mapping links data column values to aesthetic attributes. This can be defined in various ways:
Scales determine how data values are translated into visual representations. They can be categorical (discrete) or continuous, based on their domain and range types. Continuous scales use limits for their domain and range, while categorical scales use lists of categories and corresponding values. Scales are typed and may include a transform parameter to define a transformation function (linear by default).
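To make the distinction concrete, here is a conceptual sketch in plain Kotlin. The two classes are hypothetical and are not Kandy's scale types; they only illustrate that a continuous scale maps domain limits onto range limits (linearly here), while a categorical scale pairs each category with a value.

```kotlin
// Conceptual model only; not part of the Kandy API.
class ContinuousScale(
    private val domainMin: Double, private val domainMax: Double,
    private val rangeMin: Double, private val rangeMax: Double,
) {
    // Linear transform: position within the domain is carried over to the range.
    fun map(value: Double): Double {
        val t = (value - domainMin) / (domainMax - domainMin)
        return rangeMin + t * (rangeMax - rangeMin)
    }
}

class CategoricalScale<D, R>(categories: List<D>, values: List<R>) {
    // Each category is paired with the value at the same index.
    private val lookup: Map<D, R> = categories.zip(values).toMap()
    fun map(category: D): R? = lookup[category]
}

fun main() {
    val size = ContinuousScale(0.0, 100.0, 1.0, 5.0)
    println(size.map(50.0)) // prints "3.0"

    val color = CategoricalScale(listOf(true, false), listOf("red", "blue"))
    println(color.map(false)) // prints "blue"
}
```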
Explicitly specifying scales after mapping:
Scales can also be created separately and applied later:
Applying pre-defined scales:
Scale parameters: axis and legend
Guides play a crucial role in interpreting charts. They function as mini-charts for scales, with positional scales using axes as guides and non-positional ones utilizing legends. Each scale comes with a default guide, but you can tailor it to your needs.
Here's how to customize the axis and legend in a plot:
Global X-axis and Y-axis mappings and scales
In Kandy, you can use global mappings for X and Y across multiple layers, simplifying the process when these mappings are common. Each layer inherits the global mapping unless overridden locally.
Example with global X and Y mappings:
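The original example is missing here. The following sketch assumes that `x(...)` and `y(...)` can be called directly in the plot context with column pointers, and that the `line` and `points` layers then inherit those mappings. Data and names are illustrative.

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.kandy.dsl.*
import org.jetbrains.kotlinx.kandy.letsplot.*
import org.jetbrains.kotlinx.kandy.letsplot.layers.*

// Illustrative data and column pointers (assumptions).
val readings = mapOf(
    "time" to listOf(1, 2, 3, 4, 5),
    "value" to listOf(0.4, 0.3, 0.2, 0.25, 0.5),
)
val time = column<Int>("time")
val value = column<Double>("value")

val combinedPlot = plot(readings) {
    // Global mappings, shared by every layer below.
    x(time)
    y(value)

    line { }                 // uses the global x and y as-is
    points { size = 4.0 }    // also inherits x and y, adds a constant size
}
```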
Configuring axis scales without explicit mapping:
Free scale mechanism in Kandy:
This feature, known as "free scale", allows you to set scale parameters without a direct data mapping. It's particularly useful for sub-positional aesthetics that don't have individual scales but depend on the scale of their parent positional aesthetic.
Example of using free scale in a boxplot:
Raw source mapping
Kandy allows for direct data source mapping in plots, providing an alternative to using dataset column pointers. This method supports mapping from an Iterable source and offers the option to name these sources for clearer visualization.
Example dataset:
Using raw source mapping:
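Neither the dataset nor the plot survives in this copy, so the sketch below uses illustrative lists. It assumes that `plot { }` can be called without a dataset and that a mapping accepts an `Iterable` plus an optional display name as a second argument.

```kotlin
import org.jetbrains.kotlinx.kandy.dsl.*
import org.jetbrains.kotlinx.kandy.letsplot.*
import org.jetbrains.kotlinx.kandy.letsplot.layers.*

// Illustrative raw sources: plain lists rather than dataset columns.
val month = listOf("January", "February", "March", "April")
val numberOfDays = listOf(31, 28, 31, 30)

val daysPlot = plot {
    bars {
        x(month)                           // unnamed source
        y(numberOfDays, "Number of days")  // named source (name used for the axis)
    }
}
```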
Kotlin DataFrame API
Kandy integrates with the Kotlin DataFrame library, allowing the use of DataFrame as a data source and DataColumn for mappings. This integration simplifies the process of creating visualizations, as you can utilize auto-generated property columns without manually creating column pointers.
untitled | manufacturer | model | displ | year | cyl | trans | drv | cty | hwy | fl | class |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | audi | a4 | 1.8 | 1999 | 4 | auto(l5) | f | 18 | 29 | p | compact |
2 | audi | a4 | 1.8 | 1999 | 4 | manual(m5) | f | 21 | 29 | p | compact |
3 | audi | a4 | 2.0 | 2008 | 4 | manual(m6) | f | 20 | 31 | p | compact |
4 | audi | a4 | 2.0 | 2008 | 4 | auto(av) | f | 21 | 30 | p | compact |
5 | audi | a4 | 2.8 | 1999 | 6 | auto(l5) | f | 16 | 26 | p | compact |
Example of using DataFrame with Kandy:
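The original example is not included in this copy. As a sketch, the snippet below builds a small DataFrame from the cty and hwy values of the rows shown above and plots it. It uses string column names, since auto-generated property columns are only available where DataFrame code generation is active (for example, in notebooks).

```kotlin
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.kandy.dsl.*
import org.jetbrains.kotlinx.kandy.letsplot.*
import org.jetbrains.kotlinx.kandy.letsplot.layers.*

// cty and hwy values taken from the five sample rows in the table above.
val cars = mapOf(
    "cty" to listOf(18, 21, 20, 21, 16),
    "hwy" to listOf(29, 29, 31, 30, 26),
).toDataFrame()

// The DataFrame itself is the data source for the plot.
val mileagePlot = cars.plot {
    points {
        x("cty")
        y("hwy")
    }
}
```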
In scenarios where auto-generated property-columns are not available, you can manually create pointers to DataFrame columns:
This integration with Kotlin DataFrame enriches Kandy's plotting capabilities, providing a more streamlined and efficient approach to data visualization in Kotlin projects.
Grouping
The grouping feature in Kandy is particularly useful for plotting collective geometries, where multiple data units are represented by a single geometric object, such as a line. Grouping can be performed either by providing a grouped dataframe (GroupBy) as a dataset or directly within the plot DSL.
Here's how you can use grouping in Kandy:
Grouping with a Map as Dataset: You can define your dataset as a map and then use groupBy within the plotting DSL to create groups based on a specific column.
Grouping with DataFrame API: If you are using the Kotlin DataFrame API, you can apply grouping on a dataframe.
Advanced Grouping with Color Mapping: For more complex visualizations, you can use grouping along with additional aesthetic mappings like color.
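The examples for these three cases are missing from this copy. The sketch below covers the first and third case together. It assumes that the plot DSL offers a `groupBy(columnName) { ... }` block in which layers are declared; the dataset and its column names are illustrative.

```kotlin
import org.jetbrains.kotlinx.kandy.dsl.*
import org.jetbrains.kotlinx.kandy.letsplot.*
import org.jetbrains.kotlinx.kandy.letsplot.layers.*

// Illustrative data: two series ("a" and "b") stored in long format.
val shots = mapOf(
    "x" to listOf(1, 2, 3, 1, 2, 3),
    "y" to listOf(2.0, 3.5, 3.0, 1.0, 1.5, 2.5),
    "series" to listOf("a", "a", "a", "b", "b", "b"),
)

val groupedPlot = plot(shots) {
    // Each distinct value of "series" becomes its own line.
    groupBy("series") {
        line {
            x("x")
            y("y")
            color("series") // additional mapping: one color per group
        }
    }
}
```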
Implicit grouping
Kandy provides a convenient way to perform implicit grouping by utilizing categorical scales. This approach is especially useful when you want to differentiate data points based on a specific category without explicitly defining groups.
Plotting with Implicit Grouping: By specifying a categorical scale for an aesthetic attribute like color, Kandy automatically groups the data based on the categories presented in the data column.
In this example, the color aesthetic is mapped to the c-type column, which is a categorical variable. Kandy implicitly groups the data based on the unique values in the c-type column and assigns different colors to each group. This results in a visually distinct representation for each category, making it easier to differentiate between them in the plot.
Position
The position parameter adjusts the spatial arrangement of objects within a single layer, especially when dealing with grouped data. It controls how objects from different groups are positioned relative to each other.
Here's a breakdown of how position can be applied:
cyl | drv | cty |
---|---|---|
4 | f | 22.068966 |
6 | f | 17.186047 |
4 | 4 | 18.347826 |
6 | 4 | 14.781250 |
8 | 4 | 12.104167 |
Dodge Position: Separates bars side by side, making it easy to compare groups.
Identity Position: Overlays bars directly on top of each other, useful for highlighting overlaps.
Stack Position: Stacks bars on top of each other, ideal for cumulative comparisons.
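The three adjustments can be illustrated with a conceptual sketch in plain Kotlin. `BarPosition`, `Bar`, and `layoutBars` are hypothetical names, not Kandy API; the function only shows where the bars of several groups sharing one x category would end up under each adjustment.

```kotlin
// Conceptual model only; not part of the Kandy API.
enum class BarPosition { DODGE, IDENTITY, STACK }

// xOffset is measured from the category center; yMin..yMax is the bar's vertical extent.
data class Bar(val xOffset: Double, val yMin: Double, val yMax: Double)

fun layoutBars(values: List<Double>, position: BarPosition, width: Double = 1.0): List<Bar> =
    when (position) {
        // All bars start at zero and share the same x: they overlap.
        BarPosition.IDENTITY -> values.map { Bar(0.0, 0.0, it) }
        // The available width is split into equal slots, one per group.
        BarPosition.DODGE -> {
            val slot = width / values.size
            values.mapIndexed { i, v -> Bar(-width / 2 + slot * (i + 0.5), 0.0, v) }
        }
        // Each bar starts where the previous one ended.
        BarPosition.STACK -> {
            var base = 0.0
            values.map { v -> Bar(0.0, base, base + v).also { base += v } }
        }
    }

fun main() {
    println(layoutBars(listOf(2.0, 3.0), BarPosition.STACK))
    // prints "[Bar(xOffset=0.0, yMin=0.0, yMax=2.0), Bar(xOffset=0.0, yMin=2.0, yMax=5.0)]"
}
```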
Experimental
This section covers experimental APIs, which are subject to change in future versions. They offer additional features for advanced data visualization, and feedback on them is welcome.
Multiplot
There are three ways to create a multiplot: plotGrid, plotBunch, and faceting.
Plot Grid
The plotGrid function allows for arranging multiple plots in a structured grid format, offering a cohesive view of different data visualizations. Example:
Plot Bunch
plotBunch provides a more flexible approach to multiplot creation, enabling custom placement and sizing for each plot. This is ideal for tailored data presentations. Example:
Faceting
Faceting in Kandy is a powerful feature that splits a single plot into multiple plots based on dataset categories. This method is akin to groupBy in data manipulation, providing detailed insights into subcategories. Faceting methods include:
facetGridX and facetGridY for single-parameter faceting along the X or Y axes, respectively:
facetGrid for two-parameter faceting along both X and Y axes:
facetWrap for multi-parameter faceting with additional layout control:
Statistics
Kandy offers an API for computing statistics directly within its DSL. This functionality is achieved through the stat family of functions, which transform raw data into a new dataset with calculated statistics. These statistics become accessible as columns and can be mapped to various plot aesthetics, scaled, or incorporated into tooltips.
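To make "a new dataset with calculated statistics" concrete, here is a conceptual sketch of a bin statistic in plain Kotlin. `Bin` and `binCounts` are hypothetical names and this is not Kandy's implementation; it only shows how raw observations turn into rows with a bin center and a count, which could then be mapped to x and y.

```kotlin
// Conceptual model only; not part of the Kandy API.
// One output row of the statistic: the bin center and the number of observations in it.
data class Bin(val x: Double, val count: Int)

// Splits [min, max] into `bins` equal-width intervals and counts the observations in each.
fun binCounts(observations: List<Double>, bins: Int): List<Bin> {
    require(bins > 0 && observations.isNotEmpty())
    val min = observations.minOf { it }
    val max = observations.maxOf { it }
    val width = (max - min) / bins
    val counts = IntArray(bins)
    for (value in observations) {
        // The maximum value is placed in the last bin instead of one past the end.
        val index = if (width == 0.0) 0 else ((value - min) / width).toInt().coerceAtMost(bins - 1)
        counts[index]++
    }
    return counts.mapIndexed { i, c -> Bin(min + width * (i + 0.5), c) }
}

fun main() {
    println(binCounts(listOf(0.0, 1.0, 2.0, 3.0, 4.0), bins = 2))
    // prints "[Bin(x=1.0, count=2), Bin(x=3.0, count=3)]"
}
```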
Here's an example using a dataset of random observations:
You can create statistical visualizations such as a bin plot with bars and lines:
Kandy also simplifies the creation of standard statistical charts, like histograms, by combining statistical calculations and layer creation into a single function:
You can compare it to a bar chart with the calculation of bins stat:
The histogram in Kandy is flexible, allowing for custom aesthetic bindings and the use of "stat-bin" statistics for mapping:
This statistical API is also compatible with Iterable data types, expanding its applicability across various data structures:
Layout
Title, subtitle, caption, and size
You can personalize your plot's layout by configuring the title, subtitle, caption, and overall size. The layout block provides direct access to these properties.
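A minimal sketch of such a layout block, assuming the properties are named `title`, `subtitle`, `caption`, and `size` (with `size` given as a width-to-height pair). The data and texts are illustrative.

```kotlin
import org.jetbrains.kotlinx.kandy.dsl.*
import org.jetbrains.kotlinx.kandy.letsplot.*
import org.jetbrains.kotlinx.kandy.letsplot.feature.*
import org.jetbrains.kotlinx.kandy.letsplot.layers.*

// Illustrative data (assumption).
val temperatures = mapOf(
    "day" to listOf(1, 2, 3, 4, 5),
    "temp" to listOf(12.5, 14.0, 13.2, 15.8, 16.1),
)

val titledPlot = plot(temperatures) {
    line {
        x("day")
        y("temp")
    }
    layout {
        title = "Daily temperature"
        subtitle = "First five days"
        caption = "Illustrative data"
        size = 800 to 600 // width to height
    }
}
```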
Styles
Styles in Kandy offer extensive customization options for your plot's appearance, including styles for lines, text, backgrounds, and more. You can use a pre-built style or create your own custom style.
To apply a style:
For example, creating a plot with different styles:
Custom Styles
Kandy's DSL allows you to craft custom styles. You can set parameters for lines, text, backgrounds, etc., either separately or in-place.
Creating a simple custom style:
Example of a style with blank axes:
Custom scales
Kandy extends the versatility of scales beyond standard options, providing custom color scales like hue, grey, brewer, gradient2, and gradientN.
Creating plots with different color scales:
Tooltips
Tooltips in Kandy allow for an interactive exploration of data by displaying additional information about visual objects. These tooltips are established within each layer's context using the tooltips() method.
While tooltips are enabled by default, Kandy allows for their customization or complete deactivation. This flexibility is achieved by adjusting the hide flag within the tooltips' settings:
Kandy offers extensive customization for tooltips, enabling users to not only display data from specific columns but also format these data points for better clarity and interpretation:
Kandy allows for further customization of tooltips, including title adjustment, minimum width settings, and fixed positioning:
For more detailed tooltip configurations, Kandy offers a special DSL. This DSL enables manual adjustments of tooltip lines, allowing users to add specific data values and customize the layout:
Export
Kandy can export your plots as images in various formats, including .jpg/.jpeg, .png, .html, and .svg. This feature is provided by the save() extension method.
First, create a plot that you wish to export. Here's an example plot with some basic configurations:
Once your plot is ready, you can export it as an image file. The save() method allows you to specify the file format, scale, and dpi, and the path where the image will be saved:
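The original call is missing from this copy. The sketch below assumes that `save` takes the file name first and accepts `scale` and `dpi` as named arguments, with the format inferred from the file extension. The plot, file name, and values are illustrative, and raster export may require the static-plot dependency mentioned in the Usage section.

```kotlin
import org.jetbrains.kotlinx.kandy.dsl.*
import org.jetbrains.kotlinx.kandy.letsplot.*
import org.jetbrains.kotlinx.kandy.letsplot.export.*
import org.jetbrains.kotlinx.kandy.letsplot.layers.*

// Illustrative data (assumption).
val exportData = mapOf(
    "x" to listOf(1, 2, 3, 4),
    "y" to listOf(3.0, 1.5, 4.2, 2.8),
)

fun main() {
    val plotForExport = plot(exportData) {
        points {
            x("x")
            y("y")
        }
    }
    // Writes a PNG file; scale and dpi are optional tuning parameters.
    plotForExport.save("exported-plot.png", scale = 2.0, dpi = 300)
}
```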
To view the exported image, you can use the following code snippet:
In addition to saving as a file, you can also export the plot as a BufferedImage. This is particularly useful for further manipulation or display within a Kotlin application: