DataFrame 1.0 Help

pivot

Splits the rows of a DataFrame and groups them horizontally into new columns based on values from one or several columns of the original DataFrame.

pivot (inward = true) { pivotColumns } [ .groupBy { indexColumns } | .groupByOther() ] [ .default(defaultValue) ] reducer | aggregator reducer = .minBy { column } | .maxBy { column } | .first [ { rowCondition } ] | .last [ { rowCondition } ] .medianBy { column } | .percentileBy(percentile) { column } |.with { rowExpression } | .values { valueColumns } aggregator = .count() | .matches() | .frames() | .with { rowExpression } | .values { valueColumns } | .aggregate { aggregations } | .<stat> [ { columns } ]

See column selectors for how to select the columns for this operation, pivot + groupBy, pivot reducing, pivot aggregation.

Parameters:

  • inward — if true generated columns are nested inside the original column, otherwise they will be top-level

  • pivotColumns — columns with values for horizontal data grouping and generation of new columns

  • indexColumns — columns with values for vertical data grouping

  • defaultValue — value to fill mismatched pivot-index column pairs

  • valueColumns — columns with output values

df
df.pivot { isHappy }
df.pivot("isHappy")
df.pivot(inward = true) { isHappy }
df.pivot("isHappy", inward = true)

To pivot several columns at once, you can combine them using and or then infix function:

  • and will pivot columns independently

  • then will create column hierarchy from combinations of values from pivoted columns

df.pivot { isHappy and name.firstName }
df.pivot { "isHappy" and "name"["firstName"] }
df.pivot { isHappy then name.firstName }
df.pivot { "isHappy" then "name"["firstName"] }

pivot + groupBy

To create a matrix table that is expanded both horizontally and vertically, apply groupBy transformation passing the columns for vertical grouping. Reversed order of pivot and groupBy will produce the same result.

df.pivot { isHappy }.groupBy { name.firstName } // same as df.groupBy { name.firstName }.pivot { isHappy }
df.pivot("isHappy").groupBy { "name"["firstName"] } // same as df.groupBy { "name"["firstName"] }.pivot("isHappy")

To group by all columns except pivoted, use groupByOther:

df.pivot { isHappy }.groupByOther()
df.pivot("isHappy").groupByOther()

PivotGroupBy supports the same reducing and aggregation operations as Pivot. In addition, PivotGroupBy supports the default and matches functions.

df.groupBy { name.firstName }.pivot { isHappy }.max { age }
df.groupBy { "name"["firstName"] }.pivot("isHappy").max("age")
df.pivot { isHappy }.groupBy { name.firstName }.frames()
df.pivot("isHappy").groupBy { "name"["firstName"] }.frames()

Reducing

A Pivot can be reduced into a DataRow. It means that each Pivot group is collapsed into a single row, and then these rows are concatenated into a single DataRow with Pivot keys as top-level columns or as column groups.

Reducing is a specific case of aggregation.

Step 1: use a reducing method

Use the following functions to collapse each group in a Pivot into a single row:

  • first/last — take the first or last row (optionally, the first or last one that satisfies a predicate) of each group;

  • minBy/maxBy — take the row with the minimum or maximum value of the given RowExpression evaluated on rows within each group;

  • medianBy/percentileBy — take the row with the median or a specific percentile value of the given RowExpression evaluated on rows within each group.

These functions return an instance of ReducedPivot, which is a class serving as a transitional step between performing a reduction on Pivot groups and specifying how the resulting reduced rows should be represented in a resulting DataRow.

Examples of reducing

df.pivot
df.pivot { isHappy }
df.pivot("isHappy")
first
df.pivot { isHappy }.first()
df.pivot("isHappy").first()

The first function can be used with a predicate:

df.pivot { isHappy }.first { age == 30 }
df.pivot("isHappy").first { "age"<Int>() == 30 }
last
df.pivot { isHappy }.last()
df.pivot("isHappy").last()

The last function can be used with a predicate:

df.pivot { isHappy }.last { age > 20 }
df.pivot("isHappy").last { "age"<Int>() > 20 }
minBy
df.pivot { isHappy }.minBy { weight }
df.pivot("isHappy").minBy { "weight"<Int>() }
maxBy
df.pivot { isHappy }.maxBy { age }
df.pivot("isHappy").maxBy { "age"<Int>() }
medianBy
df.pivot { isHappy }.medianBy { weight }
df.pivot("isHappy").medianBy { "weight"<Int>() }
percentileBy
df.pivot { isHappy }.percentileBy(25.0) { weight }
df.pivot("isHappy").percentileBy(25.0) { "weight"<Int>() }

Step 2: transform the resulting reduced rows into a new DataRow

ReducedPivot can be transformed into a new DataRow (either using the original reduced rows or their transformed versions).

To perform this transformation, use one of the following functions:

  • values — creates a new DataRow containing the values from the reduced rows in the selected columns;

  • with — computes a new value for each reduced row using a RowExpression and produces a DataRow containing these computed values.

Each of these functions returns a new DataRow with Pivot keys as top-level columns (or as column groups) and values composed of the reduced results from each group.

Examples of transforming

values
df.pivot { isHappy }.first().values()
df.pivot("isHappy").first().values()
with
df.pivot { isHappy } .maxBy { age } .with { name.firstName + " " + name.lastName }
df.pivot("isHappy") .maxBy("age") .with { "name"["firstName"]<String>() + " " + "name"["lastName"]<String>() }

Aggregation

Each Pivot group can be aggregated — that is, transformed into a new value, DataRow, or DataFrame — and then concatenated into a single DataRow composed of these aggregated results, with pivot keys as top-level columns or as column groups.

The following aggregation methods are available:

  • frames — returns this Pivot as a DataRow with pivot keys as columns (or column groups) and corresponding groups stored as FrameColumns;

  • values — collects values from all rows of each group for the selected columns into a single List (values from column groups are collected into a FrameColumn);

  • count — creates a DataRow with the Pivot key columns containing the number of rows in each corresponding group;

  • with — creates a DataRow containing values computed using a RowExpression across all rows of each group and collected into a single List for every group;

  • aggregate — performs a set of custom aggregations using AggregateDsl, allowing computation of one or more derived values per group;

  • various aggregation statistics.

Each of these methods returns a new DataRow with Pivot keys as top-level columns (or as column groups) and values representing the aggregated results of each group.

Examples of aggregation

frames

df.pivot { isHappy }.frames()
df.pivot("isHappy").frames()

values on Pivot

df.pivot { isHappy }.values { name and age }
df.pivot("isHappy").values("name", "age")
df.pivot { isHappy }.values()
df.pivot("isHappy").values()

count

df.pivot { isHappy }.count()
df.pivot("isHappy").count()

with on Pivot

df.pivot { isHappy }.with { name.lastName }
df.pivot("isHappy").with { "name"["lastName"]<String>() }

aggregate

df.pivot { isHappy }.aggregate { count() into "total" count { age >= 18 } into "adults" median { age } into "median age" min { age } into "min age" maxBy { age }.name into "oldest" }
df.pivot("isHappy").aggregate { count() into "total" count { "age"<Int>() > 18 } into "adults" median("age") into "median age" min("age") into "min age" maxBy("age")["name"] into "oldest" }

If only one aggregation function is used, the column name can be omitted:

df.pivot { isHappy }.aggregate { minBy { age }.name }
df.pivot("isHappy").aggregate { minBy("age")["name"] }

Aggregation statistics

Aggregation statistics are predefined shortcuts for common statistical aggregations such as sum, mean, median, and others. These aggregation operations are applied to each group within a Pivot.

Each function computes a statistic across the rows of a pivot group and returns the result as a new value in the resulting DataRow (for Pivot) or as a new column in the resulting DataFrame (for PivotGroupBy).

The following aggregation statistics are available:

To compute one or several statistics per every pivot group, use the aggregate function.

The functions max, maxOf, and maxFor differ as follows. They all calculate the maximum of values, but:

  • max computes it on the selected columns. If more than one column is selected, for each group it computes one maximum value among all selected columns.

  • maxOf computes it by a row expression: the expression is calculated for each row of the group and the maximum value is returned.

  • maxFor computes it for each of the selected columns within each group. If more than one column is selected, for each group it computes the maximum value for each selected column separately.

Similar logic applies to other statistics.

Direct aggregations

Most common aggregation functions can be computed directly on a Pivot or PivotGroupBy.

When applied to a Pivot (without groupBy), the result is a DataRow with pivot keys as columns and the computed statistic as values.

When applied to a PivotGroupBy, the result is a DataFrame where rows correspond to the grouping keys and columns correspond to the pivot keys, with computed statistics as cell values.

Examples of direct aggregations
max
df.pivot { isHappy }.max { age }
df.pivot("isHappy").max("age")
df.pivot { isHappy }.max { age and weight }
df.pivot("isHappy").max("age", "weight")
df.pivot { isHappy }.maxFor { age and weight }
df.pivot("isHappy").maxFor("age", "weight")
df.pivot { isHappy }.maxOf { if (age < 30) weight else null }
df.pivot("isHappy").maxOf { if ("age"<Int>() < 30) "weight"<Int>() else null }
min
df.pivot { isHappy }.min { age }
df.pivot("isHappy").min("age")
sum
df.pivot { isHappy }.sum { weight }
df.pivot("isHappy").sum("weight")
mean
df.pivot { isHappy }.mean()
df.pivot("isHappy").mean()
df.pivot { isHappy }.mean { age }
df.pivot("isHappy").mean { "age"<Int>() }
std
df.pivot { isHappy }.std { age }
df.pivot("isHappy").std { "age"<Int>() }
median
df.pivot { isHappy }.median { age }
df.pivot("isHappy").median("age")
percentile
df.pivot { isHappy }.percentile(25.0) { age }
df.pivot("isHappy").percentile(25.0) { "age"<Int>() }

The separate flag

By default, when an aggregation function produces several values for a single data group, the column hierarchy in the resulting DataFrame is indexed first by pivot keys and then by the names of aggregated values. To reverse this order so that resulting columns are indexed first by names of aggregated values and then by pivot keys, use the separate=true flag that is available in multi-result aggregation operations, such as aggregate or <stat>For:

df.pivot { isHappy }.maxFor(separate = true) { age and weight }
df.pivot("isHappy").maxFor("age", "weight", separate = true)
df.pivot { isHappy }.aggregate(separate = true) { min { age } into "min age" maxOrNull { weight } into "max weight" }
df.pivot("isHappy").aggregate(separate = true) { min("age") into "min age" maxOrNull("weight") into "max weight" }

Default values for aggregation

By default, any aggregation function will result in a null value for those matrix cells where intersection of column and row keys produced an empty data group. You can specify a default value for any aggregation by using the default infix function. This value will replace all null results of the aggregation function over non-empty data groups as well. To use one default value for all aggregation functions, use default() before aggregation.

df.pivot { isHappy }.groupBy { city }.aggregate { min { age } default 0 }
df.pivot("isHappy").groupBy("city").aggregate { min("age") default 0 }
df.pivot { isHappy }.groupBy { city }.default(0).min()
df.pivot("isHappy").groupBy("city").default(0).min()
df.pivot { isHappy }.groupBy { city }.aggregate { count() into "people" default 0 any { age < 18 } into "hasMinors" default false }
df.pivot("isHappy").groupBy("city").aggregate { count() into "people" default 0 any { "age"<Int>() < 18 } into "hasMinors" default false }

Pivot inside aggregate

pivot can be used inside the aggregate function of groupBy. This allows combining column pivoting with other groupBy aggregations:

df.groupBy { name.firstName }.aggregate { pivot { isHappy }.aggregate(separate = true) { mean { age } into "mean age" count() into "count" } count() into "total" }
df.groupBy { "name"["firstName"] }.aggregate { pivot("isHappy").aggregate(separate = true) { mean("age") into "mean age" count() into "count" } count() into "total" }

pivotCounts

Pivots one or several columns with count aggregation, preserving all other columns of a DataFrame or a GroupBy.

df.pivotCounts { isHappy } // same as df.pivot { isHappy }.groupByOther().count()
df.pivotCounts("isHappy") // same as df.pivot("isHappy").groupByOther().count()
df.groupBy { name }.pivotCounts { city } // same as df.groupBy { name }.pivot { city }.count() // same as df.groupBy { name }.aggregate { pivotCounts { city } }
df.groupBy("name").pivotCounts("city") // same as df.groupBy("name").pivot("city").count() // same as df.groupBy("name").aggregate { pivotCounts("city") }

pivotMatches

Pivots one or several columns with Boolean aggregation, preserving all other columns of DataFrame. Each cell in the resulting pivoted columns contains true if that value was present in the original data for the corresponding row, and false otherwise.

df.pivotMatches { city } // same as df.pivot { city }.groupByOther().matches()
df.pivotMatches("city") // same as df.pivot("city").groupByOther().matches()
df.groupBy { name }.pivotMatches { city } // same as df.groupBy { name }.pivot { city }.matches() // same as df.groupBy { name }.aggregate { pivotMatches { city } }
df.groupBy("name").pivotMatches("city") // same as df.groupBy("name").pivot("city").matches() // same as df.groupBy("name").aggregate { pivotMatches("city") }
27 May 2026