DataFrame 1.0 Help

join

Joins two DataFrame objects by join columns.

A join creates a new dataframe by combining rows from two input dataframes according to one or more key columns.
Rows are merged when the values in the join columns match.
If there is no match, whether the row is included and how missing values are filled depends on the type of join (e.g., inner, left, right, full).

Returns a new DataFrame that contains the merged rows and columns from both inputs.

join(otherDf, type = JoinType.Inner) [ { joinColumns } ] joinColumns: JoinDsl.(LeftDataFrame) -> Columns interface JoinDsl: LeftDataFrame { val right: RightDataFrame fun DataColumn.match(rightColumn: DataColumn) }

joinColumns is a special case of columns selector that defines column mapping for join.

Related operations: Multiple DataFrames

Examples

Join with explicit keys (with different names)

Use the Join DSL when the key column names differ:

  • access the right DataFrame via right;

  • define the join condition with match.

dfAges
dfCities
// INNER JOIN on differently named keys: // Merge a row when dfAges.firstName == dfCities.name. // With the given data all 3 names match → all rows merge. dfAges.join(dfCities) { firstName match right.name }

Join with explicit keys (with the same names)

If mapped columns have the same name, just select join columns (one or several) from the left DataFrame:

dfLeft
dfRight
// INNER JOIN on "name" only: // Merge when left.name == right.name. // Duplicate keys produce multiple merged rows (one per pairing). dfLeft.join(dfRight) { name }

Join with implicit keys (all columns with the same name)

If joinColumns is not specified, columns with the same name from both DataFrame objects will be used as join columns:

dfLeft
dfRight
// INNER JOIN on all same-named columns ("name" and "city"): // Merge when BOTH name AND city are equal; otherwise the row is dropped. dfLeft.join(dfRight)

Join types

Supported join types:

  • Inner (default) — only matched rows from left and right DataFrame objects

  • Filter — only matched rows from left DataFrame

  • Left — all rows from left DataFrame, mismatches from right DataFrame filled with null

  • Right — all rows from right DataFrame, mismatches from left DataFrame filled with null

  • Full — all rows from left and right DataFrame objects, any mismatches filled with null

  • Exclude — only mismatched rows from left DataFrame

For every join type there is a shortcut operation:

df.innerJoin(otherDf) [ { joinColumns } ] df.filterJoin(otherDf) [ { joinColumns } ] df.leftJoin(otherDf) [ { joinColumns } ] df.rightJoin(otherDf) [ { joinColumns } ] df.fullJoin(otherDf) [ { joinColumns } ] df.excludeJoin(otherDf) [ { joinColumns } ]

Examples

Inner

dfLeft
dfRight
// INNER JOIN: // Combines columns from the left and right dataframes // and keep only rows where (name, city) matches on both sides. dfLeft.innerJoin(dfRight) { name and city }

Filter

dfLeft
dfRight
// FILTER JOIN: // Keep ONLY left rows that have ANY match on (name, city). // No right-side columns are added. dfLeft.filterJoin(dfRight) { name and city }

Left

dfLeft
dfRight
// LEFT JOIN: // Keep ALL left rows and add columns from the right dataframe. // If (name, city) matches, attach right columns values from // the corresponding row in the right dataframe; // if not (e.g. ("Bob", "Dubai") row), fill them with `null`. dfLeft.leftJoin(dfRight) { name and city }
dfLeft
dfRight
// RIGHT JOIN: // Keep ALL right rows and add columns from the left dataframe. // If (name, city) matches, attach left columns values from // the corresponding row in the left dataframe; // if not (e.g. ("Bob", "Tokyo") row), fill them with `null`. dfLeft.rightJoin(dfRight) { name and city }

Full

dfLeft
dfRight
// FULL JOIN: // Keep ALL rows from both sides. Where there's no match on (name, city), // the other side is filled with nulls. dfLeft.fullJoin(dfRight) { name and city }

Exclude

dfLeft
dfRight
// EXCLUDE JOIN: // Keep ONLY left rows that have NO match on (name, city). // Useful to find "unpaired" left rows. dfLeft.excludeJoin(dfRight) { name and city }
14 October 2025