Geo Plotting

Kandy-Geo and DataFrame-Geo Usage

To integrate Kandy-Geo and DataFrame-Geo into an interactive notebook, use the following commands:

Note: Without specifying %useLatestDescriptors, the version bundled with the Kotlin Jupyter kernel will be used.

// Fetches the latest versions
%useLatestDescriptors
// Adds both the kandy-geo and the dataframe-geo libraries with the latest versions
%use kandy-geo

Tip: Kotlin Notebook offers unique features with the dataframe library.

Geometries

Geometries are represented by JTS (Java Topology Suite) classes such as Point, MultiPoint, LineString, MultiLineString, Polygon, and MultiPolygon, all of which inherit from the base class Geometry. GeoDataFrame is a wrapper around a standard DataFrame with a geometry column of type Geometry, enabling convenient handling of geospatial datasets.

Reading GeoDataFrame

Currently, GeoDataFrame supports two of the most popular formats: Shapefile and GeoJSON. These formats can be read using the corresponding functions, GeoDataFrame.readShapefile() and GeoDataFrame.readGeoJson(). Each of these functions returns a GeoDataFrame.

GeoJSON

{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [125.6, 10.1]
  },
  "properties": {
    "name": "Dinagat Islands"
  }
}

val usaStates =
    GeoDataFrame.readGeoJson("https://raw.githubusercontent.com/AndreiKingsley/datasets/refs/heads/main/USA.json")

usaStates.df

This DataFrame is required to have a geometry column of type org.locationtech.jts.geom.Geometry:

usaStates.df.geometry.type()

org.locationtech.jts.geom.Geometry

usaStates.df.geometry.map { it::class }.distinct().toList()

[class org.locationtech.jts.geom.Polygon, class org.locationtech.jts.geom.MultiPolygon]

As expected, these are Polygon and MultiPolygon.

The GeoDataFrame also contains a .crs field for the coordinate reference system (CRS). GeoJSON does not define this field explicitly, so it is read as null. When the CRS is not set explicitly in the GeoDataFrame, WGS84 is assumed by default; it is the standard CRS for working with geospatial data.

usaStates.crs

null

Shapefile

Shapefile is a popular geospatial vector data format developed by ESRI. It stores geometric features such as points, lines, and polygons, along with their attributes, across multiple files. A Shapefile requires at least three parts: .shp (geometry), .shx (shape index), and .dbf (attributes). An optional .prj file typically defines the coordinate reference system.

To load a Shapefile, you need to specify the path to the file with the .shp extension. The other required files must be in the same directory and share the same base name.

val worldCities =
    GeoDataFrame.readShapefile("https://github.com/AndreiKingsley/datasets/raw/refs/heads/main/ne_10m_populated_places_simple/ne_10m_populated_places_simple.shp")

worldCities.df

This GeoDataFrame contains only Point geometries:

worldCities.df.geometry.type()

org.locationtech.jts.geom.Point

worldCities.crs

GEOGCS["GCS_WGS_1984",
  DATUM["D_WGS_1984",
    SPHEROID["WGS_1984", 6378137.0, 298.257223563]],
  PRIMEM["Greenwich", 0.0],
  UNIT["degree", 0.017453292519943295],
  AXIS["Longitude", EAST],
  AXIS["Latitude", NORTH]]
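For a Shapefile stored on the local disk, the rule above (sidecar files in the same directory, sharing the base name) can be checked before loading. The following is a minimal sketch in plain Kotlin; `missingShapefileParts` is a hypothetical helper and not part of DataFrame-Geo, and it only applies to local paths, not to URLs such as the one used above.

```kotlin
import java.io.File

// Hypothetical helper, not part of DataFrame-Geo: returns the names of the
// required sidecar files (.shx, .dbf) that are missing next to a local .shp file.
fun missingShapefileParts(shpPath: String): List<String> {
    val shp = File(shpPath)
    require(shp.extension.equals("shp", ignoreCase = true)) { "Expected a path ending in .shp" }
    // Sidecar files must live in the same directory as the .shp file
    val dir = shp.absoluteFile.parentFile
    return listOf("shx", "dbf")
        .map { ext -> File(dir, "${shp.nameWithoutExtension}.$ext") }
        .filterNot { it.exists() }
        .map { it.name }
}
```

An empty result means both required companions were found; the lookup uses lowercase extensions, so it may report false negatives on case-sensitive file systems.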

Plot

To plot geospatial data, Kandy-Geo introduces geo layers, which, unlike regular layers, accept geometries. These can be provided as DataFrame columns, as an Iterable, or as single instances. If a layer is built in the context of a GeoDataFrame dataset, it is not necessary to specify the geometry explicitly, as the geometry column will be used by default.

geoPolygon

geoPolygon() adds a layer of polygons constructed from Polygon and MultiPolygon geometries.

Let's plot US states from usaStates:

usaStates.plot {
    // `geoPolygon` uses polygons and multipolygons
    // from the `geometry` column of `usaStates` inner DataFrame
    geoPolygon()
}

The customization process for such a layer is no different from a regular one. The function optionally opens a block where you can configure all polygon aesthetic attributes as usual using mappings and settings. For example, you can color each state by mapping the name column to fillColor and customize the borderLine as shown below:

usaStates.plot {
    geoPolygon {
        fillColor(name) { legend.type = LegendType.None } // Hide legend
        borderLine {
            width = 0.1
            color = Color.BLACK
        }
    }
}

Mercator coordinates transformation

usaStates.plot {
    geoPolygon()
    coordinatesTransformation = CoordinatesTransformation.mercator()
}
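For intuition about what this transformation does, the spherical Mercator projection maps longitude λ and latitude φ (in radians) to x = λ and y = ln(tan(π/4 + φ/2)). The sketch below implements that formula in plain Kotlin on a unit sphere. The function `mercator` is hypothetical and independent of Kandy, whose internal units and implementation may differ.

```kotlin
import kotlin.math.PI
import kotlin.math.ln
import kotlin.math.tan

// Hypothetical helper, not part of Kandy: spherical Mercator on a unit sphere.
// Takes (longitude, latitude) in degrees and returns the projected (x, y).
fun mercator(lonDeg: Double, latDeg: Double): Pair<Double, Double> {
    // The projection is undefined at the poles, where y diverges
    require(latDeg > -90.0 && latDeg < 90.0) { "Latitude must be strictly between -90 and 90" }
    val lon = Math.toRadians(lonDeg)
    val lat = Math.toRadians(latDeg)
    val x = lon
    val y = ln(tan(PI / 4 + lat / 2))
    return x to y
}
```

Because y grows without bound toward the poles, high-latitude regions such as Alaska appear stretched compared with the untransformed plot.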

geoMap

geoMap() is essentially geoPolygon(), but it also applies a coordinates transformation based on the provided CoordinateReferenceSystem (GeoDataFrame.crs). Currently, only WGS84 is supported, for which the Mercator projection is applied by default.

// This plot is identical
// to the previous one.
usaStates.plot {
    geoMap()
}

usaStates.plot {
    geoMap()
    x.axis.limits = -127..-65
    y.axis.limits = 23..50
}

geoPoints

geoPoints() adds a layer of points constructed from Point and MultiPoint geometries.

Let's add worldCities points over usaStates polygons:

usaStates.plot {
    // `geoMap` takes polygons from the `geometry`
    // column of `usaStates` inner DataFrame
    geoMap()
    // Add a new dataset using the `worldCities` GeoDataFrame.
    // Layers created within this scope will use it as their base dataset
    // instead of the initial one
    withData(worldCities) {
        // `geoPoints` takes points from the `geometry`
        // column of `worldCities` inner DataFrame
        geoPoints {
            size = 1.5
        }
    }
}

GeoDataFrame modifying

Before plotting, it is often necessary to modify the GeoDataFrame. For example, you might filter points within a specific area, or translate or scale certain geometries. GeoDataFrame allows direct updates to its inner DataFrame using the familiar DataFrame Operations API.

DataFrame operations

The function GeoDataFrame<T>.modify(block: DataFrame<T>.() -> DataFrame<T>): GeoDataFrame<T> opens a new scope where the receiver is the inner DataFrame of this GeoDataFrame. This allows you to perform operations such as filter, take, sort, update, and others directly on it. The function returns a GeoDataFrame with the modified DataFrame resulting from the block, while keeping the CRS unchanged.

Let's filter the points in worldCities, keeping only those located within the US. To do this, we will first combine all polygons from usaStates into a single polygon for convenience:

// import mergePolygons utility
import org.jetbrains.kotlinx.kandy.letsplot.geo.util.mergePolygons

// Experimental function that merges a collection of polygons and
// multipolygons into a single multipolygon
val usaPolygon: MultiPolygon = usaStates.df.geometry.mergePolygons()

plot {
    // `geoPolygon` and `geoMap` can accept a single `Polygon` or `MultiPolygon`
    geoMap(usaPolygon)
}

Now, let's create a GeoDataFrame usaCities containing only the cities located within the United States. To avoid overplotting, we will select the 30 most populous cities. For this, we will modify worldCities:

val usaCities = worldCities.modify {
    // Keep only the points that lie inside `usaPolygon`:
    // `usaPolygon.contains(geometry)` checks whether the `geometry` (a Point)
    // of the current row of `worldCities` is within the polygon
    filter { usaPolygon.contains(geometry) }
        // Sort the remaining rows by population in descending order
        .sortByDesc { pop_min }
        // Keep the 30 most populous cities
        .take(30)
}
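Conceptually, the `contains` call used in the filter is a point-in-polygon test. JTS handles holes, multipolygons, and boundary cases robustly. The sketch below only illustrates the classic ray-casting idea for a single ring without holes; `pointInRing` is a hypothetical function, not the JTS implementation.

```kotlin
// Illustrative only: JTS `contains` is more general and more robust.
// `ring` is a list of (x, y) vertices; the last vertex is implicitly joined to the first.
fun pointInRing(x: Double, y: Double, ring: List<Pair<Double, Double>>): Boolean {
    var inside = false
    var j = ring.lastIndex
    for (i in ring.indices) {
        val (xi, yi) = ring[i]
        val (xj, yj) = ring[j]
        // The edge (j -> i) counts when it straddles the horizontal line through `y`
        // and crosses that line to the right of the point
        val straddles = (yi > y) != (yj > y)
        if (straddles && x < (xj - xi) * (y - yi) / (yj - yi) + xi) {
            inside = !inside // each crossing flips the inside/outside state
        }
        j = i
    }
    return inside
}
```

A point is inside when a ray cast to the right crosses the ring an odd number of times; points exactly on the boundary are not treated consistently by this simple version.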

usaStates.plot {
    geoMap()
    withData(usaCities) {
        geoPoints {
            tooltips(title = value(name)) {
                line("population", value(pop_min))
            }
        }
    }
}

val usa48 = usaStates.modify {
    filter {
        name !in listOf("Alaska", "Hawaii", "Puerto Rico")
    }
}

usa48.plot { geoMap() }

Geometry operations

The DataFrame-Geo library provides Kotlin-style extensions for JTS geometries. For instance, Geometry.translate(x, y) shifts a geometry by a specified vector, while Geometry.scaleAroundCenter(factor) scales a geometry relative to its centroid.

val usaAdjusted = usaStates.modify {
    // Custom extensions for `Geometry` based on JTS API.
    // Scale and move Alaska:
    update { geometry }.where { name == "Alaska" }.with {
        it.scaleAroundCenter(0.5).translate(40.0, -40.0)
    }
        // Move Hawaii and Puerto Rico:
        .update { geometry }.where { name == "Hawaii" }.with { it.translate(65.0, 0.0) }
        .update { geometry }.where { name == "Puerto Rico" }.with { it.translate(-10.0, 5.0) }
}

usaAdjusted.plot { geoMap() }
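To make the effect of these two operations on coordinates concrete, here is a minimal sketch in plain Kotlin that works on raw vertices. The names `Pt`, `translate`, and `scaleAround` are hypothetical stand-ins, not the library extensions. The center is passed explicitly here, whereas the library extension described above uses the geometry's centroid.

```kotlin
// Hypothetical stand-in for a coordinate, for illustration only.
data class Pt(val x: Double, val y: Double)

// Shift every vertex by the vector (dx, dy).
fun translate(points: List<Pt>, dx: Double, dy: Double): List<Pt> =
    points.map { Pt(it.x + dx, it.y + dy) }

// Move every vertex toward (factor < 1) or away from (factor > 1) a fixed center.
fun scaleAround(points: List<Pt>, center: Pt, factor: Double): List<Pt> =
    points.map {
        Pt(
            center.x + (it.x - center.x) * factor,
            center.y + (it.y - center.y) * factor,
        )
    }
```

Scaling around the center keeps the shape in place while shrinking it, which is why Alaska is scaled first and only then translated.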

usa48.plot {
    geoMap()
    withData(usa48.modify {
        update { geometry }.with { it.centroid }
    }) {
        geoPoints()
    }
}

Datasets Join

In geo-plotting, separate datasets are often used—one containing the geometries and others with specific data. To combine them, you can join them using modify. Let's load a DataFrame with the results of the 2024 US presidential election:

val usa2024electionResults =
    DataFrame.readCSV("https://gist.githubusercontent.com/AndreiKingsley/348687222aecc4f0eb39e3d81acd515b/raw/a9914352dbdfb426f9146dda633ee382d936b000/usa_2024_election_states.csv")

usa2024electionResults

And join it to the US states GeoDataFrame:

val usaStatesWithElectionResults = usaAdjusted.modify {
    innerJoin(usa2024electionResults) { name }
}

usaStatesWithElectionResults.df
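An inner join keeps only the rows whose key appears in both datasets, so a state missing from either side would drop out of the result. The sketch below shows that behaviour with plain maps; `innerJoinByKey` is a hypothetical helper for illustration and not the DataFrame API.

```kotlin
// Illustration of inner-join semantics with plain maps; not the DataFrame API.
// Only keys present in both `left` and `right` survive.
fun <K, A, B> innerJoinByKey(left: Map<K, A>, right: Map<K, B>): Map<K, Pair<A, B>> =
    left.mapNotNull { (key, a) ->
        // `right[key]` is null when the key has no match, so the row is skipped
        right[key]?.let { b -> key to (a to b) }
    }.toMap()
```

For example, joining keys {"A", "B"} with keys {"B", "C"} yields a single entry for "B".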

usaStatesWithElectionResults.plot {
    geoMap {
        fillColor(winner) {
            scale = categorical(
                "Republican" to Color.hex("#CC3333"),
                "Democrat" to Color.hex("#3366CC")
            )
        }
        tooltips(name, winner)
    }
    layout {
        title = "USA 2024 Presidential Election Results"
        size = 700 to 500
        style(Style.Void) {
            legend.position = LegendPosition.Top
        }
    }
}

Applying new CRS

// Assumption: `CRS` is the GeoTools utility class
import org.geotools.referencing.CRS

val conusAlbersCrs = CRS.decode("EPSG:5070", true)
val usaAlbers = usa48.applyCrs(conusAlbersCrs)
usaAlbers.crs
// Compare the applied CRS with EPSG:5070, ignoring metadata
println("Matches EPSG:5070: ${CRS.equalsIgnoreMetadata(usaAlbers.crs, conusAlbersCrs)}") // true

PROJCS["NAD83 / Conus Albers",
  GEOGCS["NAD83",
    DATUM["North American Datum 1983",
      SPHEROID["GRS 1980", 6378137.0, 298.257222101, AUTHORITY["EPSG","7019"]],
...

usaAlbers.plot {
    // Polygons will work exactly the same -
    // no special coordinates transformation is applied
    // for GeoDF with unsupported CRS
    geoMap()
}

geoPath

geoPath() adds a path layer constructed from LineString and MultiLineString geometries.

The following function constructs the shortest path on the Earth's surface, known as a great-circle line. A great-circle line represents the shortest distance between two points on a sphere, following the curvature of the Earth. The path is approximated using a LineString with a specified number of points n for precision.

import org.locationtech.jts.geom.*
import kotlin.math.*

fun greatCircleLineString(start: Point, end: Point, n: Int = 100): LineString {
    val factory = GeometryFactory()

    val startLat = Math.toRadians(start.y)
    val startLon = Math.toRadians(start.x)
    val endLat = Math.toRadians(end.y)
    val endLon = Math.toRadians(end.x)

    val deltaLon = endLon - startLon
    val cosStartLat = cos(startLat)
    val cosEndLat = cos(endLat)
    val sinStartLat = sin(startLat)
    val sinEndLat = sin(endLat)
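    // Clamp to [-1, 1]: rounding can push the cosine slightly out of range,
    // in which case acos would return NaN
    val a = (cosStartLat * cosEndLat * cos(deltaLon) + sinStartLat * sinEndLat).coerceIn(-1.0, 1.0)
    val angularDistance = acos(a)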

    if (angularDistance == 0.0) {
        return factory.createLineString(arrayOf(start.coordinate, end.coordinate))
    }

    val coordinates = mutableListOf<Coordinate>()
    for (i in 0..n) {
        val fraction = i.toDouble() / n
        val sinAngularDistance = sin(angularDistance)
        val A = sin((1 - fraction) * angularDistance) / sinAngularDistance
        val B = sin(fraction * angularDistance) / sinAngularDistance

        val x = A * cosStartLat * cos(startLon) + B * cosEndLat * cos(endLon)
        val y = A * cosStartLat * sin(startLon) + B * cosEndLat * sin(endLon)
        val z = A * sinStartLat + B * sinEndLat

        val lat = atan2(z, sqrt(x * x + y * y))
        val lon = atan2(y, x)

        coordinates.add(Coordinate(Math.toDegrees(lon), Math.toDegrees(lat)))
    }

    return factory.createLineString(coordinates.toTypedArray())
}
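One way to sanity-check such a path is to compute the great-circle distance between its endpoints. The sketch below uses the haversine formula on a sphere. The function `greatCircleDistanceKm` is hypothetical, and the default radius of 6371 km is the commonly used mean Earth radius, so results are approximate.

```kotlin
import kotlin.math.asin
import kotlin.math.cos
import kotlin.math.pow
import kotlin.math.sin
import kotlin.math.sqrt

// Hypothetical helper: haversine distance between two (lon, lat) points in degrees.
// Assumes a spherical Earth with the given radius, so the result is approximate.
fun greatCircleDistanceKm(
    lon1: Double, lat1: Double,
    lon2: Double, lat2: Double,
    radiusKm: Double = 6371.0,
): Double {
    val phi1 = Math.toRadians(lat1)
    val phi2 = Math.toRadians(lat2)
    val dPhi = phi2 - phi1
    val dLambda = Math.toRadians(lon2 - lon1)
    val h = sin(dPhi / 2).pow(2) + cos(phi1) * cos(phi2) * sin(dLambda / 2).pow(2)
    // Clamp against rounding errors before taking the square root and arcsine
    return 2 * radiusKm * asin(sqrt(h.coerceIn(0.0, 1.0)))
}
```

Arguments follow the same (x = longitude, y = latitude) order as the JTS points used above, for example `greatCircleDistanceKm(start.x, start.y, end.x, end.y)`.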

This convenience function finds a city in usaCities by name and returns its geometry (a point):

// The parameter is called `cityName` to avoid confusion with the `name` column
fun takeCity(cityName: String) = usaCities.df.filter { it.name == cityName }.single().geometry

val newYork = takeCity("New York")
val losAngeles = takeCity("Los Angeles")

val curveNY_LA = greatCircleLineString(newYork, losAngeles)

Now, let's plot this curve using geoPath, overlaying it on top of the state polygons and highlighting the points corresponding to the cities:

usa48.plot {
    geoMap { alpha = 0.5 }
    geoPath(curveNY_LA) { width = 1.5 }
    geoPoints(listOf(newYork, losAngeles)) {
        size = 8.0
        color = Color.RED
    }
}

geoRectangles

geoRectangles() adds a layer of rectangles constructed from Envelope objects. The Envelope class represents a rectangular region in the coordinate space, defined by its minimum and maximum coordinates. It is commonly used for bounding boxes, spatial indexing, and efficient geometric calculations.

Let's compute the common bounding box of usa48:

// `.bounds()` function calculates the minimum bounding box
// of all geometries in the `geometry` column of a `GeoDataFrame`,
// returning it as an `Envelope`
val usa48Bounds: Envelope = usa48.bounds().also {
    // Use JTS API for in-place envelope expansion
    it.expandBy(1.0)
}
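The idea behind a bounding box and its expansion can be sketched in plain Kotlin as follows. `Box` and `boundsOf` are hypothetical analogues of the JTS `Envelope` and the `.bounds()` function, shown only for illustration. Unlike the in-place `expandBy` used above, `expandedBy` returns a new value.

```kotlin
// Hypothetical minimal analogue of a JTS Envelope, for illustration only.
data class Box(val minX: Double, val maxX: Double, val minY: Double, val maxY: Double) {
    // Grow the box by `d` in every direction and return the result as a new box
    fun expandedBy(d: Double) = Box(minX - d, maxX + d, minY - d, maxY + d)
}

// Smallest axis-aligned box containing all (x, y) points.
fun boundsOf(points: List<Pair<Double, Double>>): Box {
    require(points.isNotEmpty()) { "Cannot compute bounds of an empty point list" }
    return Box(
        minX = points.minOf { it.first },
        maxX = points.maxOf { it.first },
        minY = points.minOf { it.second },
        maxY = points.maxOf { it.second },
    )
}
```

The bounding box of several geometries is the same computation applied to all of their vertices together.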

usa48.plot {
    geoMap()
    geoRectangles(usa48Bounds) {
        alpha = 0.0
        borderLine {
            width = 2.0
            color = Color.GREY
        }
    }
}

In addition, geoRectangles also works with polygons and multipolygons. In such cases, the bounding box of each geometry is calculated and drawn individually:

usa48.plot {
    geoMap()
    geoRectangles()
}

Write GeoDataFrame

A GeoDataFrame can be saved to a file in either GeoJSON or Shapefile format using the writeGeoJson(filename) and writeShapefile(filename) functions.

GeoJSON

usaCities.writeGeoJson("usa_cities.geojson")

GeoDataFrame.readGeoJson("usa_cities.geojson").plot { geoPoints() }

Shapefile

Let's save the GeoDataFrame containing the boundaries of US states, which was initially in GeoJSON format and included both polygons and multipolygons, to a Shapefile. To do this, we will first cast all geometries to MultiPolygon.

// All geometries should be the same type (Shapefile restriction),
// but we have both `Polygon` and `MultiPolygon`.
// Cast them all into MultiPolygons
usa48.modify {
    convert { geometry }.with {
        when (it) {
            // Cast `Polygon` to `MultiPolygon` with a single entity
            is Polygon -> it.toMultiPolygon()
            is MultiPolygon -> it
            else -> error("not a polygonal geometry")
        }
    }
}
    // All files comprising the Shapefile will be saved to
    // a directory named "usa_48" and will have the same base name
    .writeShapefile("usa_48")

GeoDataFrame.readShapefile("usa_48/usa_48.shp").plot { geoMap() }
