to

inline fun <reified R> Dataset<*>.to(): Dataset<R>

(Kotlin-specific) Returns a new Dataset where each record has been mapped onto the specified type. The method used to map columns depends on the type of R:

  • When R is a class, fields for the class will be mapped to columns of the same name (case sensitivity is determined by spark.sql.caseSensitive).

  • When R is a tuple, the columns will be mapped by ordinal (i.e. the first column will be assigned to _1).

  • When R is a primitive type (e.g. String, Int), the first column of the DataFrame will be used.
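
As a minimal sketch of the class case, the snippet below maps an untyped Dataset onto a data class by column name. It assumes the Kotlin Spark API's withSpark helper and its spark session property are available; the Person class and the SQL literal are hypothetical and exist only for illustration.

```kotlin
import org.apache.spark.sql.Dataset
import org.jetbrains.kotlinx.spark.api.*

// Hypothetical record type, used only for illustration.
data class Person(val name: String, val age: Int)

fun main() = withSpark {
    // Untyped Dataset<Row> whose column names match the fields of Person.
    val rows = spark.sql("SELECT 'Ada' AS name, 36 AS age")

    // R is a class, so columns are matched to its fields by name.
    val people: Dataset<Person> = rows.to<Person>()
    people.show()
}
```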

If the schema of the Dataset does not match the desired R type, you can use Dataset.select/selectTyped along with Dataset.alias or as/to to rearrange or rename as required.
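
A sketch of that renaming step, assuming a source whose column names differ from the target class. The column names full_name and years and the Employee class are hypothetical; select, alias, and functions.col are the standard Spark calls.

```kotlin
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.functions
import org.jetbrains.kotlinx.spark.api.*

// Hypothetical target type, used only for illustration.
data class Employee(val name: String, val age: Int)

fun main() = withSpark {
    // The source columns do not match the field names of Employee.
    val raw = spark.sql("SELECT 'Ada' AS full_name, 36 AS years")

    // Rename the columns with select + alias so they line up with Employee,
    // then switch to the typed view.
    val employees: Dataset<Employee> = raw
        .select(
            functions.col("full_name").alias("name"),
            functions.col("years").alias("age"),
        )
        .to<Employee>()
    employees.show()
}
```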

Note that as/to only changes the view of the data that is passed into typed operations, such as map, and does not eagerly project away any columns that are not present in the specified class.
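
The following sketch illustrates this behaviour under the assumption that the source has one more column than the target class. The city column and the Resident class are hypothetical. Printing the schema before and after a typed map is one way to observe when the extra column is actually dropped.

```kotlin
import org.apache.spark.sql.Dataset
import org.jetbrains.kotlinx.spark.api.*

// Hypothetical target type; note that it has no `city` field.
data class Resident(val name: String, val age: Int)

fun main() = withSpark {
    val rows = spark.sql("SELECT 'Ada' AS name, 36 AS age, 'London' AS city")

    // Only the typed view changes here; no projection is applied yet.
    val residents: Dataset<Resident> = rows.to<Resident>()
    residents.printSchema()

    // A typed operation such as map re-encodes each record as Resident,
    // so its result carries only the fields of the class.
    val mapped: Dataset<Resident> = residents.map { it.copy(age = it.age + 1) }
    mapped.printSchema()
}
```

If the extra columns must be dropped before any typed operation runs, an explicit select ahead of to is the more predictable choice.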

See also

as alias for to