KSparkSession

class KSparkSession(val spark: SparkSession)

A wrapper over SparkSession that provides several additional methods for creating org.apache.spark.sql.Dataset instances.

Parameters

spark

The current SparkSession to wrap

Constructors

constructor(spark: SparkSession)
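
A minimal sketch of wrapping an existing session. It assumes the class is importable from the org.jetbrains.kotlinx.spark.api package (the package is not stated on this page) and that a local master is acceptable; the helper name and app name are illustrative only.

```kotlin
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.spark.api.KSparkSession // assumed package

// Hypothetical helper: builds a local SparkSession and wraps it.
fun wrapLocalSession(): KSparkSession {
    val spark = SparkSession.builder()
        .master("local[*]")                 // assumption: local mode
        .appName("ksparksession-example")   // illustrative name
        .getOrCreate()
    return KSparkSession(spark)
}

fun main() {
    val session = wrapLocalSession()
    // The wrapped session stays reachable through the `spark` property.
    println(session.spark.version())
    session.spark.stop()
}
```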

Properties

val sc: JavaSparkContext

Lazy instance of JavaSparkContext wrapper around sparkContext.

val udf: UDFRegistration

A collection of methods for registering user-defined functions (UDF).

Functions

inline fun <T> dfOf(vararg arg: T): Dataset<Row>

Utility method to create a DataFrame from vararg arguments or a spread array (*array).

inline fun <T> dfOf(colNames: Array<String>, vararg arg: T): Dataset<Row>

Utility method to create a DataFrame from vararg arguments or a spread array (*array), using the given column names.
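
The two dfOf overloads might be used as in the sketch below. It assumes the org.jetbrains.kotlinx.spark.api package and a local master; the function name is hypothetical. The colNames argument is passed by name so the call cannot be mistaken for the plain vararg overload.

```kotlin
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.spark.api.KSparkSession // assumed package

// Hypothetical helper: returns the column names of a DataFrame built with dfOf.
fun dfOfColumnNames(): List<String> {
    val spark = SparkSession.builder()
        .master("local[*]")        // assumption: local mode
        .appName("dfOf-example")
        .getOrCreate()
    try {
        return with(KSparkSession(spark)) {
            // Overload without column names.
            val plain = dfOf(1, 2, 3)
            plain.show()

            // Overload with explicit column names.
            val named = dfOf(colNames = arrayOf("number"), 1, 2, 3)
            named.columns().toList()
        }
    } finally {
        spark.stop()
    }
}

fun main() {
    println(dfOfColumnNames())
}
```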

inline fun <T> dsOf(vararg arg: T): Dataset<T>

Utility method to create dataset from vararg arguments.
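
A short sketch of dsOf, again assuming the org.jetbrains.kotlinx.spark.api package and a local master; the function name is hypothetical.

```kotlin
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.spark.api.KSparkSession // assumed package

// Hypothetical helper: builds a typed Dataset<String> and collects it to the driver.
fun collectDsOf(): List<String> {
    val spark = SparkSession.builder()
        .master("local[*]")        // assumption: local mode
        .appName("dsOf-example")
        .getOrCreate()
    try {
        return with(KSparkSession(spark)) {
            val letters = dsOf("a", "b", "c") // Dataset<String>
            letters.collectAsList()
        }
    } finally {
        spark.stop()
    }
}

fun main() {
    println(collectDsOf())
}
```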

inline fun <T> emptyDataset(): Dataset<T>

Creates new empty dataset of type T.
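
Because emptyDataset takes no value arguments, the element type has to be given explicitly. A minimal sketch under the same assumptions (package org.jetbrains.kotlinx.spark.api, local master, hypothetical function name):

```kotlin
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.spark.api.KSparkSession // assumed package

// Hypothetical helper: creates an empty Dataset<String> and returns its row count.
fun emptyDatasetCount(): Long {
    val spark = SparkSession.builder()
        .master("local[*]")        // assumption: local mode
        .appName("emptyDataset-example")
        .getOrCreate()
    try {
        return with(KSparkSession(spark)) {
            val empty = emptyDataset<String>() // type argument is required here
            empty.count()
        }
    } finally {
        spark.stop()
    }
}

fun main() {
    println(emptyDatasetCount())
}
```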

fun <T> rddOf(vararg elements: T, numSlices: Int = sc.defaultParallelism()): JavaRDD<T>

Utility method to create an RDD from vararg elements. NOTE: T must be Serializable.
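
As a sketch of rddOf: numSlices follows the vararg parameter in the signature, so it is passed by name. The package (org.jetbrains.kotlinx.spark.api), the local master, and the function name are assumptions.

```kotlin
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.spark.api.KSparkSession // assumed package

// Hypothetical helper: builds a JavaRDD<Int> and collects it to the driver.
fun collectRddOf(): List<Int> {
    val spark = SparkSession.builder()
        .master("local[*]")        // assumption: local mode
        .appName("rddOf-example")
        .getOrCreate()
    try {
        return with(KSparkSession(spark)) {
            // numSlices must be named because it comes after the vararg.
            val rdd = rddOf(1, 2, 3, 4, numSlices = 2)
            rdd.collect()
        }
    } finally {
        spark.stop()
    }
}

fun main() {
    println(collectRddOf())
}
```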

inline fun <T> Array<T>.toDF(vararg colNames: String): Dataset<Row>

Utility method to create dataframe from Array.

inline fun <T> List<T>.toDF(vararg colNames: String): Dataset<Row>

Utility method to create dataframe from list.

inline fun <T> JavaRDDLike<T, *>.toDF(vararg colNames: String): Dataset<Row>

Utility method to create Dataset (Dataframe) from JavaRDDLike. NOTE: T must be Serializable.

inline fun <T> RDD<T>.toDF(vararg colNames: String): Dataset<Row>

Utility method to create Dataset (Dataframe) from RDD. NOTE: T must be Serializable.
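
The toDF extensions are declared inside KSparkSession, so they are only in scope with a KSparkSession as receiver. The sketch below uses with(...) for that; it assumes the org.jetbrains.kotlinx.spark.api package and a local master, and the function name is hypothetical.

```kotlin
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.spark.api.KSparkSession // assumed package

// Hypothetical helper: converts a List and an Array to DataFrames
// and returns the column names of each.
fun toDfColumnNames(): Pair<List<String>, List<String>> {
    val spark = SparkSession.builder()
        .master("local[*]")        // assumption: local mode
        .appName("toDF-example")
        .getOrCreate()
    try {
        return with(KSparkSession(spark)) {
            val fromList = listOf("x", "y").toDF("letter")
            val fromArray = arrayOf(1, 2, 3).toDF("number")
            fromList.columns().toList() to fromArray.columns().toList()
        }
    } finally {
        spark.stop()
    }
}

fun main() {
    println(toDfColumnNames())
}
```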

inline fun <T> Array<T>.toDS(): Dataset<T>

Utility method to create dataset from Array.

inline fun <T> List<T>.toDS(): Dataset<T>

Utility method to create dataset from list.

inline fun <T> JavaRDDLike<T, *>.toDS(): Dataset<T>

Utility method to create dataset from JavaRDDLike.

inline fun <T> RDD<T>.toDS(): Dataset<T>

Utility method to create dataset from Scala RDD.
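
A sketch of the List and Array variants of toDS, under the same assumptions as before (package org.jetbrains.kotlinx.spark.api, local master, hypothetical function name):

```kotlin
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.spark.api.KSparkSession // assumed package

// Hypothetical helper: builds typed datasets from a List and an Array
// and returns their row counts.
fun toDsCounts(): Pair<Long, Long> {
    val spark = SparkSession.builder()
        .master("local[*]")        // assumption: local mode
        .appName("toDS-example")
        .getOrCreate()
    try {
        return with(KSparkSession(spark)) {
            val fromList = listOf(1, 2, 3).toDS()     // Dataset<Int>
            val fromArray = arrayOf("a", "b").toDS()  // Dataset<String>
            fromList.count() to fromArray.count()
        }
    } finally {
        spark.stop()
    }
}

fun main() {
    println(toDsCounts())
}
```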

fun <T> List<T>.toRDD(numSlices: Int = sc.defaultParallelism()): JavaRDD<T>

Utility method to create an RDD from a list. NOTE: T must be Serializable.
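
A sketch of List.toRDD with an explicit partition count. The package (org.jetbrains.kotlinx.spark.api), the local master, and the function name are assumptions.

```kotlin
import org.apache.spark.sql.SparkSession
import org.jetbrains.kotlinx.spark.api.KSparkSession // assumed package

// Hypothetical helper: turns a List<Int> into a JavaRDD<Int> and collects it back.
fun collectToRdd(): List<Int> {
    val spark = SparkSession.builder()
        .master("local[*]")        // assumption: local mode
        .appName("toRDD-example")
        .getOrCreate()
    try {
        return with(KSparkSession(spark)) {
            // Omitting numSlices falls back to sc.defaultParallelism().
            val rdd = listOf(10, 20, 30).toRDD(numSlices = 2)
            rdd.collect()
        }
    } finally {
        spark.stop()
    }
}

fun main() {
    println(collectToRdd())
}
```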