For now, if you read a DataFrame from a file or URL, you need to define its schema manually. You can do it quickly with the generate..() methods.
Define schemas:
@DataSchema
data class PersonInfo(
    val age: Int,
    val height: Float
)

@DataSchema
data class Person(
    val info: PersonInfo,
    val name: String
)
Read the DataFrame from the CSV file and specify the schema with .convertTo() or .cast():
val df = DataFrame.readCsv("example.csv").convertTo<Person>()
Extensions for this DataFrame will be generated automatically by the plugin, so you can use them to access columns directly and in operations inside the Column Selector DSL and the DataRow API.
df.info.age
df.sortBy { name and info.height }
df.filter { name.startsWith("A") && info.age >= 16 }
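To give a feel for what "generated extensions" means, here is a toy model in plain Kotlin with no DataFrame dependency. It is only a sketch of the idea, not the plugin's actual output: Frame, Row, PersonSchema, samplePeople, and adultNamesStartingWithA are hypothetical names invented for this illustration, and the schema is flattened to name and age to keep it short. The point is that the type parameter works as a schema marker, and extension properties declared for that marker give typed access to columns that are otherwise looked up by string name.

```kotlin
// Toy stand-ins for a data frame and its rows (NOT the library's types).
// T is a phantom schema marker: it is never instantiated, it only
// decides which extension properties are visible.
class Row<T>(private val values: Map<String, Any?>) {
    // Untyped access by column name; fails at runtime if the column is missing.
    operator fun get(column: String): Any? = values.getValue(column)
}

class Frame<T>(val rows: List<Row<T>>) {
    // Untyped column access: all values of one column, in row order.
    operator fun get(column: String): List<Any?> = rows.map { it[column] }

    // The predicate runs with the row as receiver, so row extensions
    // such as `name` can be used without a qualifier.
    fun filter(predicate: Row<T>.() -> Boolean): Frame<T> =
        Frame(rows.filter { it.predicate() })
}

// Hypothetical schema marker with two columns: name and age.
interface PersonSchema

// The kind of accessors a generator could emit for PersonSchema.
@Suppress("UNCHECKED_CAST")
val Frame<PersonSchema>.name: List<String> get() = this["name"] as List<String>
val Row<PersonSchema>.name: String get() = this["name"] as String
val Row<PersonSchema>.age: Int get() = this["age"] as Int

// Small in-memory sample so the sketch needs no file.
fun samplePeople(): Frame<PersonSchema> = Frame(
    listOf(
        Row<PersonSchema>(mapOf("name" to "Alice", "age" to 20)),
        Row<PersonSchema>(mapOf("name" to "Bob", "age" to 15)),
    )
)

// Typed filter written against the extensions instead of string names.
fun adultNamesStartingWithA(people: Frame<PersonSchema>): List<String> =
    people.filter { name.startsWith("A") && age >= 16 }.name
```

In this model, a column whose name or type changes would need a new marker with its own set of accessors, which is the bookkeeping the plugin takes over.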
Moreover, new extensions will be generated on the fly after each schema change: renaming a column, changing its type, or adding a new one. For example, rename the name column to "firstName", and the firstName extension becomes available in the operations that follow:
df.rename { name }.into("firstName")
    .filter { firstName == "Nikita" }
See the Compiler Plugin Example IDEA project for basic Extension Properties API examples.