Data Schemas and Extension Properties Troubleshooting
Sometimes you can get an exception with a message containing
This means a runtime error occurred while accessing a DataFrame extension property generated by the Compiler Plugin or in Kotlin Notebook.
Such errors are caused by generating extension properties for data schemas that are not compatible with the DataFrame, DataRow, etc. In most cases, the schema contains columns with an incorrect name or type.
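The failure mode can be illustrated without the library. The sketch below is only an analogy in plain Kotlin: UntypedRow, the age property, and accessAge are hypothetical names, not DataFrame's generated code. It shows why a typed accessor built from a wrong schema compiles fine and fails only when the value is read.

```kotlin
// Hypothetical stand-in for a row: values are stored without static types.
class UntypedRow(private val values: Map<String, Any?>) {
    operator fun get(name: String): Any? = values[name]
}

// Hand-written analogue of a generated extension property.
// It trusts a schema that declares "age" as Int.
val UntypedRow.age: Int
    get() = this["age"] as Int

// Returns null when the access succeeds, otherwise the exception's class name.
fun accessAge(row: UntypedRow): String? =
    runCatching { row.age }.exceptionOrNull()?.let { it::class.simpleName }

fun main() {
    val good = UntypedRow(mapOf("age" to 42))
    val bad = UntypedRow(mapOf("age" to "42")) // the actual value is a String

    println(accessAge(good)) // null
    println(accessAge(bad))  // ClassCastException
}
```

In this sketch the mismatch surfaces as a ClassCastException at the point of access. The exact exception and message produced by DataFrame may differ.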
For example:
Read a simple csv with a column of integers:
Possible reasons
Incompatible manually defined data schema
If you define the initial data schema manually, make sure your data schema is compatible with the dataframe.
Use .cast<Schema>() with verify = true to verify that Schema is compatible with the dataframe.
Use the dedicated methods for generating data schema code instead of defining the data schema manually. You may still need to edit a generated schema, adjusting a type's nullability or the whole type, because a type can be inferred incorrectly if the data sample used for generation is not representative or if there is a bug in DataFrame.
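A minimal sketch of the verification step, assuming the Kotlin DataFrame library is on the classpath. The Person schema, the castToPerson helper, and the sample data are made up for illustration.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf

// Hypothetical manually defined schema.
@DataSchema
interface Person {
    val name: String
    val age: Int
}

// Wraps the verified cast so that a mismatch is reported as a failed Result
// at cast time instead of surfacing later, on property access.
fun castToPerson(df: AnyFrame): Result<DataFrame<Person>> =
    runCatching { df.cast<Person>(verify = true) }

fun main() {
    // Column types match the schema: String and Int.
    val matching = dataFrameOf("name", "age")("Alice", 30, "Bob", 25)
    // "age" holds Strings, so the cast is expected to fail verification.
    val mismatched = dataFrameOf("name", "age")("Alice", "30", "Bob", "25")

    println(castToPerson(matching).isSuccess)
    println(castToPerson(mismatched).exceptionOrNull()?.message)
}
```

Verifying at the point of the cast moves the error close to where the schema is applied, which is usually easier to diagnose than a failure inside later code that reads the property.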
Bug in the DataFrame reader
Sometimes a column is created internally with an incorrect KType that doesn't represent the actual runtime values. As a result, schema().print(), generateInterfaces, and print(columnTypes = true) show a misleading type.
This can happen when reading a dataframe from files or databases.
Possible workarounds (for ValueColumns):
Update the runtime column type:
Specify the correct type using .replace {} and ValueColumn.changeType():
df.replace { wrongTypeCol }.with { it.asValueColumn().changeType(typeOf<ActualType>()) }
Use .inferType { columns } to infer the correct types for the selected columns from the actual values. This doesn't work for generic types like Map.
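The inference workaround can be sketched as follows, assuming the Kotlin DataFrame library is available. The column name "value" and the helper declaredTypeAfterInference are hypothetical, and the column is deliberately built with a too-wide declared type to imitate what a faulty reader might produce.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.api.inferType
import kotlin.reflect.KType
import kotlin.reflect.typeOf

// Builds a column whose declared type (Any) is wider than its values (Int),
// then asks DataFrame to re-infer the type from the values themselves.
fun declaredTypeAfterInference(): KType {
    val tooWide = DataColumn.createValueColumn("value", listOf(1, 2, 3), typeOf<Any>())
    val df = dataFrameOf(tooWide)
    return df.inferType("value")["value"].type()
}

fun main() {
    println(declaredTypeAfterInference())
}
```

Inference only looks at the values that are present, so it is worth checking the resulting nullability and type against what the rest of the data is expected to contain.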
Change the compile-time schema of a dataframe
Edit the data schema manually (you can view the correct one with .schema().print()) or regenerate it (with the generate..() methods), then apply it with cast() or convertTo().
Problems with type affinity in SQLite
Because of SQLite type affinity, the column type reported by JDBC may differ from the actual values in the column. This problem often occurs when reading data from an SQLite database with columns of custom types.
You can provide types for such columns manually: