Write options
sorted_by
Writes the table sorted by a column prefix, here the uuid column:
df.write.sorted_by("uuid").yt("//sys/spark/examples/test_data")
unique_keys
Marks the sort key of the written table as unique:
df.write.sorted_by("uuid").unique_keys.yt("//sys/spark/examples/test_data")
optimize_for
A table may be stored in row (lookup) or column (scan) format. Choose the format that suits the task:
df.write.optimize_for("scan").yt("//sys/spark/examples/test_data")
df.write.optimize_for("lookup").yt("//sys/spark/examples/test_data")
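As a rough illustration of picking the format by task, the sketch below maps a workload label to an optimize_for value. The function name choose_optimize_for and the labels "point_reads" and "full_scans" are hypothetical and not part of SPYT. It assumes the usual reading that row (lookup) storage suits reads by key and column (scan) storage suits reads over whole columns.

```python
def choose_optimize_for(access_pattern):
    """Map a hypothetical access-pattern label to an optimize_for value."""
    formats = {
        "point_reads": "lookup",  # row format
        "full_scans": "scan",     # column format
    }
    if access_pattern not in formats:
        raise ValueError(f"unknown access pattern: {access_pattern!r}")
    return formats[access_pattern]


# The returned string is what would be passed to optimize_for(...).
storage_format = choose_optimize_for("full_scans")
print(storage_format)
```

The result can then be passed on, for example df.write.optimize_for(storage_format).yt(...).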
Schema v3
Writes tables with a type_v3 schema instead of type_v1. This can be enabled through the Spark configuration or through a write option.
Python example:
df.write.option("write_type_v3", "true")
Dynamic tables
For dynamic tables, you must explicitly set the additional option inconsistent_dynamic_write to true. Doing so acknowledges that transactional writes to dynamic tables are not supported.
Python example:
df.write.option("inconsistent_dynamic_write", "true")
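When several jobs set the same options, one possible approach is to keep them in a dictionary and apply them in one place. The sketch below assumes only that the writer exposes the standard Spark option(key, value) method and returns a writer from it. The names apply_write_options and RecordingWriter are hypothetical; RecordingWriter is a stand-in that records options so the example runs without a cluster.

```python
def apply_write_options(writer, options):
    """Chain writer.option(key, value) for every entry and return the writer."""
    for key, value in options.items():
        # Pass booleans as "true"/"false" strings, as the examples here do.
        if isinstance(value, bool):
            value = "true" if value else "false"
        writer = writer.option(key, value)
    return writer


class RecordingWriter:
    """Hypothetical stand-in for df.write that only records options."""

    def __init__(self):
        self.options = {}

    def option(self, key, value):
        self.options[key] = value
        return self


# With a real DataFrame, df.write would be passed instead of RecordingWriter().
writer = apply_write_options(
    RecordingWriter(),
    {"inconsistent_dynamic_write": True},
)
print(writer.options)
```

Keeping the acknowledgement flag in an explicit dictionary makes it easy to spot in review that a job opts in to non-transactional writes.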