Output decorators¶
Output decorators persist the DataFrame returned by the decorated function in one of several formats: table, Delta, CSV, JSON, or Parquet.
@dp.table_overwrite¶
@dp.table_overwrite(identifier: str, table_schema: dp.TableSchema = None, recreate_table: bool = False, options: dict = None)
Overwrites data in a table with the DataFrame returned by the decorated function
Parameters:
identifier: str - table name
table_schema: dp.TableSchema, default None - TableSchema object which defines fields, primary_key, partition_by and tbl_properties; if None, the table is saved with the DataFrame schema
recreate_table: bool, default False - if True, the table is dropped if it exists before being written to
options: dict, default None - options which are passed to df.write.options(**options)
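
For illustration, a minimal sketch of typical usage. It assumes the framework's usual input pattern (an upstream @dp.transformation with dp.read_table supplying the DataFrame, which is not documented on this page) and the import daipe as dp convention; all table and column names are hypothetical.

import daipe as dp  # assumed import; matches the `dp` namespace used on this page
from pyspark.sql import DataFrame, functions as f

# Hypothetical table names; the input df is supplied by @dp.transformation.
@dp.transformation(dp.read_table("bronze.tbl_customers"))
@dp.table_overwrite("silver.tbl_customers")
def active_customers(df: DataFrame) -> DataFrame:
    # The returned DataFrame replaces the current contents of the table.
    return df.filter(f.col("active"))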
@dp.table_append¶
@dp.table_append(identifier: str, table_schema: dp.TableSchema = None, options: dict = None)
Appends data to a table with the DataFrame returned by the decorated function
Parameters:
identifier: str - table name
table_schema: dp.TableSchema, default None - TableSchema object which defines fields, primary_key, partition_by and tbl_properties; if None, the table is saved with the DataFrame schema
options: dict, default None - options which are passed to df.write.options(**options)
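
Usage mirrors table_overwrite, except each run adds rows instead of replacing them. The same hedges apply: hypothetical names, assumed @dp.transformation input pattern.

import daipe as dp  # assumed import
from pyspark.sql import DataFrame, functions as f

@dp.transformation(dp.read_table("bronze.tbl_events"))  # hypothetical source
@dp.table_append("silver.tbl_events")                   # hypothetical target
def todays_events(df: DataFrame) -> DataFrame:
    # Only today's rows are appended; existing data is left untouched.
    return df.filter(f.col("event_date") == f.current_date())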
@dp.table_upsert¶
@dp.table_upsert(identifier: str, table_schema: dp.TableSchema)
Updates existing rows or inserts new rows in a table, matched on the primary key, using the DataFrame returned by the decorated function
Parameters:
identifier: str - table name
table_schema: dp.TableSchema - TableSchema object which defines fields, primary_key, partition_by and tbl_properties; the primary_key is used to match the rows to update
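
Because the upsert matches rows on the primary key, a TableSchema with primary_key set is required. A sketch, assuming the constructor takes a list of fields plus the attributes listed above (the exact signature may differ), with hypothetical names.

import daipe as dp  # assumed import
import pyspark.sql.types as t
from pyspark.sql import DataFrame

# Assumed TableSchema constructor shape: fields plus the attributes named above.
customer_schema = dp.TableSchema(
    [
        t.StructField("customer_id", t.LongType(), False),
        t.StructField("email", t.StringType(), True),
    ],
    primary_key="customer_id",
)

@dp.transformation(dp.read_table("bronze.tbl_customer_updates"))  # hypothetical
@dp.table_upsert("silver.tbl_customers", customer_schema)
def upsert_customers(df: DataFrame) -> DataFrame:
    # Rows whose customer_id already exists are updated; new ones are inserted.
    return df.select("customer_id", "email")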
@dp.csv_append¶
@dp.csv_append(path: str, partition_by: Union[str, list] = None, options: dict = None)
Appends a Spark DataFrame to a CSV file
Parameters:
path: str - path to the CSV file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
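
A sketch with a hypothetical path and source; since the options dict is passed straight to df.write.options, standard Spark CSV options such as header and delimiter apply. The @dp.transformation input pattern is an assumption, as above.

import daipe as dp  # assumed import
from pyspark.sql import DataFrame

@dp.transformation(dp.read_table("silver.tbl_customers"))  # hypothetical source
@dp.csv_append(
    "/data/exports/customers",                    # hypothetical output path
    partition_by="country",                       # one field, or a list of fields
    options={"header": "true", "delimiter": ";"},
)
def export_customers(df: DataFrame) -> DataFrame:
    return df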
@dp.csv_overwrite¶
@dp.csv_overwrite(path: str, partition_by: Union[str, list] = None, options: dict = None)
Overwrites a CSV file with a Spark DataFrame
Parameters:
path: str - path to the CSV file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.csv_write_ignore¶
@dp.csv_write_ignore(path: str, partition_by: Union[str, list] = None, options: dict = None)
Saves a Spark DataFrame to a CSV file if the file does not exist
Parameters:
path: str - path to the CSV file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.csv_write_errorifexists¶
@dp.csv_write_errorifexists(path: str, partition_by: Union[str, list] = None, options: dict = None)
Saves a Spark DataFrame to a CSV file; throws an exception if the file already exists
Parameters:
path: str - path to the CSV file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
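
The write_ignore and write_errorifexists variants differ only in how an existing target is handled; a sketch contrasting the two, with hypothetical paths and sources and the same assumed input pattern.

import daipe as dp  # assumed import
from pyspark.sql import DataFrame

@dp.transformation(dp.read_table("silver.tbl_seed"))   # hypothetical source
@dp.csv_write_ignore("/data/exports/seed")             # silently skips if the target exists
def write_seed(df: DataFrame) -> DataFrame:
    return df

@dp.transformation(dp.read_table("silver.tbl_final"))  # hypothetical source
@dp.csv_write_errorifexists("/data/exports/final")     # fails if the target exists
def write_final(df: DataFrame) -> DataFrame:
    return df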
@dp.delta_append¶
@dp.delta_append(path: str, partition_by: Union[str, list] = None, options: dict = None)
Appends a Spark DataFrame to a Delta table
Parameters:
path: str - path to the Delta table
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
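
The Delta decorators follow the same shape as the CSV ones, writing to a Delta table at the given path; a sketch with hypothetical names.

import daipe as dp  # assumed import
from pyspark.sql import DataFrame

@dp.transformation(dp.read_table("bronze.tbl_events"))  # hypothetical source
@dp.delta_append("/data/delta/events", partition_by=["year", "month"])
def append_events(df: DataFrame) -> DataFrame:
    # Written as new Delta files; existing data is untouched.
    return df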
@dp.delta_overwrite¶
@dp.delta_overwrite(path: str, partition_by: Union[str, list] = None, options: dict = None)
Overwrites a Delta table with a Spark DataFrame
Parameters:
path: str - path to the Delta table
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.delta_write_ignore¶
@dp.delta_write_ignore(path: str, partition_by: Union[str, list] = None, options: dict = None)
Saves a Spark DataFrame to a Delta table if it does not exist
Parameters:
path: str - path to the Delta table
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
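
delta_write_ignore suits one-time initialization: the first run creates the Delta table, later runs find it and skip the write. A sketch with hypothetical names, using the same assumed input pattern as above.

import daipe as dp  # assumed import
from pyspark.sql import DataFrame

@dp.transformation(dp.read_table("bronze.tbl_country_codes"))  # hypothetical source
@dp.delta_write_ignore("/data/delta/dim_country")              # hypothetical path
def init_country_dimension(df: DataFrame) -> DataFrame:
    # Written on the first run only; later runs leave the existing data as-is.
    return df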
@dp.delta_write_errorifexists¶
@dp.delta_write_errorifexists(path: str, partition_by: Union[str, list] = None, options: dict = None)
Saves a Spark DataFrame to a Delta table; throws an exception if it already exists
Parameters:
path: str - path to the Delta table
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.json_append¶
@dp.json_append(path: str, partition_by: Union[str, list] = None, options: dict = None)
Appends a Spark DataFrame to a JSON file
Parameters:
path: str - path to the JSON file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
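
JSON output follows the same pattern; a sketch with hypothetical names, using a standard Spark JSON write option passed through df.write.options.

import daipe as dp  # assumed import
from pyspark.sql import DataFrame

@dp.transformation(dp.read_table("silver.tbl_events"))  # hypothetical source
@dp.json_append("/data/exports/events_json", options={"compression": "gzip"})
def export_events_json(df: DataFrame) -> DataFrame:
    return df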
@dp.json_overwrite¶
@dp.json_overwrite(path: str, partition_by: Union[str, list] = None, options: dict = None)
Overwrites a JSON file with a Spark DataFrame
Parameters:
path: str - path to the JSON file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.json_write_ignore¶
@dp.json_write_ignore(path: str, partition_by: Union[str, list] = None, options: dict = None)
Saves a Spark DataFrame to a JSON file if the file does not exist
Parameters:
path: str - path to the JSON file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.json_write_errorifexists¶
@dp.json_write_errorifexists(path: str, partition_by: Union[str, list] = None, options: dict = None)
Saves a Spark DataFrame to a JSON file; throws an exception if the file already exists
Parameters:
path: str - path to the JSON file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.parquet_append¶
@dp.parquet_append(path: str, partition_by: Union[str, list] = None, options: dict = None)
Appends a Spark DataFrame to a Parquet file
Parameters:
path: str - path to the Parquet file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.parquet_overwrite¶
@dp.parquet_overwrite(path: str, partition_by: Union[str, list] = None, options: dict = None)
Overwrites a Parquet file with a Spark DataFrame
Parameters:
path: str - path to the Parquet file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.parquet_write_ignore¶
@dp.parquet_write_ignore(path: str, partition_by: Union[str, list] = None, options: dict = None)
Saves a Spark DataFrame to a Parquet file if the file does not exist
Parameters:
path: str - path to the Parquet file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
@dp.parquet_write_errorifexists¶
@dp.parquet_write_errorifexists(path: str, partition_by: Union[str, list] = None, options: dict = None)
Saves a Spark DataFrame to a Parquet file; throws an exception if the file already exists
Parameters:
path: str - path to the Parquet file
partition_by: Union[str, list], default None - one or multiple fields to partition the data by
options: dict, default None - options passed to df.write.options(**options)
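
Finally, the same shape for Parquet; a sketch of the error-if-exists variant, which makes an accidental rerun fail instead of silently rewriting the output (names are hypothetical, input pattern assumed as above).

import daipe as dp  # assumed import
from pyspark.sql import DataFrame

@dp.transformation(dp.read_table("silver.tbl_customers"))  # hypothetical source
@dp.parquet_write_errorifexists("/data/exports/customers_parquet")
def snapshot_customers(df: DataFrame) -> DataFrame:
    # Raises if the target path already exists.
    return df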