Writes data to a Delta Lake table, creating it if it doesn't exist.
Usage
write_deltalake(
data,
table_or_uri,
mode = c("error", "append", "overwrite", "ignore"),
partition_by = NULL,
name = NULL,
description = NULL,
storage_options = NULL,
schema_mode = NULL,
target_file_size = NULL
)Arguments
- data
Data to write. Can be a data.frame, Arrow Table, Arrow RecordBatch, or any object that can be converted to an Arrow RecordBatchReader via
nanoarrow::as_nanoarrow_array_stream().- table_or_uri
Character. Path to the Delta table (local filesystem or cloud storage URI).
- mode
Character. How to handle existing data. One of:
"error"(default): Fail if table exists."append": Add new data to the table."overwrite": Replace all data in the table."ignore": Do nothing if table exists.
- partition_by
Character vector. Column names to partition by (optional).
- name
Character. Table name for metadata (optional, used when creating new table).
- description
Character. Table description for metadata (optional).
- storage_options
Named list. Storage backend options such as credentials (optional).
- schema_mode
Character. How to handle schema evolution (optional). One of:
"overwrite": Replace the schema with the new schema."merge": Merge the new schema with the existing schema.
- target_file_size
Integer. Target size in bytes for each output file (optional). When set, the writer will try to create files of approximately this size.
Value
A list with write result information:
version: The new version number of the table.num_files: Number of files in the table after write.
Examples
if (FALSE) { # \dontrun{
# Write a data.frame to a new Delta table
df <- data.frame(x = 1:10, y = letters[1:10])
write_deltalake(df, "path/to/delta_table")
# Append data to an existing table
write_deltalake(df, "path/to/delta_table", mode = "append")
# Overwrite existing data
write_deltalake(df, "path/to/delta_table", mode = "overwrite")
# Create a partitioned table
write_deltalake(df, "path/to/delta_table", partition_by = "y")
# Write to Google Cloud Storage
write_deltalake(
df,
"gs://my-bucket/path/to/table",
storage_options = list(google_service_account_path = "path/to/key.json")
)
# Write to S3
write_deltalake(
df,
"s3://my-bucket/path/to/table",
storage_options = list(
aws_access_key_id = "MY_ACCESS_KEY",
aws_secret_access_key = "MY_SECRET_KEY",
aws_region = "us-east-1"
)
)
# Write to Azure Blob Storage
write_deltalake(
df,
"az://my-container/path/to/table",
storage_options = list(
azure_storage_account_name = "MY_ACCOUNT_NAME",
azure_storage_account_key = "MY_ACCOUNT_KEY"
)
)
} # }