Reference and syntax for the COPY TO command that exports or unloads data from Firebolt to Amazon S3.
SELECT
query to Amazon S3 in your specified file format. You can specify the output file format, compression type, and other options to control how the data is written, such as file name conventions and whether to overwrite existing files. Use COPY TO
for backing up query results or integrating Firebolt with other data processing tools.
Parameter | Description |
---|---|
<select_query> | Any valid SELECT statement. |
<location_name> | The name of a location object that contains the Amazon S3 URL and credentials. This is the recommended approach for specifying the destination. See CREATE LOCATION for more information. |
<URL> | The path to an Amazon S3 URL where the query result file or files are saved. For example, s3://my_bucket/my_folder . |
CREDENTIALS | The Amazon S3 credentials for accessing the specified <s3_url> . For more information, see CREDENTIALS. |
TYPE | Specifies the file type to save to Amazon S3. If omitted, CSV is the default. |
COMPRESSION | Specifies the compression algorithm to use for the output files. If omitted, defaults to GZIP. If NONE is specified, exported files are not compressed. For TYPE=PARQUET , COMPRESSION can be set to NONE , the default GZIP , or SNAPPY . Each output file is written in Parquet format, using the specified algorithm to compress each column according to the Parquet specification. This kind of compression is internal to the Parquet file. For TYPE=CSV , TSV , or JSON , COMPRESSION can be set to NONE or the default GZIP . With GZIP , the file is first created in the specified format and then compressed into a GZIP file. |
INCLUDE_QUERY_ID_IN_FILE_NAME | Specifies whether a query ID is included in the file name. Each time the statement runs, Firebolt generates a new query ID. If omitted, defaults to TRUE , and Firebolt saves file names using the pattern <query_id>.[type].gz , for example, 123ABCXY2.csv.gz . This allows Firebolt to generate unique file names by default. If FALSE , and FILE_NAME_PREFIX is not specified, files are exported with a generic output file name, for example, output.csv.gz . |
FILE_NAME_PREFIX | Specifies an optional string to use in the file name. If FILE_NAME_PREFIX is omitted and INCLUDE_QUERY_ID_IN_FILE_NAME is TRUE , the exported file name is in the pattern <query_id><file_string>.<type>.gz . If FILE_NAME_PREFIX is specified and INCLUDE_QUERY_ID_IN_FILE_NAME is set to FALSE , the specified string replaces output and the file is in the pattern <file_string>.<type>.gz . |
SINGLE_FILE | Specifies whether the export should be a single file or multiple files. If omitted, the default is FALSE , and the export is split based on the MAX_FILE_SIZE value. Exported files are appended with _<n> incrementally to indicate the position in series, starting with 0 . For example, the first file in a series might be named 123ABCXY2_0.parquet.gz . If TRUE , only a single file is written. If set to TRUE and the file exceeds MAX_FILE_SIZE , an error occurs. |
MAX_FILE_SIZE | Specifies the max file size in bytes. If omitted, the default value is 16,000,000 bytes (about 16 MB). The maximum file size that can be specified is 5,368,709,120 bytes (5 GB). |
OVERWRITE_EXISTING_FILES | Specifies whether exported files should overwrite existing files of the same name in the specified Amazon S3 URL. If omitted, defaults to FALSE , and files of the same name are not overwritten. |
HEADER | Specifies whether CSV files should contain a header line containing the column names. If omitted, defaults to TRUE . Must be omitted for TSV, JSON, or Parquet files. |
<settings> | List of query-specific settings overrides. |
Parameter | Description |
---|---|
AWS_ACCESS_KEY_ID | Your AWS access key ID |
AWS_SECRET_ACCESS_KEY | Your AWS secret access key |
AWS_SESSION_TOKEN | Optional temporary session token for temporary credentials |
AWS_ROLE_ARN | The ARN of the IAM role to assume |
AWS_ROLE_EXTERNAL_ID | Optional external ID for role assumption |
<s3_url>
that you specify in the COPY TO
statement must be attached to the user or role specified in the CREDENTIALS
clause.
my_export_location
contains the Amazon S3 URL and credentials, making the statement cleaner and more secure.
COPY TO
statement with minimal parameters that specifies an AWS_ROLE_ARN
. Because TYPE
is omitted, the file or files will be written in CSV format, and because COMPRESSION
is omitted, they are compressed using GZIP (*.csv.gz
).
16B903C4206098FD
to the query at runtime. The compressed output is 40 MB, exceeding the default of 16 MB, so Firebolt writes 4 files as shown below.
MAX_FILE_SIZE
(5 GB), an error occurs. AWS access keys are specified for credentials.
FILE_NAME_PREFIX
parameter is specified, Firebolt adds a string to the query ID to form file names. The IAM role specified for CREDENTIALS
also has an external ID configured in AWS, so the ID is specified.
16B90E96716236D6
at runtime and writes files as shown below.
INCLUDE_QUERY_ID_IN_FILE_NAME
is set to FALSE
and a FILE_NAME_PREFIX
is specified. In this case, Firebolt writes files using only the specified FILE_NAME_PREFIX
as the file name. In addition, SINGLE_FILE
and OVERWRITE_EXISTING_FILES
are set to TRUE
so that the Amazon S3 URL always contains a single file with the latest query results.