READ_CSV
A table-valued function (TVF) that reads CSV files from Amazon S3. The function can use either a location object (recommended) or direct credentials to access the data. READ_CSV
returns a table with data from the specified CSV file, where each cell is read as TEXT
.
Syntax
-- Using LOCATION object (recommended)
READ_CSV (
LOCATION => location_name
[, COMPRESSION => <file_compression>]
[, HEADER => { TRUE | FALSE }]
[, DELIMITER => <field_delimiter>]
[, QUOTE => { "'" | '"' | SINGLE_QUOTE | DOUBLE_QUOTE}]
[, NULL_STRING => <null_string>]
[, ESCAPE => <escape_character>]
[, SKIP_BLANK_LINES => { TRUE | FALSE }]
[, EMPTY_FIELD_AS_NULL => { TRUE | FALSE }]
)
|
-- Using static credentials
READ_CSV (
URL => <url>
[, COMPRESSION => <file_compression>]
[, AWS_ACCESS_KEY_ID => <aws_access_key_id>]
[, AWS_SECRET_ACCESS_KEY => <aws_secret_access_key>]
[, AWS_SESSION_TOKEN => <aws_session_token>]
[, AWS_ROLE_ARN => <aws_role_arn>]
[, AWS_ROLE_EXTERNAL_ID => <aws_role_external_id>]
[, HEADER => { TRUE | FALSE }]
[, DELIMITER => <field_delimiter>]
[, QUOTE => { "'" | '"' | SINGLE_QUOTE | DOUBLE_QUOTE}]
[, NULL_STRING => <null_string>]
[, ESCAPE => <escape_character>]
[, SKIP_BLANK_LINES => { TRUE | FALSE }]
[, EMPTY_FIELD_AS_NULL => { TRUE | FALSE }]
)
Parameters
Parameter | Description | Supported input types |
---|---|---|
LOCATION | The name of a location object that contains the Amazon S3 URL and credentials. Firebolt recommends using LOCATION to store credentials for authentication. See CREATE LOCATION for details. | IDENTIFIER |
URL | The location containing your files in an Amazon S3 bucket. The expected format is s3://{bucket_name}/{full_file_path_glob_pattern} . | TEXT |
COMPRESSION | The compression type of the input file. If compression is not set, compression is inferred from the file extension. | TEXT |
AWS_ACCESS_KEY_ID | The AWS access key ID. | TEXT |
AWS_SECRET_ACCESS_KEY | The AWS secret access key. | TEXT |
AWS_SESSION_TOKEN | The AWS session token. | TEXT |
AWS_ROLE_ARN | The AWS role arn. | TEXT |
AWS_ROLE_EXTERNAL_ID | The AWS role external ID. | TEXT |
HEADER | Set to TRUE if the first row of the CSV file contains a header row containing the column names. | BOOL |
DELIMITER | Specify the character used to separate fields. The default delimiter is a comma (, ). | TEXT |
QUOTE | Specify the character used for quoting fields. The default is double quote (" ). If a single quote is specified, the quote character will be set to (' ). Accepts only DOUBLE_QUOTE , SINGLE_QUOTE , ' , or " . | TEXT |
NULL_STRING | Specify the string used to represent NULL values. The default is an empty string, which means that empty strings are interpreted as NULL values. | TEXT |
ESCAPE | Specify the character used to escape special characters. The default character is the quote (' ) character. | TEXT |
SKIP_BLANK_LINES | Set to TRUE to ignore blank lines in the file. | BOOL |
EMPTY_FIELD_AS_NULL | Specify whether empty fields should be interpreted as NULL values. The default is TRUE . If set to FALSE , empty fields are interpreted as empty strings. | BOOL |
Return Type
The result is a table with the data from the CSV file. Each cell is read as a TEXT
.
Examples
Using LOCATION object
Best practice
Firebolt recommends using a LOCATION
object to store credentials for authentication.
The following code example reads a CSV file from the location specified by my_location
, treating the first row as a header containing column names:
SELECT * FROM READ_CSV(
LOCATION => my_location,
HEADER => true
);
Using static credentials
SELECT * FROM READ_CSV(
URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv',
HEADER => true
);
Example
In the following example, the URL
is set as the first positional parameter and reads a CSV file:
SELECT * FROM READ_CSV('s3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv');
Returns:
f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | … |
---|---|---|---|---|---|---|---|---|---|
LevelID | GameID | Level | Name | LevelType | NextLevel | MinPointsToPass | MaxPoints | NumberOfLaps | … |
1 | 1 | 1 | Thunderbolt Circuit | FastestLap | 2 | 5 | 20 | 5 | … |
2 | 1 | 2 | Velocity Vale | FirstToComplete | 3 | 15 | 30 | 10 | … |
3 | 1 | 3 | Raceway Ridge | FastestLap | 4 | 25 | 40 | 20 | … |
4 | 1 | 4 | Nitro Narrows | FirstToComplete | 5 | 60 | 100 | 10 | … |
5 | 1 | 5 | Thunder Road | FirstToComplete | 6 | 80 | 150 | 15 | … |
6 | 1 | 6 | Burnout Boulevard | Drift | 7 | 50 | 80 | 8 | … |
7 | 1 | 7 | Speed Street | FastestLap | 8 | 40 | 70 | 7 | … |
8 | 1 | 8 | Racing Ravine | FastestLap | 9 | 60 | 100 | 20 | … |
9 | 1 | 9 | Drift District | Drift | 10 | 100 | 250 | 25 | … |
10 | 1 | 10 | Acceleration Alley | FirstToComplete | null | 200 | 500 | 50 | … |
Example
The following example accepts URL
as a named parameter and reads a CSV file with column names in the first row:
SELECT * FROM READ_CSV(URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv',
HEADER => true);
Returns
LevelID | GameID | Level | Name | LevelType | NextLevel | MinPointsToPass | MaxPoints | NumberOfLaps | … |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | Thunderbolt Circuit | FastestLap | 2 | 5 | 20 | 5 | 20 | … |
2 | 1 | Velocity Vale | FirstToComplete | 3 | 15 | 30 | 10 | 10 | … |
3 | 1 | Raceway Ridge | FastestLap | 4 | 25 | 40 | 20 | 20 | … |
4 | 1 | Nitro Narrows | FirstToComplete | 5 | 60 | 100 | 10 | 10 | … |
5 | 1 | Thunder Road | FirstToComplete | 6 | 80 | 150 | 15 | 15 | … |
6 | 1 | Burnout Boulevard | Drift | 7 | 50 | 80 | 8 | 8 | … |
7 | 1 | Speed Street | FastestLap | 8 | 40 | 70 | 7 | 7 | … |
8 | 1 | Racing Ravine | FastestLap | 9 | 60 | 100 | 20 | 20 | … |
9 | 1 | Drift District | Drift | 10 | 100 | 250 | 25 | 25 | … |
10 | 1 | Acceleration Alley | FirstToComplete | null | 200 | 500 | 50 | 50 | … |
Example
The following example reads a CSV with headers and reads empty values as empty strings, rather than NULL
values:
SELECT * FROM READ_CSV(URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv',
HEADER => true, EMPTY_FIELD_AS_NULL => false);
Returns:
LevelID | GameID | Level | Name | LevelType | … |
---|---|---|---|---|---|
1 | 1 | Thunderbolt Circuit | FastestLap | 2 | … |
2 | 1 | Velocity Vale | FirstToComplete | 3 | … |
3 | 1 | Raceway Ridge | FastestLap | 4 | … |
4 | 1 | Nitro Narrows | FirstToComplete | 5 | … |
5 | 1 | Thunder Road | FirstToComplete | 6 | … |
6 | 1 | Burnout Boulevard | Drift | 7 | … |
7 | 1 | Speed Street | FastestLap | 8 | … |
8 | 1 | Racing Ravine | FastestLap | 9 | … |
9 | 1 | Drift District | Drift | 10 | … |
10 | 1 | Acceleration Alley | FirstToComplete | … |