A table-valued function (TVF) that reads CSV files from Amazon S3. The function can use either a location object (recommended) or direct credentials to access the data. READ_CSV returns a table with data from the specified CSV file, where each cell is read as TEXT.

Syntax

-- Using LOCATION object (recommended)
READ_CSV ( 
  LOCATION => location_name
  [, COMPRESSION => <file_compression>]
  [, HEADER => { TRUE | FALSE }]
  [, DELIMITER => <field_delimiter>]
  [, QUOTE => { "'" | '"' | SINGLE_QUOTE | DOUBLE_QUOTE}]
  [, NULL_STRING => <null_string>]
  [, ESCAPE => <escape_character>]
  [, SKIP_BLANK_LINES => { TRUE | FALSE }]
  [, EMPTY_FIELD_AS_NULL => { TRUE | FALSE }]
  [, INFER_SCHEMA => { TRUE | FALSE }]
)
|
-- Using static credentials
READ_CSV ( 
  URL => <url>
  [, COMPRESSION => <file_compression>]
  [, AWS_ACCESS_KEY_ID => <aws_access_key_id>]
  [, AWS_SECRET_ACCESS_KEY => <aws_secret_access_key>]
  [, AWS_SESSION_TOKEN => <aws_session_token>]
  [, AWS_ROLE_ARN => <aws_role_arn>]
  [, AWS_ROLE_EXTERNAL_ID => <aws_role_external_id>]
  [, HEADER => { TRUE | FALSE }]
  [, DELIMITER => <field_delimiter>]
  [, QUOTE => { "'" | '"' | SINGLE_QUOTE | DOUBLE_QUOTE}]
  [, NULL_STRING => <null_string>]
  [, ESCAPE => <escape_character>]
  [, SKIP_BLANK_LINES => { TRUE | FALSE }]
  [, EMPTY_FIELD_AS_NULL => { TRUE | FALSE }]
  [, INFER_SCHEMA => { TRUE | FALSE }]
)

Parameters

ParameterDescriptionSupported input types
LOCATIONThe name of a location object that contains the Amazon S3 URL and credentials. Firebolt recommends using LOCATION to store credentials for authentication. For a comprehensive guide, see LOCATION objects. For syntax details, see CREATE LOCATION.IDENTIFIER
URLThe location containing your files in an Amazon S3 bucket. The expected format is s3://{bucket_name}/{full_file_path_glob_pattern}.TEXT
COMPRESSIONThe compression type of the input file. If compression is not set, compression is inferred from the file extension.TEXT
AWS_ACCESS_KEY_IDThe AWS access key ID.TEXT
AWS_SECRET_ACCESS_KEYThe AWS secret access key.TEXT
AWS_SESSION_TOKENThe AWS session token.TEXT
AWS_ROLE_ARNThe AWS role arn.TEXT
AWS_ROLE_EXTERNAL_IDThe AWS role external ID.TEXT
HEADERSet to TRUE if the first row of the CSV file contains a header row containing the column names.BOOL
DELIMITERSpecify the character used to separate fields. The default delimiter is a comma (,).TEXT
QUOTESpecify the character used for quoting fields. The default is double quote ("). If a single quote is specified, the quote character will be set to ('). Accepts only DOUBLE_QUOTE, SINGLE_QUOTE, ', or ".TEXT
NULL_STRINGSpecify the string used to represent NULL values. The default is an empty string, which means that empty strings are interpreted as NULL values.TEXT
ESCAPESpecify the character used to escape special characters. The default character is the quote (') character.TEXT
SKIP_BLANK_LINESSet to TRUE to ignore blank lines in the file.BOOL
EMPTY_FIELD_AS_NULLSpecify whether empty fields should be interpreted as NULL values. The default is TRUE. If set to FALSE, empty fields are interpreted as empty strings.BOOL
INFER_SCHEMASpecify whether column data types should be infered from the data, instead of using TEXT.BOOL

Return Type

The result is a table with the data from the CSV file. Each cell is read as a TEXT.

Examples

Using LOCATION object

Best practice

Firebolt recommends using a LOCATION object to store credentials for authentication.

The following code example reads a CSV file from the location specified by my_location, treating the first row as a header containing column names:

SELECT * FROM READ_CSV(
    LOCATION => my_location,
    HEADER => true
);

Using static credentials

SELECT * FROM READ_CSV(
    URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv',
    HEADER => true
);

The url can represent a single file or a glob pattern. If a glob pattern is used, all files matching the pattern will be read. A special column $source_file_name can be used to identify the source file of each row in the result set:

SELECT *, $source_file_name FROM READ_CSV(
    URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/*.csv',
    HEADER => TRUE
);

When using glob patterns, the wildcard (*) can only be used at the end of the path. You can use it with any text before or after, such as *.csv, date=2025*.csv, or data_*.csv.

The pattern will recursively match files in all subdirectories. For example:

SELECT * FROM READ_CSV('s3://firebolt-publishing-public/*.csv')

will read all CSV files in the bucket, including those in subdirectories like help_center_assets/firebolt_sample_dataset/*.csv.

Example

In the following example, the URL is set as the first positional parameter and reads a CSV file:

SELECT * FROM READ_CSV('s3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv');

Returns:

f0f1f2f3f4f5f6f7f8
LevelIDGameIDLevelNameLevelTypeNextLevelMinPointsToPassMaxPointsNumberOfLaps
111Thunderbolt CircuitFastestLap25205
212Velocity ValeFirstToComplete3153010
313Raceway RidgeFastestLap4254020
414Nitro NarrowsFirstToComplete56010010
515Thunder RoadFirstToComplete68015015
616Burnout BoulevardDrift750808
717Speed StreetFastestLap840707
818Racing RavineFastestLap96010020
919Drift DistrictDrift1010025025
10110Acceleration AlleyFirstToCompletenull20050050

Example

The following example accepts URL as a named parameter and reads a CSV file with column names in the first row:

SELECT * FROM READ_CSV(URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv', 
        HEADER => true);

Returns

LevelIDGameIDLevelNameLevelTypeNextLevelMinPointsToPassMaxPointsNumberOfLaps
11Thunderbolt CircuitFastestLap2520520
21Velocity ValeFirstToComplete315301010
31Raceway RidgeFastestLap425402020
41Nitro NarrowsFirstToComplete5601001010
51Thunder RoadFirstToComplete6801501515
61Burnout BoulevardDrift7508088
71Speed StreetFastestLap8407077
81Racing RavineFastestLap9601002020
91Drift DistrictDrift101002502525
101Acceleration AlleyFirstToCompletenull2005005050

Example

The following example reads a CSV with headers and reads empty values as empty strings, rather than NULL values:

SELECT * FROM READ_CSV(URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv',
        HEADER => true, EMPTY_FIELD_AS_NULL => false);

Returns:

LevelIDGameIDLevelNameLevelType
11Thunderbolt CircuitFastestLap2
21Velocity ValeFirstToComplete3
31Raceway RidgeFastestLap4
41Nitro NarrowsFirstToComplete5
51Thunder RoadFirstToComplete6
61Burnout BoulevardDrift7
71Speed StreetFastestLap8
81Racing RavineFastestLap9
91Drift DistrictDrift10
101Acceleration AlleyFirstToComplete

Example

The following example accepts URL as a named parameter, reads a CSV file with column names in the first row, and infers types for all columns. In this example it allows filtering using numeric comparisons, since the MaxPoints and MinPointsToPass columns are properly typed as numbers rather than strings:

SELECT * FROM READ_CSV(URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/levels.csv', 
        HEADER => true, INFER_SCHEMA => true) WHERE "MaxPoints" > 2 * "MinPointsToPass";

Returns

LevelIDGameIDLevelNameLevelTypeNextLevelMinPointsToPassMaxPointsNumberOfLaps
11Thunderbolt CircuitFastestLap2520520
91Drift DistrictDrift101002502525
101Acceleration AlleyFirstToCompletenull2005005050