A table-valued function (TVF) that reads data from Parquet files stored in Amazon S3. The function can use either a location object (recommended) or direct credentials to access the data. READ_PARQUET returns a table with data from the specified Parquet file.

Syntax

-- Using location object (recommended)
READ_PARQUET (
  LOCATION => location_name
)
|
-- Using static credentials
READ_PARQUET (
  URL => <file_url>
  [, AWS_ACCESS_KEY_ID => <aws_access_key_id>]
  [, AWS_SECRET_ACCESS_KEY => <aws_secret_access_key>]
  [, AWS_SESSION_TOKEN => <aws_session_token>]
  [, AWS_ROLE_ARN => <aws_role_arn>]
  [, AWS_ROLE_EXTERNAL_ID => <aws_role_external_id>]
)

Parameters

ParameterDescriptionSupported input types
LOCATIONThe name of a location object that contains the Amazon S3 URL and credentials. Firebolt recommends using LOCATION to store credentials for authentication. For a comprehensive guide, see LOCATION objects. For syntax details, see CREATE LOCATION.IDENTIFIER
URLThe location of the Amazon S3 bucket containing your files. The expected format is s3://{bucket_name}/{full_file_path_glob_pattern}.TEXT
AWS_ACCESS_KEY_IDThe AWS access key ID.TEXT
AWS_SECRET_ACCESS_KEYThe AWS secret access key.TEXT
AWS_SESSION_TOKENThe AWS session token.TEXT
AWS_ROLE_ARNThe AWS role ARN.TEXT
AWS_ROLE_EXTERNAL_IDThe AWS role external ID.TEXT
  • When using static credentials:
    • The URL can be passed as either the first positional parameter or a named parameter
    • If you provide either AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY, you must provide both
    • Providing an AWS session token is optional
    • Credentials are not required for accessing public buckets

Return Type

The result is a table with data from the Parquet files. Columns are read and parsed using their inferred data types.

Best practice

Firebolt recommends using a LOCATION object to store credentials for authentication.

Examples

Example

The following code example reads the first 5 rows from a Parquet file using a LOCATION object to store credentials for authentication:

SELECT * 
FROM READ_PARQUET(
    LOCATION => my_location
) 
LIMIT 5;

Example

The following code example reads the first 5 rows from a Parquet file using static credentials for authentication:

SELECT * 
FROM READ_PARQUET(
    URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/playstats/TournamentID=92/cc2a2a0b4e8b4fb39abf20a956e7cc3e-0.parquet'
) 
LIMIT 5;

Returns

GameIDPlayerIDTimestampSelectedCarCurrentLevelCurrentSpeedCurrentPlayTimeCurrentScoreEventErrorCode
18452022-10-27 13:36:33Solara1000BrakeNoError
18452022-10-27 13:36:33Solara13390.98722RightTurnGraphicsFreeze
18452022-10-27 13:36:34Solara12881.974420TiltNoError
18452022-10-27 13:36:35Solara12602.961653BlockTextNotFound
18452022-10-27 13:36:36Solara11963.948881FullSpeedNoError

Using URL

  • The URL can be passed as either the first positional parameter or a named parameter. For example, the following two queries will both read the same file:
  • Credentials are optional.
SELECT * FROM READ_PARQUET('s3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/playstats/TournamentID=92/cc2a2a0b4e8b4fb39abf20a956e7cc3e-0.parquet');
SELECT * FROM READ_PARQUET(URL => 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/playstats/TournamentID=92/cc2a2a0b4e8b4fb39abf20a956e7cc3e-0.parquet');
  • The url can represent a single file or a glob pattern. If a glob pattern is used, all files matching the pattern will be read. A special column $source_file_name can be used to identify the source file of each row in the result set:
SELECT *, $source_file_name FROM READ_PARQUET('s3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/playstats/*.parquet')

When using glob patterns, the wildcard (*) can only be used at the end of the path. You can use it with any text before or after, such as *.parquet, date=2025*.parquet, or data_*.parquet.

The pattern will recursively match files in all subdirectories. For example:

SELECT * FROM READ_PARQUET('s3://firebolt-publishing-public/*.parquet')

will read all PARQUET files in the bucket, including those in subdirectories like help_center_assets/firebolt_sample_dataset/playstats/*.parquet.