Reference material for READ_AVRO function
READ_AVRO
returns a table with data from the specified Avro file and supports all Avro data types.
READ_AVRO
supports only binary encoded Avro files (typically with the .avro
extension). JSON-encoded Avro data is not supported.Parameter | Description | Supported input types |
---|---|---|
LOCATION | The name of a location object that contains the Amazon S3 URL and credentials. Firebolt recommends using LOCATION to store credentials for authentication. LOCATION must be specified as a string literal (e.g., LOCATION => 'my_location' ). Unlike URL , it cannot be used as a positional parameter. For a comprehensive guide, see LOCATION objects. | TEXT |
URL | The location of the Amazon S3 bucket containing your files. The expected format is s3://{bucket_name}/{full_file_path_glob_pattern} . | TEXT |
AWS_ACCESS_KEY_ID | The AWS access key ID. | TEXT |
AWS_SECRET_ACCESS_KEY | The AWS secret access key. | TEXT |
AWS_SESSION_TOKEN | The AWS session token. | TEXT |
AWS_ROLE_ARN | The AWS role ARN. | TEXT |
AWS_ROLE_EXTERNAL_ID | The AWS role external ID. | TEXT |
URL
can be passed as either the first positional parameter or a named parameterAWS_ACCESS_KEY_ID
or AWS_SECRET_ACCESS_KEY
, you must provide bothREAD_AVRO
supports all Avro data types with the following mappings:
Avro Type | Firebolt Type | Notes |
---|---|---|
null | TEXT (nullable) | Standalone null columns are supported and handled as nullable text columns that are always null. Within unions, null specifies nullability for the resulting type |
boolean | BOOLEAN | |
int | INT | |
long | BIGINT | |
float | REAL | |
double | DOUBLE | |
bytes | BYTEA | |
string | TEXT | |
record | STRUCT | Nested structure with named fields |
enum | TEXT | Enum values are converted to their string representation |
array | ARRAY | |
map | ARRAY<STRUCT<key TEXT, value TYPE>> | Maps are converted to arrays of key-value structs |
union | STRUCT or single type | Single-type unions and dual-type unions with null are inferred as single nullable types; multi-type unions become structs |
fixed | BYTEA | Fixed-length byte arrays |
key
: The map key (always TEXT
)value
: The map value (type depends on the Avro map value type)["string"]
): Inferred as the single type directly["null", "string"]
): Inferred as a single nullable type (TEXT
in this case)["string", "int", "record"]
): Converted to structs with nullable fields for each possible typestring
, int
, record
)_<i>
starting from _1
(e.g., record_1
, record_2
, etc.)record
, record_1
, record_2
.
Enums: Avro enums are converted to their string representation.
Null types: The null
type is supported both as a standalone column and within unions. Standalone null
columns are handled as nullable text columns that are always null. Within unions, null
specifies the nullability of the resulting type itself (e.g., ["null", "string"]
creates a nullable text field, and ["null", "string", "int"]
creates a nullable struct where the entire struct can be null).
LOCATION
object to store credentials for authentication.
When using READ_AVRO()
, the URL parameter in the location should contain only Avro files (see location table-valued functions).
id | name | score | active |
---|---|---|---|
1 | Alice | 95.5 | true |
2 | Bob | 87.2 | false |
3 | Charlie | 92.8 | true |
user_id | preferences |
---|---|
101 | [{"key": "theme", "value": "dark"}, {"key": "language", "value": "en"}] |
102 | [{"key": "theme", "value": "light"}, {"key": "notifications", "value": "enabled"}] |
id | contact_info |
---|---|
1 | {"string": "email@example.com", "record": null} |
2 | {"string": null, "record": {"phone": "555-1234", "address": "123 Main St"}} |
LOCATION
object to store credentials for authentication:
URL
can be passed as either the first positional parameter or a named parameter. For example, the following two queries will both read the same file:url
can represent a single file or a glob pattern. If a glob pattern is used, all files matching the pattern will be read. A special column $source_file_name
can be used to identify the source file of each row in the result set:*
) can only be used at the end of the path. You can use it with any text before or after, such as *.avro
, date=2025*.avro
, or data_*.avro
.
The pattern will recursively match files in all subdirectories. For example:
firebolt_sample_avro
directory and all of its subdirectories.