Iceberg

Firebolt reads Apache Iceberg tables natively, and can export query results into a new Iceberg table so your data stays open to other query engines.

Quickstart

A LOCATION stores the connection details and credentials for an Iceberg catalog or table once, so you don’t repeat them in every query. Mounting a whole catalog as a database is the quickest way to make its tables queryable.

Attach a catalog as a database

Mount an external Iceberg catalog with CREATE ICEBERG DATABASE so every table in it is queryable by name, with no per-table setup. The database holds only a pointer to the LOCATION and an optional freshness setting, so tables added to the catalog become visible on the next query:

CREATE LOCATION my_catalog WITH
  SOURCE = ICEBERG
  CATALOG = REST
  CATALOG_OPTIONS = ( URL = 'https://catalog.example.com/v1' WAREHOUSE = 'analytics' )
  CREDENTIALS = (
    OAUTH_CLIENT_ID = '<client_id>'
    OAUTH_CLIENT_SECRET = '<client_secret>'
  );

CREATE ICEBERG DATABASE lake WITH
  LOCATION = 'my_catalog'
  MAX_STALENESS = '30 seconds';

-- Equivalent to: SELECT * FROM READ_ICEBERG(LOCATION => 'my_catalog', NAMESPACE => 'sales', TABLE => 'orders') LIMIT 10;
SELECT * FROM lake.sales.orders LIMIT 10;

Supported catalog types are FILE_BASED, REST, SNOWFLAKE_OPEN_CATALOG, DATABRICKS_UNITY, AWS_GLUE, and S3_TABLES (nightly feature). For details and limitations, see the CREATE ICEBERG DATABASE reference.

Register a single table

To expose one table instead of a whole catalog, point a LOCATION at that table and register it with CREATE ICEBERG TABLE. Firebolt infers the schema from the Iceberg metadata, and you can then query the table like any managed table:

CREATE LOCATION my_table_location WITH
  SOURCE = ICEBERG
  CATALOG = FILE_BASED
  CATALOG_OPTIONS = ( URL = 's3://my-bucket/path/to/iceberg/table' )
  CREDENTIALS = ( AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/IcebergAccess' AWS_ROLE_EXTERNAL_ID = 'my-external-id' );

CREATE ICEBERG TABLE lineitem LOCATION = 'my_table_location';

-- Equivalent to: SELECT * FROM READ_ICEBERG(LOCATION => 'my_table_location') LIMIT 10;
SELECT * FROM lineitem LIMIT 10;

For role-based AWS access you can additionally set an external ID. An external ID is a value you choose and control that AWS checks when Firebolt assumes your role, adding a second condition on top of your account’s unique IAM principal. Configuring one is a recommended best practice. See IAM roles.

Query a table ad hoc

For a one-off read with no setup, use the READ_ICEBERG table-valued function:

SELECT * FROM READ_ICEBERG(URL => 's3://my-bucket/path/to/iceberg/table') LIMIT 10;

To inspect an Iceberg table’s underlying data files and delete files without reading its rows, use the LIST_ICEBERG_FILES table-valued function:

SELECT * FROM LIST_ICEBERG_FILES(URL => 's3://my-bucket/path/to/iceberg/table') LIMIT 10;
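
When only the size of that file listing matters, an aggregate over the function's output is enough. The query below is a minimal sketch that assumes LIST_ICEBERG_FILES returns one row per file, as the description above suggests. It uses COUNT(*) to avoid relying on any particular output column names, and file_count is just an illustrative alias:

```sql
-- Count the files the table references, without reading its rows.
-- Assumption: the function returns one output row per data or delete file.
SELECT COUNT(*) AS file_count
FROM LIST_ICEBERG_FILES(URL => 's3://my-bucket/path/to/iceberg/table');
```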

Import / Export

You can also ingest Iceberg data into Firebolt-managed storage using any of the access methods above (a mounted database, a registered table, or READ_ICEBERG) or the COPY FROM statement. Managed tables are built for low-latency, high-concurrency real-time analytics and usually offer better price/performance, because they support additional index types and compaction; see Storage and indexing.

To go the other way, export query results from Firebolt into a new Iceberg table with CREATE ICEBERG TABLE AS SELECT. For the full set of Iceberg functions, including the partition transform functions used in PARTITION BY clauses, see the Iceberg functions reference.
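
The ingest direction can be sketched with a plain CREATE TABLE AS SELECT over the catalog mounted in the quickstart. This is one possible approach rather than the recommended one: orders_managed is a hypothetical table name, the example assumes lake.sales.orders exists as shown earlier, and it leaves index and partitioning choices at their defaults:

```sql
-- Copy an Iceberg table into Firebolt-managed storage.
-- orders_managed is a hypothetical name; lake.sales.orders comes from the quickstart.
CREATE TABLE orders_managed AS
SELECT * FROM lake.sales.orders;

-- Later queries read the managed copy instead of the Iceberg files.
SELECT * FROM orders_managed LIMIT 10;
```

The managed copy is a snapshot taken when the statement runs, so changes made to the Iceberg table afterward would need another load.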

Best practices

Register tables you query often. Use CREATE ICEBERG TABLE to register a table in Firebolt’s catalog so you can query it with regular SELECT statements, or CREATE ICEBERG DATABASE to mount an entire catalog and query its tables by name. Reserve READ_ICEBERG for ad hoc reads of tables you do not want to register.
Store credentials in a LOCATION object. A LOCATION centralizes credential management and avoids specifying individual credentials in each query.
Set MAX_STALENESS for tables that tolerate slightly stale reads. This caches catalog metadata and vended credentials and typically cuts query latency by tens to hundreds of milliseconds. See Configurable data freshness with MAX_STALENESS.

Supported features and limitations

At a glance:

Capability	Support
Catalogs	`FILE_BASED`, `REST`, `SNOWFLAKE_OPEN_CATALOG`, `DATABRICKS_UNITY`, `AWS_GLUE`, `S3_TABLES`, Snowflake Horizon Catalog (via `REST`)
Data files	Apache Parquet on Amazon S3
Spec versions	Iceberg v1 and v2
Writes	Export only, via `CREATE ICEBERG TABLE AS SELECT`; no DML
Positional deletes	Supported
Equality deletes	Supported, except on dropped columns and `REAL` or `DOUBLE PRECISION` columns
Deletion vectors (v3)	Not supported
Schema evolution	Supported, except type promotion and non-null `initial-default`
Partition evolution	Supported
Time travel	Not supported

A few details and exceptions apply on top of the table above:

See CREATE LOCATION (Iceberg) for the parameters and credentials each catalog type takes.
Cross-region reads from S3 are disabled by default because they can incur additional cost. Enable them per query with the cross_region_request_mode setting.
When a partitioned table contains equality delete files, all data and equality delete files must be written under the table’s current partition spec.
The data types variant, geometry, and geography are not supported.
Complex types (struct, list, map) nested inside another complex type are read as nullable even when Iceberg defines the field as non-nullable.
Returning partition values for identity transforms from partition metadata is not supported.

Performance

Choosing between Iceberg and managed tables is a performance trade-off. Firebolt accelerates Iceberg queries with caching, pruning, co-located joins, and writer tuning. For guidance on when to choose Iceberg over Firebolt-managed tables, and for the full tuning guidance, see the Iceberg performance guide.
