Firebolt reads Apache Iceberg tables natively, and can export query results into a new Iceberg table so your data stays open to other query engines.

Quickstart

A LOCATION stores the connection details and credentials for an Iceberg catalog or table once, so you don’t repeat them in every query. Mounting a whole catalog as a database is the quickest way to make its tables queryable.

Attach a catalog as a database

Mount an external Iceberg catalog with CREATE ICEBERG DATABASE so every table in it is queryable by name, with no per-table setup. The database holds only a pointer to the LOCATION and an optional freshness setting, so tables added to the catalog become visible on the next query:
CREATE LOCATION my_catalog WITH
  SOURCE = ICEBERG
  CATALOG = REST
  CATALOG_OPTIONS = ( URL = 'https://catalog.example.com/v1' WAREHOUSE = 'analytics' )
  CREDENTIALS = (
    OAUTH_CLIENT_ID = '<client_id>'
    OAUTH_CLIENT_SECRET = '<client_secret>'
  );

CREATE ICEBERG DATABASE lake WITH
  LOCATION = 'my_catalog'
  MAX_STALENESS = '30 seconds';

-- Equivalent to: SELECT * FROM READ_ICEBERG(LOCATION => 'my_catalog', NAMESPACE => 'sales', TABLE => 'orders') LIMIT 10;
SELECT * FROM lake.sales.orders LIMIT 10;
Supported catalog types are FILE_BASED, REST, AWS_GLUE, SNOWFLAKE_OPEN_CATALOG, and DATABRICKS_UNITY. For details and limitations, see the CREATE ICEBERG DATABASE reference.
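Because the mounted database exposes every catalog table by name, tables from the same catalog can be joined directly. A minimal sketch, assuming the sales namespace also contains a customers table and that the columns customer_id, name, and total exist (all hypothetical, not taken from the catalog above):

```sql
-- Hypothetical: lake.sales.customers and all column names are assumptions for illustration.
SELECT c.name, SUM(o.total) AS revenue
FROM lake.sales.orders AS o
JOIN lake.sales.customers AS c
  ON c.customer_id = o.customer_id
GROUP BY c.name
ORDER BY revenue DESC
LIMIT 10;
```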

Register a single table

To expose one table instead of a whole catalog, point a LOCATION at that table and register it with CREATE ICEBERG TABLE. Firebolt infers the schema from the Iceberg metadata, and you can then query the table like any managed table:
CREATE LOCATION my_table_location WITH
  SOURCE = ICEBERG
  CATALOG = FILE_BASED
  CATALOG_OPTIONS = ( URL = 's3://my-bucket/path/to/iceberg/table' )
  CREDENTIALS = ( AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/IcebergAccess' );

CREATE ICEBERG TABLE lineitem LOCATION = 'my_table_location';

-- Equivalent to: SELECT * FROM READ_ICEBERG(LOCATION => 'my_table_location') LIMIT 10;
SELECT * FROM lineitem LIMIT 10;
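Once registered, the table accepts ordinary filters and aggregates. A sketch, assuming the table follows the TPC-H lineitem layout; the column names l_returnflag, l_quantity, and l_shipdate are assumptions, since the schema is inferred from the Iceberg metadata and is not listed here:

```sql
-- Assumed TPC-H style columns; adjust to the schema Firebolt infers for your table.
SELECT l_returnflag,
       COUNT(*)        AS line_count,
       SUM(l_quantity) AS total_quantity
FROM lineitem
WHERE l_shipdate >= DATE '1998-01-01'
GROUP BY l_returnflag
ORDER BY l_returnflag;
```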

Query a table ad hoc

For a one-off read with no setup, use the READ_ICEBERG table-valued function:
SELECT * FROM READ_ICEBERG(URL => 's3://my-bucket/path/to/iceberg/table') LIMIT 10;
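READ_ICEBERG also accepts a LOCATION, as the equivalent forms in the comments of the earlier examples show. A sketch that reuses the my_catalog LOCATION from the catalog example, so no credentials appear in the query text:

```sql
-- Reads sales.orders through the stored LOCATION rather than a raw URL.
SELECT COUNT(*) AS order_count
FROM READ_ICEBERG(LOCATION => 'my_catalog', NAMESPACE => 'sales', TABLE => 'orders');
```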

Import / Export

You can also ingest Iceberg data into Firebolt-managed storage, either by selecting from any of the read methods above or with the COPY FROM statement. Managed tables are built for low-latency, high-concurrency real-time analytics and usually offer better price/performance because they support additional index types and compaction; see Storage and indexing. To go the other way, export query results from Firebolt into a new Iceberg table with CREATE ICEBERG TABLE AS SELECT. For the full set of Iceberg functions, including the partition transform functions used in PARTITION BY clauses, see the Iceberg functions reference.
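A minimal sketch of the ingest direction, assuming the lineitem Iceberg table registered earlier and a hypothetical target name lineitem_managed. It uses a plain CREATE TABLE AS SELECT and leaves out any index or partitioning choices for the managed table:

```sql
-- Copies the Iceberg table's current contents into a Firebolt-managed table.
-- lineitem_managed is a hypothetical name chosen for this example.
CREATE TABLE lineitem_managed AS
SELECT * FROM lineitem;

-- Later queries read the managed copy instead of the Iceberg files.
SELECT COUNT(*) FROM lineitem_managed;
```

The copy is a snapshot: later changes to the Iceberg table are not reflected until you load the data again.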

Best practices

  • Register tables you query often. Use CREATE ICEBERG TABLE to register a table in Firebolt’s catalog so you can query it with regular SELECT statements, or CREATE ICEBERG DATABASE to mount an entire catalog and query its tables by name. Reserve READ_ICEBERG for ad hoc reads of tables you do not want to register.
  • Store credentials in a LOCATION object. A LOCATION centralizes credential management and avoids specifying individual credentials in each query.
  • Set MAX_STALENESS for tables that tolerate slightly stale reads. This caches catalog metadata and vended credentials and typically cuts query latency by tens to hundreds of milliseconds. See Configurable data freshness with MAX_STALENESS.

Supported features and limitations

At a glance:
| Capability | Support |
| --- | --- |
| Catalogs | FILE_BASED, REST, AWS_GLUE, SNOWFLAKE_OPEN_CATALOG, DATABRICKS_UNITY |
| Data files | Apache Parquet on Amazon S3 |
| Spec versions | Iceberg v1 and v2 |
| Writes | Export only, via CREATE ICEBERG TABLE AS SELECT; no DML |
| Positional deletes | Supported |
| Equality deletes | Supported, except on dropped columns and REAL or DOUBLE PRECISION columns |
| Deletion vectors (v3) | Not supported |
| Schema evolution | Supported, except type promotion and non-null initial-default |
| Partition evolution | Supported |
| Time travel | Not supported |
A few details and exceptions apply on top of the table above:
  • See CREATE LOCATION (Iceberg) for the parameters and credentials each catalog type takes.
  • Cross-region reads from S3 are disabled by default because they can incur additional cost. Enable them per query with the cross_region_request_mode setting.
  • When a partitioned table contains equality delete files, all data and equality delete files must be written under the table’s current partition spec.
  • The data types variant, geometry, and geography are not supported.
  • Complex types (struct, list, map) nested inside another complex type are read as nullable even when Iceberg defines the field as non-nullable.
  • Returning partition values for identity transforms from partition metadata is not supported.
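As an illustration of the cross-region item above, the setting can be applied with a SET statement before the query. This is a sketch only: the value 'auto' is an assumption, so check the system settings reference for the accepted values before relying on it.

```sql
-- Assumption: 'auto' is an accepted value for cross_region_request_mode.
SET cross_region_request_mode = 'auto';

-- The bucket is the placeholder from the ad hoc example and may sit in another region.
SELECT * FROM READ_ICEBERG(URL => 's3://my-bucket/path/to/iceberg/table') LIMIT 10;
```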

Performance

Choosing between Iceberg and Firebolt-managed tables is a performance trade-off. Firebolt accelerates Iceberg queries with caching, pruning, co-located joins, and writer tuning. For guidance on when to choose Iceberg over Firebolt-managed tables, and for the full tuning recommendations, see the Iceberg performance guide.