Guide to understanding and tuning Iceberg tables and external data scans
read_iceberg
to read a specific open table format,read_parquet
to read a specific file format, andread_iceberg
table-valued function.
Iceberg has surged in popularity in recent years because it provides ACID transactions on data lakes and allows interoperability between query engines.
The transactionality eliminates âdata swampsâ when a job fails halfway through, while the interoperability prevents vendor lock-in, enabling users to mix and match query engines according to their strengths.
While these properties alone make Iceberg an excellent choice for many workloads, choosing Iceberg over a plain collection of data files on object storage unlocks a host of further optimizations in Fireboltâs query engine.
Icebergâs table metadata provides detailed information about the table and the files containing its data that Firebolt can leverage to optimize queries, which is not available when querying plain collections of data files.
COUNT(*)
query on an Iceberg table is answered purely using the metadata.
No actual data files are scanned in this case.MAX_STALENESS
) has not changed, Firebolt can reuse this cached subresult.
This is incredibly effective for dashboarding workloads or iterative query refinement, where queries often share common substructures.MAX_STALENESS
is used, many queries do not require any access to the Iceberg catalog or object storage.
(Sub-)result caches apply exactly the same way as when querying managed tables.
However, the backing file format of Iceberg tables, Apache Parquet, has design decisions that make it less suitable for ultra-low-latency queries than Fireboltâs managed table format, which has been extensively tuned to provide the lowest latency possible.
Parquet uses complex encodings and compression schemes to achieve the smallest file sizes, with many writers tuned to produce files that are ideal for batch processing.
Large row group sizes, non-aligned page boundaries, and heavy page compression make such Parquet files slow to read.
To achieve low-latency reads on Iceberg tables, a heavily tuned writer is essential.
Data should ideally be distributed in a way that mimics Fireboltâs primary indexes, sorted by frequently filtered columns.
Use Iceberg tables when:
read_parquet
).
Iceberg table metadata has multiple layers, requiring a number of hops to object storage: the catalog provides the URL to the metadata file, which points to a number of manifest lists, each of which points to one or more manifest files, which finally point to the data files.
As object storage latencies are often above 50ms, uncached Iceberg reads typically have at least 200ms of pure metadata overhead.
All of this overhead can be eliminated with caching and MAX_STALENESS
.
The complete metadata hierarchy is automatically cached on local disk (SSD) and in memory, and data files are cached on local disk (SSD).
This means metadata and frequently accessed data from Iceberg tables can be served from much faster local tiers, just like with Firebolt-managed storage.
For security reasons, cached data is tied to the credentials used for access: when using different credentials to access the same table, cached data cannot be used as the validity of the credentials is unknown.
Cached data is not tied to a particular snapshot, and caches typically become only partially invalidated after a new transaction is committed: a transaction that inserts additional rows will result in a new metadata file pointing to a new manifest list, which references a new manifest file pointing to the exact same data files along with one or more new data files.
Cache entries for all of the old files remain valid, and only the new files have to be fetched for the next query, while the rest can be served from cache.
MAX_STALENESS
MAX_STALENESS
parameter of read_iceberg
lets you define this tolerance (for example, INTERVAL '30 seconds'
).
When specified, Firebolt can use a cached metadata file and vended credentials if they are within the allowed staleness, avoiding costly round trips to the Iceberg catalog.
The default is 0 seconds, ensuring queries always see the latest snapshot if no staleness is specified.
Setting MAX_STALENESS
will typically reduce query latency by tens to hundreds of milliseconds and is highly recommended for scans that are allowed to read slightly stale data.
Note that stale reads still maintains the transactionality of Iceberg tablesâthey might simply read a slightly older snapshot if the table was modified during the MAX_STALENESS
interval.
enable_iceberg_partitioned_scan
setting to enable these optimizations when the number of partitions is high enough and their size is sufficiently balanced to make this worthwhile.