What late materialization does
Let’s look at the following example query:a
and x
are used to compute the qualifying rows.
Without late materialization we have to read all columns from the table to output them in the final result.
In contrast, with late materialization, an engine only reads the columns a
and x
to find the 10 rows that should be in the result.
Then it only reads the required rows from the other columns.
If there are large columns in the table, like long strings or large arrays, reading such unnecessary data can be very expensive.
Say you have a column description
that contains a string for every row that is 500 characters long on average.
If you have a table with 10 million rows, you will read 5GB of data just for the description
column.
Almost all of this data will be disregarded when evaluating ORDER BY a LIMIT 10
without using it for anything.
We only need the description
column for the 10 rows in the result.
These 10 rows from the description
column amount to only 5 KB in size.
Thanks to late materialization, we can avoid reading almost all 5GB of data and only the data we actually need for the result.
Note that Firebolt’s scan does something very similar when possible.
If you have filter predicates like x = 42
they will be pushed down to the scan.
The scan will only load this column first and prune the data scanned from the other columns.
Queries with very selective filter predicates will already benefit a lot from pruning.
But if a sort (limit) operator filters the rows, late materialization can be very beneficial.
When Firebolt’s optimizer applies late materialization
Firebolt’s optimizer applies late materialization using a straightforward rule:- Find a sort (limit) operator in the query plan that has a limit of at most 10 rows.
- Find all delayable columns that are not used to compute which rows need to be in the result.
- Group the delayable columns by the table they originate from.
- Only delay columns from tables where there is at least one delayable column that likely needs much space (
TEXT
,ARRAY
, orBYTEA
).
Scan
, Projection
, Filter
, Sort
, and Join
.
If there is an operator of a different type between the sort operator and the table scan, late materialization will not be applied.
You can see the node types in the query plan using the EXPLAIN
command (see EXPLAIN for more details).
When to use late materialization
Late materialization is most useful when it allows you to avoid reading a lot of data. The more data you can skip, the bigger the speedup can be. There are two important factors here:- Size of delayed columns: If you have a top-k query that outputs columns with very large entries, late materialization can significantly reduce the amount of scanned data.
- Difference of row counts: If the difference in rows between the output of the scan and the limit of your query is large, late materialization can skip a lot of data. If you have a limit of 10 rows, but the scan outputs 100 million rows, late materialization can skip a lot of data.
Example: Query that benefits from late materialization
late_materialization_max_rows=0
(see how you can control late materialization below).
Example: Queries that do not benefit from late materialization
How you can observe whether late materialization was applied
Late materialization in Firebolt is implemented as a join between the original subquery that ends with a sort operator and a new pruned scan. This means you can see whether late materialization was applied to your query by looking at theEXPLAIN
output.
id
, x
, description
, country_code
.
You can also see that it joins both scans on the columns $tablet_id
and $tablet_row_number
.
These columns are used to uniquely identify a row in the table.
How you can control late materialization
If you want to enable late materialization for a query with a limit larger than 10, you can use theWITH
clause to set the late_materialization_max_rows
setting.
Note that very high values above 10000 might cause performance degradation.
late_materialization_max_rows
setting to 0.