Skip to main content

Background: How window functions process partitions

When executing window functions, Firebolt must process all rows in a partition to compute results for that partition. This often requires accumulating all rows from the same partition in memory before producing any output. For large tables, this can lead to high memory consumption and slower query performance. Consider this example:
select
  *,
  row_number() over (partition by event_id order by event_ts) as row_num
from
  events
To compute ROW_NUMBER for any row, all rows with the same event_id must be accumulated in memory and sorted. When most partitions contain only a single row, this approach is inefficient.

How the hint works

The optimize_for_single_row_window_partitions hint automatically rewrites your query to optimize for high-cardinality partitions where most partitions contain only a single row. Currently, this hint optimizes only the ROW_NUMBER window function. When you use this hint, Firebolt automatically rewrites your query to separate single-row partitions from multi-row partitions. Instead of accumulating all rows in memory, Firebolt only processes multi-row partitions while quickly assigning a value of 1 to single-row partitions. This significantly reduces memory consumption and improves query performance. Here’s the equivalent manual rewrite that shows what Firebolt does automatically when you use the hint:
with non_unique_events as materialized (
  select
    event_id
  from
    events
  group by
    event_id
  having
    count(*) > 1
)
select
  *,
  row_number() over (partition by event_id order by event_ts) as row_num
from
  events
where
  event_id in (select event_id from non_unique_events)
union all
select
  *,
  1 as row_num
from
  events
where
  event_id not in (select event_id from non_unique_events)
For the rewritten query, Firebolt no longer needs to accumulate all events in memory. It only accumulates rows from partitions with more than one row, plus the event_id for those partitions.

Using the hint

Use the optimize_for_single_row_window_partitions hint by adding it as a query-level comment at the start of your SQL statement, as shown below.
/*!
  optimize_for_single_row_window_partitions
 */
select
  *,
  row_number() over (partition by event_id order by event_ts) as row_num
from
  events

Understanding the query plan

The plan with the hint shows a Materialize operator that corresponds to the MATERIALIZED CTE in the manually rewritten version of the query. This materialized CTE is used in both branches of the resulting UNION ALL query to identify which partitions have more than one row.