> ## Documentation Index
> Fetch the complete documentation index at: https://docs.firebolt.io/llms.txt
> Use this file to discover all available pages before exploring further.

> Avoid expensive window operations for partitions with a single row

# Optimize for single row window partitions

## Background: How window functions process partitions

When executing window functions, Firebolt must process all rows in a partition to compute results for that partition.
This often requires accumulating all rows from the same partition in memory before producing any output.
For large tables, this can lead to high memory consumption and slower query performance.

Consider this example:

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
select
  *,
  row_number() over (partition by event_id order by event_ts) as row_num
from
  events
```

To compute `ROW_NUMBER` for any row, all rows with the same `event_id` must be accumulated in memory and sorted.
When most partitions contain only a single row, this approach is inefficient.

## How the hint works

The `optimize_for_single_row_window_partitions` hint automatically rewrites your query to optimize for high-cardinality partitions where most partitions contain only a single row.
Currently, this hint optimizes only the `ROW_NUMBER` window function.

When you use this hint, Firebolt automatically rewrites your query to separate single-row partitions from multi-row partitions.
Instead of accumulating all rows in memory, Firebolt only processes multi-row partitions while quickly assigning a value of `1` to single-row partitions.
This significantly reduces memory consumption and improves query performance.

Here's the equivalent manual rewrite that shows what Firebolt does automatically when you use the hint:

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
with non_unique_events as materialized (
  select
    event_id
  from
    events
  group by
    event_id
  having
    count(*) > 1
)
select
  *,
  row_number() over (partition by event_id order by event_ts) as row_num
from
  events
where
  event_id in (select event_id from non_unique_events)
union all
select
  *,
  1 as row_num
from
  events
where
  event_id not in (select event_id from non_unique_events)
```

For the rewritten query, Firebolt no longer needs to accumulate all events in memory.
It only accumulates rows from partitions with more than one row, plus the `event_id` for those partitions.

## Using the hint

Use the `optimize_for_single_row_window_partitions` hint by adding it as a query-level comment at the start of your SQL statement, as shown below.

<CodeGroup>
  ```sql SQL query with optimize_for_single_row_window_partitions hint {2} theme={"theme":{"light":"github-light","dark":"github-dark"}}
  /*!
    optimize_for_single_row_window_partitions
   */
  select
    *,
    row_number() over (partition by event_id order by event_ts) as row_num
  from
    events
  ```

  ```text Plan without optimize_for_single_row_window_partitions hint theme={"theme":{"light":"github-light","dark":"github-dark"}}
  [0] [Projection] events.event_id, events.event_ts, events.data, row_num
   \_[1] [Window] row_num: row_number() OVER (PARTITION BY events.event_id ORDER BY events.event_ts Ascending Last)
     |   [Types]: row_num: bigint not null
      \_[2] [StoredTable] Name: "events"
            [Types]: events.event_id: bigint null, events.event_ts: timestamp null, events.data: text null
  ```

  ```text Plan with optimize_for_single_row_window_partitions hint {8, 19} theme={"theme":{"light":"github-light","dark":"github-dark"}}
  [0] [Projection] events.event_id, events.event_ts, events.data, row_num
   \_[1] [Union]
      \_[2] [Window] row_num: row_number() OVER (PARTITION BY events.event_id ORDER BY events.event_ts Ascending Last)
      | |   [Types]: row_num: bigint not null
      |  \_[3] [Join] Mode: Semi [(events.event_id = events.event_id)]
      |     \_[4] [StoredTable] Name: "events"
      |     |     [Types]: events.event_id: bigint null, events.event_ts: timestamp null, events.data: text null
      |     \_[5] [Materialize]
      |        \_[6] [Projection] events.event_id
      |           \_[7] [Filter] (count_0 > 1)
      |              \_[8] [Aggregate] GroupBy: [events.event_id] Aggregates: [count_0: count(*)]
      |                |   [Types]: count_0: bigint not null
      |                 \_[9] [StoredTable] Name: "events"
      |                       [Types]: events.event_id: bigint null
      \_[10] [Projection] events.event_id, events.event_ts, events.data, c_0: 1
        |   [Types]: c_0: bigint not null
         \_[11] [Join] Mode: AntiNullAsFalse [(events.event_id = events.event_id)]
            \_Recurring Node --> [4]
            \_Recurring Node --> [5]
  ```
</CodeGroup>

## Understanding the query plan

The plan with the hint shows a `Materialize` operator that corresponds to the `MATERIALIZED` CTE in the manually rewritten version of the query.
This materialized CTE is used in both branches of the resulting `UNION ALL` query to identify which partitions have more than one row.
