SQL Pipe syntax

Firebolt supports SQL Pipe syntax, an alternative SQL syntax that uses the |> operator to chain query transformations step by step. SQL Pipe syntax supports the same functionality as standard SQL, but can improve readability by allowing queries to flow in a linear, top-to-bottom structure, which makes it easier to express, compose and understand queries. This syntax was first presented by Google in the research paper SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL.

SQL Pipe syntax has the following structure:

  • Each pipe operator consists of the pipe symbol, |>, an operator name, and arguments: |> operator_name arguments.
  • You can add pipe operators to the end of any valid query.
  • You can apply pipe operators in any order, and apply them as many times as needed.
  • Pipe syntax can be used anywhere that standard syntax is used including in queries, subqueries, and views.
  • You can combine pipe syntax with standard SQL syntax in the same query. For example, a parent query can use standard SQL while the subquery uses pipe syntax, or the other way around.

For more information, see Google’s original research paper, SQL Has Problems. We Can Fix Them: Pipe Syntax In SQL.

Example

The following code example is taken from Query 13 in the TPC-H benchmark suite, a standard set of queries used to measure database performance. Query 13 specifically analyzes customer order patterns to identify how many customers fall into different order-count categories. In standard SQL, the following query counts how many customers have placed a certain number of orders, excluding special requests, and sorts the results by customer count and order count:

SELECT c_count, COUNT(*) AS custdist
  FROM
  (SELECT c_custkey, COUNT(o_orderkey) c_count
    FROM customer
    LEFT OUTER JOIN orders ON c_custkey = o_custkey
                           AND o_comment NOT LIKE '%special%requests%'
   GROUP BY c_custkey
) AS c_orders
GROUP BY c_count
ORDER BY custdist DESC, c_count DESC;

The following query shows the previous query rewritten using pipe syntax:

FROM customer
|> LEFT OUTER JOIN orders ON c_custkey = o_custkey
|> WHERE o_comment NOT LIKE '%special%requests%'
|> AGGREGATE COUNT(o_orderkey) AS c_count  GROUP BY c_custkey
|> AGGREGATE COUNT(*)          AS custdist GROUP BY c_count
|> ORDER BY custdist DESC, c_count DESC;

FROM first

Firebolt supports FROM first queries, which are especially useful with pipe syntax, which builds queries in a step-by-step flow, with each step transforming the result of the previous one. Starting with the FROM clause defines the initial data source, allowing you to clearly chain joins, filters, and transformations. A pipe query typically starts with a FROM clause that references a table, view, subquery, join, or table-valued function.

Examples

Example

The following code example selects all data from the levels table:

FROM levels

Example

The following code example generates a series of numbers from 1 to 5:

FROM generate_series(1,5);

Pipe operators

SELECT pipe operator

The SELECT pipe operator works similarly to the standard SQL SELECT statement, and is used to specify the columns or expressions you want in the result, including column references, scalar expressions, * wildcards, and clauses like DISTINCT or EXCLUDE. The SELECT pipe operator enables flexible data selection within the pipe query flow.

Syntax

|> SELECT [ ALL | DISTINCT ] <select_list>

The <select_list> in the SELECT pipe operator follows the same syntax as the SELECT statement in standard SQL. Specifically, it can contain components including column references, scalar expressions, * wildcards, and the EXCLUDE clause.

Example

The following code example selects the level, name, maxpoints, and pointsperlap columns from the levels table and computes the number of laps by dividing maxpoints by pointsperlap:

FROM levels
|> SELECT level, name, maxpoints, pointsperlap, maxpoints / pointsperlap AS number_of_laps

EXTEND pipe operator

The EXTEND pipe operator adds new computed columns to query results while keeping the existing columns so that you can perform calculations or transformations on your data without losing any original columns. This is useful when you want to enrich your results with additional insights.

Syntax

|> EXTEND <expression> [AS <alias>] [, ...]

The EXTEND operator propagates all the columns from the input and adds computed columns, similar to the SELECT *, <expression> [AS <alias>] syntax in standard SQL. These computed columns can include scalar expressions or window functions defined using the format |> EXTEND <expression> [AS <alias>] [, ...].

Example

The following code example adds two computed columns to the levels table: number_of_laps, calculated by dividing maxpoints by pointsperlap, and total_max_points, which is the cumulative sum of maxpoints ordered by level:

FROM levels
|> EXTEND 
     maxpoints / pointsperlap AS number_of_laps,
     SUM(maxpoints) OVER (ORDER BY level ASC) AS total_max_points

AS pipe operator

The AS pipe operator is used to rename tables or columns within a query, similar to the AS keyword in standard SQL. You can use it to assign aliases to tables or columns, and make your query more readable and simplify complex operations. The AS pipe operator can help clarify the intent of your query by giving meaningful names to columns or tables, improving its structure and understanding within the pipe query flow.

Syntax

|> AS <table_alias> | <table_alias>(<column1_alias>, ...)

Example

The following code example generates a series of numbers from 1 to 5, aliases the result as table(i), and selects the i column from the aliased table:

FROM generate_series(1,5)
|> AS table(i)
|> SELECT table.i

WHERE pipe operator

The WHERE pipe operator filters the input data based on a specified condition, similar to the WHERE clause in standard SQL, and also replaces the HAVING clause in pipe syntax, allowing you to exclude rows that don’t meet the criteria.

Syntax

|> WHERE <condition>

Example

The following code example filters the levels table to include only rows where the leveltype is FastestLap:

FROM levels
|> WHERE leveltype = 'FastestLap'

LIMIT pipe operator

The LIMIT pipe operator restricts the number of rows in the result set, similar to the LIMIT clause in standard SQL, and can optionally use OFFSET to skip a specified number of rows, useful for limiting large result sets, pagination, or testing smaller data subsets.

Syntax

|> LIMIT <count> [OFFSET <start>]

Example

The following code example generates a series of numbers from 1 to 1000 and limits the result to the first 3 rows:

FROM generate_series(1,1000)
|> LIMIT 3

AGGREGATE pipe operator

You can use the AGGREGATE pipe operator to perform either full table aggregation or aggregation across groups, similar to using the GROUP BY clause in standard SQL. Use this operator to apply aggregate functions like SUM, AVG, or COUNT to grouped data.

Unlike SQL, where grouping expressions need to be repeated in both the SELECT and GROUP BY clauses, in pipe syntax, grouping expressions are listed only once in the GROUP BY clause and are automatically included in the output columns. The AGGREGATE operator’s output first includes the grouping expressions, followed by the aggregated expressions, using their assigned aliases as column names.

Syntax for full table aggregation

|> AGGREGATE <aggregate_expression> [AS <alias>] [, ...]

Syntax for aggregation with grouping

|> AGGREGATE [<aggregate_expression> [AS <alias>] [, ...]] GROUP BY <grouping_expression> [, ...]

Examples

The following code example calculates the total sum of maxpoints from the levels table without grouping:

FROM levels
|> AGGREGATE SUM(maxpoints)

The following code example calculates the maximum maxpoints for each leveltype and returns the corresponding level for each maximum value, grouping the results by leveltype:

FROM levels
|> AGGREGATE MAX_BY(level, maxpoints) level, MAX(maxpoints) maxpoints GROUP BY leveltype

ORDER BY pipe operator

Use the ORDER BY pipe operator to sort the input data based on one or more expressions, similar to the ORDER BY clause in standard SQL to organize query results in a meaningful order. You can use ORDER BY to specify the sorting order with options for ascending (ASC) or descending (DESC) order, as well as the handling of NULL values with NULLS FIRST or NULLS LAST.

Syntax

|> ORDER BY <expression> [ ASC | DESC ] [ NULLS FIRST | NULLS LAST] [, ...]

Example

The following code example sorts the levels table by numberoflaps in descending order and then selects the level and numberoflaps columns from the sorted result:

FROM levels
|> ORDER BY numberoflaps DESC
|> SELECT level, numberoflaps

JOIN pipe operator

Use the JOIN pipe operator to combine rows from two or more tables based on a related column and merge datasets in a query similar to the standard SQL JOIN clause. You can perform different types of joins, such as INNER, LEFT, RIGHT, and FULL, to retrieve and merge data based on specific conditions.

Syntax

|> [ INNER | LEFT | RIGHT | FULL ] JOIN <join_table> ON <join_condition>