SELECT queries in a single script, you must them with a semicolon (;). Firebolt also supports CREATE TABLE...AS SELECT (CTAS). For more information, see CREATE TABLE…AS SELECT.
Syntax
SELECT
SELECT list defines the columns that it returns. Each <select_expression> in the SELECT list can be either an individual expression or a wildcard.
You cannot select only partitioned or virtual columns. Selecting both partitioned or virtual columns together with regular columns is supported, but selecting only partitioned or virtual columns is not.
SELECT expression
SELECT list return a single value and generate one output column. You can define the column name using an explicit alias with the AS clause, or, for expressions without explicit alias, the output column name is automatically generated.
Expressions can reference any column from the FROM clause, but cannot reference other columns produced by the same SELECT list. The expressions can use scalar functions, aggregate functions, window functions or subqueries, as long as they return a single value.
Example
The following code retrieves thecurrentscore, currentspeed, and the product of currentlevel and playterid as score_information from the playstats` table:
SELECT wildcard
- The wildcard symbol (
*) expands to include all columns in theFROMclause. <table_name>.*expands to include all columns specified in theFROMclause for the table named<table_name>EXCLUDEdefines columns which are removed from the previous expansion.
SELECT DISTINCT
TheSELECT DISTINCT statement removes duplicate rows.
SELECT ALL
TheSELECT ALL statement returns all rows. SELECT ALL is the default behavior.
WITH
TheWITH clause refactors subqueries so that you can define them once and reference them within the main query. This simplifies the hierarchy of the main query, enabling you to avoid using multiple nested sub-queries.
In order to reference the data from the WITH clause, a name must be specified for it. This name is then treated as a temporary relation table during query execution.
The primary query and the queries included in the WITH clause are all run at the same time; WITH queries are evaluated only once every time the main query runs, even if the clause is referred to by the main query more than once.
Materialized common table expressions
The query hintMATERIALIZED or NOT MATERIALIZED controls whether common table expressions (CTEs) produce an internal results table that is cached in engine RAM (MATERIALIZED) or calculated each time the sub-query runs. NOT MATERIALIZED is the default. MATERIALIZED must be specified explicitly.
Materialized results can be accessed more quickly in some circumstances. By using the proper materialization hint, you can control when a CTE gets materialized and improve query performance. We recommend the MATERIALIZED hint to improve query performance in the following circumstances:
- The CTE is reused at the main query level more than once.
- The CTE is computationally expensive, producing a relatively small number of rows.
- The CTE calculation is independent of the main query, and no external optimizations from the main table are needed for it to be fast.
- The materialized CTE fits into the nodes’ RAM.
Syntax
| Component | Description |
|---|---|
<subquery_table_name> | A unique name for a temporary table. |
<subquery> | Any query statement. |
Example
The following example retrieves all players who have subscribed to receive the game newsletter, having the results of theWITH query in the temporary table nl_subscribers.
The results of the main query then list the nickname and email for those customers, sorted by nickname.
Reusable common table expressions
Even stronger thanMATERIALIZED, you can mark a CTE as reusable.
This means that the results of the CTE stay cached in engine RAM and can be reused across different queries.
For MATERIALIZED REUSABLE CTEs, Firebolt materializes the full CTE results of the query, possibly even including unused columns that could otherwise be pruned.
This ensures that different queries specifying the same CTE can use the same cache entry, even if they need different columns.
Refer to Understand query performance and subresults for more details on the subresult cache.
REUSABLE can only be applied if the result of the CTE can be cached, which can be subtle in some cases.
Refer to Limitations for more details.
Query execution of a CTE that is explicitly marked as REUSABLE fails if it contains statements that cannot be cached.information_schema.engine_caches view:
FROM
Use theFROM clause to list the tables and any relevant join information and functions necessary for running the query.
Syntax
| Component | Description |
|---|---|
<from_item> | Indicates the table or tables from which the data is to be retrieved. |
Example
In the following example, the query retrieves all entries from theplayers table for which the agecategory value is “56+”.
FROM first
Firebolt allows using the FROM clause before the SELECT clause. The previous example can also be written as follows:
SELECT clause and use only the FROM clause in the query as shown in the following code example:
JOIN
AJOIN operation combines rows from two data sources, such as tables or views, and creates a new table of combined rows that can be used in a query.
JOIN operations can be used with an ON clause for conditional logic or a USING clause to specify columns to match.
JOIN with ON clause syntax
| Parameters | Description |
|---|---|
<join_table1> | A table or view to be used in the join operation. |
<join_table2> | A second table or view to be used in the join operation. |
ON <join_condition> | One or more BOOLEAN comparison expressions that specify the logic to join two specified tables and which columns to compare. For example: ON join_table1.column = join_table2.column. |
JOIN with USING clause syntax
| Component | Description |
|---|---|
<join_table1> | A table or view to be used in the join operation. |
<join_table2> | A second table or view to be used in the join operation. |
USING (column_list) | A list of one or more columns to compare for exact matching. USING is a shortcut to join tables that share the same column names. The specified columns are joined via a basic match condition. The match condition of USING (column_list) is equivalent to ON join_table1.column = join_table2.column |
JOIN types
The type ofJOIN operation specifies which rows are included between two specified tables. If unspecified, JOIN defaults to INNER JOIN.
JOIN types include:
| JOIN Type | Description |
|---|---|
[INNER] JOIN | When used with an ON clause, INNER JOIN includes only rows that satisfy the <join_condition>. When used with a USING clause, INNER JOIN includes rows only if they have matching values for the specified columns in the column_list. |
LEFT [OUTER] JOIN | Includes all rows from <join_table1> but excludes any rows from <join_table2> that don’t satisfy the <join_condition>. LEFT JOIN is equivalent to LEFT OUTER JOIN. |
RIGHT [OUTER] JOIN | Includes all rows from <join_table2> but excludes any rows from <join_table1> that don’t satisfy the <join_condition>. RIGHT JOIN is equivalent to RIGHT OUTER JOIN. |
FULL [OUTER] JOIN | Includes all rows from both tables matched where appropriate with the <join_condition>. FULL JOIN is equivalent to FULL OUTER JOIN. |
CROSS JOIN | Includes every possible combination of rows from <join_table1> and <join_table2>. A CROSS JOIN does not use an ON or USING clause. |
Examples
The followingJOIN examples use two tables, level_one_players and level_two_players. These tables are created and populated with data as follows.
| level_one_players.nickname | level_one_players.currentscore | level_two_players.nickname | level_two_players.currentscore |
|---|---|---|---|
| kennethpark | 11 | aaronbutler | 90 |
| rileyjon | 50 | esimpson | 56 |
| sabrina21 | 90 | ruthgill | 85 |
| steven70 | 50 | adrianachoi | 50 |
INNER JOIN example
TheINNER JOIN example below includes only the rows where the nickname and currenscore values match.
| level_one_players.nickname | level_one_players.currentscore | level_two_players.nickname | level_two_players.currentscore |
|---|---|---|---|
| lauradavis | 90 | lauradavis | 90 |
| hamiltonjorge | 50 | hamiltonjorge | 50 |
| adrian26 | 50 | adrian26 | 50 |
| leahbyrd | 90 | leahbyrd | 90 |
| rachelortiz | 87 | rachelortiz | 87 |
LEFT OUTER JOIN example
The followingLEFT OUTER JOIN example includes all nickname values from the level_one_players table. Any rows with no matching value in the level_two_players table return NULL.
| level_one_players.nickname | level_two_players.nickname |
|---|---|
| kennethpark | kennethpark |
| rileyjon | rileyjon |
| sabrina21 | NULL |
| steven70 | steven70 |
RIGHT OUTER JOIN example
The followingRIGHT OUTER JOIN example includes all nickname values from level_two_players. Any rows with no matching values in the level_one_players table return NULL.
| level_one_players.nickname | level_two_players.nickname |
|---|---|
| kennethpark | kennethpark |
| sabrina21 | sabrina21 |
| rileyjon | rileyjon |
| steven70 | steven70 |
| NULL | aaronbutler |
| NULL | ruthgill |
| NULL | adrianachoi |
FULL OUTER JOIN example
The followingFULL OUTER JOIN example includes all values from num_test and num_test2. Any rows with no matching values return NULL.
| level_one_players.nickname | level_two_players.nickname |
|---|---|
| kennethpark | kennethpark |
| sabrina21 | sabrina21 |
| rileyjon | rileyjon |
| steven70 | steven70 |
| NULL | aaronbutler |
| NULL | ruthgill |
| NULL | adrianachoi |
CROSS JOIN example
ACROSS JOIN produces a table with every combination of row values in the specified columns.
The following example uses two tables with player information, beginner_player and intermediate_player, each with a single level column. The tables contain the following data:
| beginner_player.level | intermediate_player.level |
|---|---|
| 1 | 4 |
| 2 | 5 |
| 3 | 6 |
CROSS JOIN example produces a table of every possible pairing of these rows.
| beginner_player.level | intermediate_player.letter |
|---|---|
| 1 | 4 |
| 1 | 5 |
| 1 | 6 |
| 2 | 4 |
| 2 | 5 |
| 2 | 6 |
| 3 | 4 |
| 3 | 5 |
| 3 | 6 |
UNNEST
UNNEST is a table-valued function (TVF) that transforms an input row containing an array into a set of rows.
The output table repeats rows of the input table for every element of the array.
Every array element is attached to one of the output rows.
If the input array is empty, the corresponding row is eliminated.
Syntax - FROM Clause
Using TVFs such asUNNEST is permitted in FROM clauses as follows:
| Component | Description | Valid values and syntax |
|---|---|---|
<from_items> | The tables containing the array columns that should be unnested. | |
<array_column> | The array columns to unnest, which can be either an array literal or a reference to an array-typed column. | Any valid array literal or column name. |
<row_alias> | An alias for the result row, such as r(x). |
UNNEST operation.
However, the LATERAL keyword is optional.
Syntax - SELECT Clause
When unnesting a single column, the TVF can also be invoked directly in theSELECT clause.
| Component | Description | Valid values and syntax |
|---|---|---|
<select_list> | The regular select list of your SQL query. | |
<array_column> | The array columns to unnest, which can be either an array literal or a reference to an array-typed column. | Any valid array literal or column name. |
<column_alias> | A column alias for the result column, such as x. |
Example
The example is based on the following table:| player | completed_levels |
|---|---|
| kennethpark | [2,5] |
| sabrina21 | [3,6,7] |
| andres | [] |
UNNEST:
| player | completed_levels | completed |
|---|---|---|
| kennethpark | [2,5] | 2 |
| kennethpark | [2,5] | 5 |
| sabrina21 | [3,6,7] | 3 |
| sabrina21 | [3,6,7] | 6 |
| sabrina21 | [3,6,7] | 7 |
UNNEST in the SELECT clause:
WHERE
Use theWHERE clause to define conditions for the query in order to filter the query results. When included, the WHERE clause always follows the FROM clause as part of a command such as SELECT.
Syntax
| Component | Description | Valid values and syntax |
|---|---|---|
<condition> | Indicates the conditions of the query. | Any valid BOOLEAN expression. |
Example
In the following example, the query retrieves all entries from thecustomers table for which the region value is “EMEA”.
players table:
GROUP BY
TheGROUP BY clause groups together input rows. Multiple input rows which have same values of expressions in the GROUP BY clause become a single row in the output. GROUP BY is typically used in conjunction with aggregate functions such as SUM and MIN. Query with GROUP BY clause and without aggregate functions is equivalent to SELECT DISTINCT.
Syntax
| Component | Description |
|---|---|
<grouping_element> | A column reference, a position from the select list, a complex expression, or a GROUPING SETS, ROLLUP, or CUBE clause (see below). |
Example
In the following example, the retrieved results are grouped by thenickname column, and then by the email column.
GROUP BY clause is exactly the same as in the SELECT list, then its position can be used instead.
GROUP BY clause must include all expressions in the SELECT list that do not use aggregate functions.
It may include expressions which are not part of SELECT list.
SELECT list has an expression which is not an aggregate function, and it is not listed in GROUP BY clause.
GROUP BY ALL
For the common case ofGROUP BY clause repeating all the non-aggregate function expressions in the SELECT list, it is possible to use GROUP BY ALL syntax.
It will automatically group by all non-aggregate functions expressions from the SELECT list.
GROUP BY GROUPING SETS
GROUPING SETS are an extension of GROUP BY that allow specifying multiple GROUP BY groups in one statement.
Note that the total number of all grouping sets generated by the combination of all GROUPING SETS, ROLLUP, and CUBE clauses cannot exceed 4096.
Syntax
| Component | Description |
|---|---|
<grouping_set_arg> | A proper <grouping_set>, a singleton set represented by a <grouping_element>, or an empty set represented by a <grand_total>. Each argument produces a result equivalent to a simple GROUP BY <grouping_set_arg>. |
<grouping_set> | A comma separated list of <grouping_element> wrapped in parentheses ( <grouping_element> [, <grouping_element> ] ). |
<grouping_element> | A column reference, a position from the SELECT list or a complex expression. Note that we allow at most 31 distinct grouping elements in the entire GROUP BY clause. |
<grand_total> | The empty grouping set (). |
Example
We want to determine the number of players in each age category, segmented by whether they are subscribed to the newsletter or not. Additionally, we want to find the total number of players per age category and the overall amount of players.(agecategory, issubscribedtonewsletter), agecategory, and ().
Each set corresponds to one of the groups described earlier.
The query using these grouping sets is equivalent to the following:
() corresponds to a query with an aggregate function but without a GROUP BY clause.
For columns not included in a particular grouping set, their values are set to NULL in the resulting rows.
GROUP BY ROLLUP
ROLLUP is syntactic sugar for a common GROUPING SETS use case.
It takes n arguments and produces the n+1 prefixes of these arguments as grouping sets.
Note that the total number of all grouping sets generated by the combination of all GROUPING SETS, ROLLUP, an CUBE clauses cannot exceed 4096.
Syntax
Example
GROUP BY CUBE
CUBE is syntactic sugar for a common GROUPING SETS use case.
It takes n arguments and produces the possible 2n subsets as grouping sets.
Note that the total number of all grouping sets generated by the combination of all GROUPING SETS, ROLLUP, an CUBE clauses cannot exceed 4096.
This means that a GROUP BY clause with a sinle CUBE expression can have at most 12 = log2(4096) arguments.
Syntax
Example
HAVING
TheHAVING clause is used in conjunction with the GROUP BY clause, and is computed after computing the GROUP BY clause and aggregate functions.
HAVING is used to further eliminate groups that don’t satisfy the <condition> by filtering the GROUP BY results.
Syntax
| Component | Description |
|---|---|
<condition> | Indicates the boolean condition by which the results should be filtered. |
UNION [ALL]
TheUNION operator combines the results of two or more SELECT statements into a single query.
UNIONcombines with duplicate elimination.UNION ALLcombines without duplicate elimination.
SELECT statements. Data types of all column parameters must be the same. Multiple clauses are processed left to right. Use parentheses to define an explicit order for processing.
Syntax
| Component | Description |
|---|---|
<select_expression1> | A SELECTstatement. |
<select_expression2> | A second SELECT statement to be combined with the first. |
ORDER BY
TheORDER BY clause sorts a result set by one or more output expressions. ORDER BY is evaluated as the last step after any GROUP BY or HAVING clause. ASC and DESC determine whether results are sorted in ascending or descending order. When the clause contains multiple expressions, the result set is sorted according to the first expression. Rows with the same values for the first expression are then sorted by the second expression, and this process continues for subsequent expressions.
The NULLS FIRST and NULLS LAST options can be used to determine whether NULL values appear before or after non-NULL values in the sort order. By default, NULL values are considered greater than any non-NULL value. NULLS FIRST is the default for descending order, and NULLS LAST is the default for ascending order.
Syntax
| Component | Description | |
|---|---|---|
<expression> | Each expression may specify output columns from SELECT or an ordinal number for an output column by position, starting at one. | |
| `[ ASC | DESC ]` | Indicates whether the sort should be in ascending or descending order. |
| `[ NULLS FIRST | NULLS LAST]` | Indicates whether null values should be included at the beginning or end of the result. NULLS FIRST is the default for DESC order, and NULLS LAST otherwise. |
LIMIT
TheLIMIT clause restricts the number of rows that are included in the result set.
Syntax
| Component | Description | Valid values and syntax |
|---|---|---|
<count> | Indicates the number of rows that should be returned. | An integer. |
OFFSET
TheOFFSET clause specifies a non-negative number of rows that are skipped before returning results from the query.
Syntax
| Component | Description | Valid values and syntax |
|---|---|---|
<start> | Indicates the number of rows that should be skipped. | An integer. |
VALUES Lists
VALUES creates an in-memory “constant table” with one or multiple rows for use in queries.
Each parenthesized list of expressions represents a row.
All rows must have the same number of elements, and corresponding elements in each row must have compatible data types.
As an example:
VALUES can be used anywhere a SELECT is allowed.