This documentation is related to an older version of Firebolt. For the most current documentation, see Firebolt documentation.
SQL functions
Use the alphabetical list in the navigation pane to find the syntax for commands that you already know.
Use the functional list below to find commands for a specific task area that you’re working in.
-
Aggregate array functions
These functions work on array-typed columns, but instead of being applied row by row, they combine the results of all the arrays belonging to each of the groups defined by theGROUP BY
clause. -
Aggregation functions
These functions perform a calculation across a set of rows, returning a single value. -
Array functions
Used for the manipulation and querying ofARRAY
-typed columns, such as transforming and filtering. Includes Lambda functions. -
Conditional and miscellaneous functions
These functions include various methods for modifying data types and applying conditional operations. -
Date and time functions
Functions for manipulating date and time data types. -
JSON functions
These functions extract and transform JSON into Firebolt native types, or JSON sub-objects. They are used either during the ELT process or applied to columns storing JSON objects as plainTEXT
. -
Numeric functions
Functions for manipulating data types includingINTEGER
,BIGINT
,DOUBLE PRECISION
, and other numeric types. -
String functions
Functions for manipulating string data types -
Window functions
These functions perform a calculation across a specified set of table rows.
Aggregate array functions
Aggregate semi-structured functions work globally on all the arrays in a given column expression, instead of a row-by-row application.
At their simplest form (without a GROUP BY
clause) - they will provide the result of globally applying the function on all of the elements of the arrays in the column expression specified as their argument. For example, ARRAY_SUM_GLOBAL
will return the sum of all the elements in all the array of the given column. ARRAY_MAX_GLOBAL
will return the maximum element among all of the elements in all of the arrays in the given column expression.
When combined with a GROUP BY
clause, these operations will be performed on all of the arrays in each group.
Aggregation functions
Array functions
Lambda functions
For more information about using Lambda functions, see Manipulating arrays with Lambda functions.
Binary functions
Conditional and miscellaneous functions
Date and time functions
You are looking at functions for Firebolt’s redesigned date and timestamp types. These types were introduced in DB version 3.19 under the names
PGDATE
,TIMESTAMPNTZ
andTIMESTAMPTZ
, and synonymsDATE
,TIMESTAMP
andTIMESTAMPTZ
made available in DB version 3.22.If you worked with Firebolt before DB version 3.22, you might still be using the legacy date and timestamp types. Determine which types you are using by executing the query
SELECT EXTRACT(CENTURY FROM DATE '2023-03-16');
. If this query returns a result, you are using the redesigned date and timestamp types and can continue with this documentation. If this query returns an error, you are using the legacy date and timestamp types and can find legacy documentation here, or instructions to reingest your data to use the new types here.
JSON functions
The reference for each JSON function uses common syntax and conventions as outlined below.
JSON pointer expression syntax
The placeholder <json_pointer_expression>
indicates where you should use a JSON pointer, which is a way to access specific elements in a JSON document. For a formal specification, see RFC6901.
A JSON pointer starts with a forward slash (/
), which denotes the root of the JSON document. This is followed by a sequence of property (key) names or zero-based ordinal numbers separated by slashes. You can specify property names or use ordinal numbers to specify the Nth property or the Nth element of an array.
The tilde (~
) and forward slash (/
) characters have special meanings and need to be escaped according to the guidelines below:
- To specify a literal tilde (
~
), use~0
- To specify a literal slash (
/
), use~1
For example, consider the JSON document below.
{
"key": 123,
"key~with~tilde": 2,
"key/with/slash": 3,
"value": {
"dyid": 987,
"keywords" : ["insanely","fast","analytics"]
}
}
With this JSON document, the JSON pointer expressions below evaluate to the results shown.
Pointer | Result | Notes |
---|---|---|
/ | { "key": 123, "key~with~tilde": 2, "key/with/slash": 3, "value": { "dyid": 987, "keywords" : ["insanely","fast","analytics"] } | Returns the whole document. |
/key | 123 | |
/key~0with~0tilde | 2 | Indicates the value associated with the key~with~tilde property name. |
/key~1with~1slash | 3 | Indicates the value associated with the key/with/slash property name. |
/0 | 123 | Uses an ordinal to indicate the value associated with the key property name. The key property is in the first 0-based position. |
/value/keywords/2 | analytics | Indicates the element “analytics”, which is in the third 0-based position of the array value associated with they keywords property. |
Supported type parameters
Some functions accept a data type parameter, indicated in this reference with the <expected_type>
placeholder. This parameter specifies the expected type as indicated by <json_pointer_expression>
. The <expected_type>
is specified using a string literal that corresponds to supported Firebolt SQL data types. The type parameter does not accept all SQL types because the JSON type system has fewer types than SQL and must be one of the following:
INT
– used for integers as well as JSON boolean.DOUBLE
– used for real numbers. It also works with integers. For performance reasons, favor usingINT
when the values in the JSON document are known integers.TEXT
– used for strings.ARRAY(<type>)
– indicates an array where<type>
is one ofINT
,DOUBLE
, orTEXT
.
The following data types are not supported: DATE
, DATETIME
, FLOAT
(for real numbers, use DOUBLE
).
JSON common example
Usage examples for JSON functions in this reference are based on the JSON document below, which is indicated using the <json_common_example>
placeholder.
{
"key": 123,
"value": {
"dyid": 987,
"uid": "987654",
"keywords" : ["insanely","fast","analytics"],
"tagIdToHits": {
"map": {
"1737729": 32,
"1775582": 35
}
},
"events":[
{
"EventId": 547,
"EventProperties" :
{
"UserName":"John Doe",
"Successful": true
}
},
{
"EventId": 548,
"EventProperties" :
{
"ProductID":"xy123",
"items": 2
}
}
]
}
}
Numeric functions
String functions
Window functions
Some functions support an optional frame_clause
.
The frame_clause
can be one of the following:
{ RANGE | ROWS } <frame_start>
{ RANGE | ROWS } BETWEEN <frame_start> AND <frame_end>
where <frame_start>
and <frame_end>
is one of the following:
UNBOUNDED PRECEDING
offset PRECEDING
CURRENT ROW
offset FOLLOWING
UNBOUNDED FOLLOWING
The frame_clause
specifies the set of rows constituting the window frame within the current partition. The frame can be specified in RANGE
or ROWS
mode; in each case, the frame runs from the <frame_start>
to the <frame_end>
. If <frame_end>
is omitted, the end defaults to CURRENT ROW
.
Usage
-
The default framing option is
RANGE UNBOUNDED PRECEDING
, which is the same asRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
. With anORDER BY
clause, this sets the frame to be all rows from the partition start through the current row’s lastORDER BY
peer. Without anORDER BY
clause, all rows of the partition are included in the window frame, since all rows become peers of the current row. -
The number of rows to the end of the frame is limited by the number of rows to the end of the partition; for rows near the partition ends, the frame might contain fewer rows than elsewhere.
UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING
- A
<frame_start>
ofUNBOUNDED PRECEDING
means that the frame starts with the first row of the partition. Similarly a<frame_end>
ofUNBOUNDED FOLLOWING
means that the frame ends with the last row of the partition.
CURRENT ROW
-
In
RANGE
mode, a<frame_start>
ofCURRENT ROW
means the frame starts with the current row’s first peer row (a row that the window’sORDER BY
clause sorts as equivalent to the current row), while a<frame_end>
ofCURRENT ROW
means the frame ends with the current row’s last peer row. -
In
ROWS
mode,CURRENT ROW
simply means the current row.
offset PRECEDING, offset FOLLOWING
- For the
offset PRECEDING
andoffset FOLLOWING
frame options, the offset must be an expression not containing any variables, aggregate functions, or window functions. The meaning of the offset depends on the frame mode:-
In
ROWS
mode, the offset must yield a non-null, non-negative integer, and the option means that the frame starts or ends the specified number of rows before or after the current row. -
In
RANGE
mode, these options require that theORDER BY
clause specify exactly one column. The offset specifies the maximum difference between the value of that column in the current row and its value in preceding or following rows of the frame. The data type of the offset expression varies depending on the data type of the ordering column. For numeric ordering columns it is typically of the same type as the ordering column. For DATE or TIMESTAMP ordering columns, it is an interval. For example, if the ordering column is of type DATE or TIMESTAMP, one could writeRANGE BETWEEN '1 day' PRECEDING AND '10 days' FOLLOWING
. The offset is still required to be non-null and non-negative, though the meaning of “non-negative” depends on the data type. -
In
ROWS
mode,0 PRECEDING
and0 FOLLOWING
are equivalent toCURRENT ROW
. This normally holds inRANGE
mode as well, for an appropriate data-type-specific meaning of “zero”.
-
Restrictions
frame_start
cannot beUNBOUNDED FOLLOWING
frame_end
cannot beUNBOUNDED PRECEDING
frame_end
cannot appear earlier in the above list offrame_start
andframe_end
options than theframe_start
choice does. For exampleRANGE BETWEEN CURRENT ROW AND offset PRECEDING
is not allowed, butROWS BETWEEN 7 PRECEDING AND 8 PRECEDING
is allowed, even though it would never select any rows.
Example
The example below is querying test scores for students in various grade levels. Unlike a regular AVG()
aggregation, the window function allows us to see how each student individually compares to the average test score for their grade level, as well as compute the average test score while looking at different slices of the data for different grade levels – narrowing down the set of rows that constitutes the window using framing options such as PRECEDING or FOLLOWING.
SELECT
first_name,
grade_level,
test_score,
ROUND(AVG(test_score) OVER (PARTITION BY grade_level), 2) AS test_score_avg,
ROUND(AVG(test_score) OVER (PARTITION BY grade_level ORDER BY test_score ASC ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING), 2) AS avg_1p_1f,
ROUND(AVG(test_score) OVER (PARTITION BY grade_level ORDER BY test_score ASC ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING), 2) AS avg_2p_2f,
ROUND(AVG(test_score) OVER (PARTITION BY grade_level ORDER BY test_score ASC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING), 2) AS avg_up_2f
FROM
class_test
ORDER BY
grade_level,
test_score;
Returns:
' | first_name | grade_level | test_score | test_score_avg | avg_1p_1f | avg_2p_2f | avg_up_2f |
+------------+-------------+------------+----------------+-----------+-----------+-----------+
' | Frank | 9 | 76 | 81.33 | 77 | 77.67 | 77 |
' | Jojo | 9 | 78 | 81.33 | 77.67 | 78.25 | 77.67 |
' | Iris | 9 | 79 | 81.33 | 79 | 79.6 | 78.25 |
' | Peter | 9 | 80 | 81.33 | 81.33 | 82.4 | 79.6 |
' | Sammy | 9 | 85 | 81.33 | 85 | 83.5 | 81.33 |
' | Humphrey | 9 | 90 | 81.33 | 87.5 | 85 | 81.33 |
' | Yolinda | 10 | 30 | 68.2 | 44.5 | 55.67 | 44.5 |
' | Albert | 10 | 59 | 68.2 | 55.67 | 63 | 55.67 |
' | Deborah | 10 | 78 | 68.2 | 74 | 68.2 | 63 |
' | Mary | 10 | 85 | 68.2 | 84 | 77.75 | 68.2 |
' | Shawn | 10 | 89 | 68.2 | 87 | 84 | 68.2 |
' | Carol | 11 | 52 | 73 | 60 | 64.33 | 60 |
' | Larry | 11 | 68 | 73 | 64.33 | 67 | 64.33 |
' | Wanda | 11 | 73 | 73 | 72 | 68.8 | 67 |
' | Otis | 11 | 75 | 73 | 74.67 | 77.2 | 68.8 |
' | Shangxiu | 11 | 76 | 73 | 81.67 | 79.5 | 73 |
' | Roseanna | 11 | 94 | 73 | 85 | 81.67 | 73 |
' | Thomas | 12 | 66 | 89 | 77.5 | 82.33 | 77.5 |
' | Jesse | 12 | 89 | 89 | 82.33 | 85 | 82.33 |
' | Brunhilda | 12 | 92 | 89 | 91.33 | 86.8 | 85 |
' | Charles | 12 | 93 | 89 | 93 | 93.6 | 86.8 |
' | Franco | 12 | 94 | 89 | 95.67 | 94.75 | 89 |
' | Gary | 12 | 100 | 89 | 97 | 95.67 | 89 |
+------------+-------------+------------+----------------+-----------+-----------+-----------+