Release notes archive

We provide an archive of release notes for your historical reference.

Firebolt Release Notes - Version 4.11

New Features

Introduced the GEOGRAPHY data type and functions for geospatial data handling [public preview]

Added a new GEOGRAPHY data type and functions for working with geospatial data. Firebolt supports the three industry standard formats Well-Known Text (WKT), Well-Known Binary (WKB), and GeoJSON for geospatial data.

This public preview release includes an initial set of geospatial functions.
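
For example, a minimal sketch of round-tripping a point through WKT, assuming parser and formatter functions named ST_GEOGFROMTEXT and ST_ASTEXT (function names assumed for this preview):

SELECT ST_ASTEXT(ST_GEOGFROMTEXT('POINT(-73.98 40.74)')); -- returns the point's WKT text (function names assumed)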

Added keyboard shortcuts to the Firebolt Develop Space

The user interface in the Firebolt Develop Space added the following keyboard shortcuts:

  • Cmd + Enter – Runs the current query.
  • Cmd + Shift + Enter – Runs all queries in a script.

Added the window function FIRST_VALUE

Added a new FIRST_VALUE window function that returns the first value evaluated in a specified window frame.
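
For example, a minimal sketch returning each row's price alongside the lowest price in its category, using a hypothetical prices table:

SELECT category, price, FIRST_VALUE(price) OVER (PARTITION BY category ORDER BY price) AS lowest_price FROM prices;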

Firebolt Release Notes - Version 4.10

New Features

Added CREATE TABLE CLONE to clone an existing table in a database

You can create a clone of an existing table in a database using CREATE TABLE CLONE, which is extremely fast because it copies the table structure and references without duplicating the data. The clone functions independently of the original table. Any changes to the data or schema of either table will not affect the other.
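
For example, a minimal sketch, assuming an existing table named events:

CREATE TABLE events_clone CLONE events;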

Added 3-part identifier support for specifying databases in queries

You can now reference a database other than the current one in queries by using 3-part identifiers, which specify the database, schema, and object. For example, even if you previously selected a database db by using USE DATABASE db, you can still query a different database with a query such as SELECT * FROM other_db.public.t. The limitation that every query addresses only a single database still exists.

Added ALTER TABLE ADD COLUMN to add a column to an existing table

You can now use ALTER TABLE ADD COLUMN to add columns to Firebolt-managed tables. This functionality is temporarily limited to tables that were created on Firebolt version 4.10 or higher.
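
For example, a minimal sketch adding a nullable TEXT column to a hypothetical table events:

ALTER TABLE events ADD COLUMN session_id TEXT;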

Added support for the ALTER TABLE RENAME command

You can use ALTER TABLE RENAME to change the name of Firebolt-managed tables. This functionality is temporarily limited to tables created on Firebolt version 4.10 or higher.
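
For example, a minimal sketch, assuming the RENAME TO form and a hypothetical table events:

ALTER TABLE events RENAME TO events_archive;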

Added support for external file access using AWS session tokens

You can now use <AWS_SESSION_TOKEN> with access keys to securely authenticate and access external files on AWS with the following features:

  • The COPY TO and COPY FROM commands.
  • External tables located in an Amazon S3 bucket.
  • The following table-valued functions: read_parquet, read_csv, and list_objects.
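
For example, a minimal sketch of loading data with temporary credentials; the exact shape of the credentials clause is assumed, and the placeholders are illustrative:

COPY INTO my_table FROM 's3://my-bucket/data/' WITH CREDENTIALS = (AWS_ACCESS_KEY_ID = '<key>' AWS_SECRET_ACCESS_KEY = '<secret>' AWS_SESSION_TOKEN = '<token>');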

Behavior Changes

Enhanced PostgreSQL compliance for casting data types from text to float

Cast from text to floating-point types is now compliant with PostgreSQL with the following improvements:

  1. The correct parsing of positive floats – A plus sign (+) preceding a float is now handled correctly. Example: '+3.4'.
  2. Exponent-only input – Float values starting with an exponent 'e' or 'E' are rejected. Example: 'E4'.
  3. Incomplete exponents – Float values ending with an exponent without a subsequent exponent value are rejected. Example: '4e+'.

Account-level rate limits implemented for the system engine

Firebolt has implemented account-level rate limits to ensure equitable resource usage among all users of the system engine. When these limits are exceeded, requests will be rejected with the following error message: 429: Account system engine resources usage limit exceeded. This rate limit targets accounts with exceptionally high resource consumption. Accounts with typical resource usage should not be affected and require no further action.

Bug Fixes

Corrected runtime reporting

Resolved an issue where the runtime displayed in Firebolt’s user interface and JSON responses omitted the processing times for some query steps.

Resolved “Invalid Input Aggregate State Type” error with aggregating indexes

Fixed an issue where the “invalid input aggregate state type” error could occur when queries read from aggregating indexes that defined a COUNT(*) aggregate function before other aggregate functions. After this fix, such aggregating indexes can now be queried correctly without needing to be rebuilt.

Fixed a rare bug in subresult caching logic

Addressed a rare issue in the logic for caching and reusing subresults that could cause query failures with specific query patterns. This issue did not impact the correctness of query results.

Resolved issue preventing schema owners from granting “ANY” privileges

Fixed an issue where schema owners were unable to grant “ANY” privileges on their schema to other users.
For example:

GRANT SELECT ANY ON SCHEMA public TO ...

Schema owners can now execute this command which allows the specified user or role to perform SELECT operations on any table.

Firebolt Release Notes - Version 4.9

New Features

Added the enable_result_cache setting for controlling query result caching during benchmarking

You can set enable_result_cache to FALSE to disable the use of Firebolt’s result cache, which is set to TRUE by default. Disabling result caching can be useful for benchmarking query performance. When enable_result_cache is disabled, resubmitting the same query will recompute the results rather than retrieving them from cache. For more information, see Result Cache.

Added LAG and LEAD support for negative offsets

The second parameter in both LAG and LEAD can now accept negative numbers, in which case LAG behaves like LEAD and vice versa. For example, LAG(x,-5,3) is equivalent to LEAD(x,5,3).

Performance Improvements

Faster string searches for case-insensitive simple regular expressions in REGEXP_LIKE

Simple regular expressions in REGEXP_LIKE with case-insensitive matching, using the i flag, now use the same optimized string search implementation as ILIKE, achieving up to three times faster runtimes in observed cases.
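
For example, a minimal sketch of a case-insensitive match using the i flag:

SELECT REGEXP_LIKE('Firebolt', '^fire', 'i'); -- returns TRUE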

Bug Fixes

Empty character classes in regular expressions

Fixed a rare case where empty character classes were mistakenly interpreted as valid character classes instead of being treated as raw characters. In cases like []a], the expression is now correctly interpreted as a pattern that matches any single character from the list ]a, rather than treating [] as an empty character class followed by a].

Trailing backslash in regular expressions

Fixed a rare case where invalid regular expressions with a trailing backslash \ were accepted.

Firebolt Release Notes - Version 4.8

New Features

Introduced new bitwise shift functions BIT_SHIFT_RIGHT and BIT_SHIFT_LEFT

The following bitwise shift functions are now supported:

  • BIT_SHIFT_RIGHT shifts the bits of a number to the right by a specified number of positions, which effectively divides the number by 2 for each position shifted.
  • BIT_SHIFT_LEFT shifts the bits of a number to the left by a specified number of positions, which effectively multiplies the number by 2 for each position shifted.
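
For example, shifting left multiplies by powers of two and shifting right divides:

SELECT BIT_SHIFT_LEFT(1, 4); -- returns 16
SELECT BIT_SHIFT_RIGHT(32, 2); -- returns 8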

Introduced new trigonometric functions ACOS, ATAN, ASIN, COS, COT, TAN, DEGREES, and PI

The following trigonometric functions are now supported:

  • ACOS calculates the arccosine of a value in radians.
  • ATAN calculates the arctangent of a value in radians.
  • ASIN calculates the arcsine of a value in radians.
  • COS calculates the cosine of a value in radians.
  • COT calculates the cotangent of a value in radians.
  • TAN calculates the tangent of a value in radians.
  • DEGREES converts a value in radians to degrees.
  • PI returns π as a value of type DOUBLE PRECISION.
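
For example, combining two of these functions to convert π radians to degrees:

SELECT DEGREES(PI()); -- returns 180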

Introduced the timezone query-level setting with time_zone as an alias

Added the timezone query-level setting. The previous time_zone query setting still works, and is now an alias for timezone.
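
For example, a minimal sketch, assuming the setting is applied with SET:

SET timezone = 'UTC';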

Introduced new PERCENTILE_CONT and MEDIAN aggregate functions

Added the following aggregate functions:

  • PERCENTILE_CONT calculates a specified percentile of values in an ordered dataset.
  • MEDIAN returns the median of a given column. It is equivalent to PERCENTILE_CONT(0.5): half the values in the column are smaller, and half are bigger than the returned value. If the number of values in the column is even, MEDIAN returns the arithmetic mean of the two middle values.
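
For example, a minimal sketch computing the median of a hypothetical duration column in two equivalent ways, assuming the standard WITHIN GROUP syntax for PERCENTILE_CONT:

SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY duration), MEDIAN(duration) FROM requests;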

Added support to meet HIPAA regulations for health information

Added support to meet federal HIPAA regulations to ensure the confidentiality, integrity, and availability of electronic protected health information within the Firebolt platform.

Performance Improvements

Improved expression comparison logic within queries

Improved expression comparison logic to better recognize identical expressions within queries. This enhancement supports a broader range of queries and boosts the overall quality of query plans.

Improving cold reads by reducing the amount of Amazon S3 requests needed to load data

Improved the performance of cold reads by minimizing the number of Amazon S3 requests required to load data. For tiny tablets, this change led to a 50% improvement in performance.

Bug Fixes

Fixed a bug preventing view creation with type conversions to array types

Fixed an issue that prevented users from creating database views that involve type conversion to array types.

DB version 4.7

New Features

Added Snappy compression support to the COPY TO command for PARQUET output format
You can now apply Snappy compression, which is faster than GZIP, when using COPY TO with TYPE=PARQUET. Specify COMPRESSION=SNAPPY within COPY TO to enable this.
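
For example, a minimal sketch, assuming the options are given directly in the statement and using an illustrative output location:

COPY (SELECT * FROM my_table) TO 's3://my-bucket/out/' TYPE = PARQUET COMPRESSION = SNAPPY;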

Added information_schema.engine_user_query_history view to log only user-initiated queries
Added a new query history view, information_schema.engine_user_query_history, which shows all queries initiated by users. This view filters information from information_schema.engine_query_history view, which logs all engine queries including system-generated ones like UI updates and page-load requests.

Added support for information_schema.enabled_roles
Added a new view information_schema.enabled_roles which lists the roles available in the account.

Added a system setting enable_subresult_cache for controlling subresult reuse
A new system setting enable_subresult_cache allows users to enable or disable caching of query subresults for subsequent reuse. Caching remains enabled by default. This setting allows users to temporarily disable caching, for example for benchmarking purposes.

Added “FROM first” syntax allowing the FROM clause to precede the SELECT clause
Added support for the “FROM first” syntax, which allows placing the FROM clause before the SELECT clause, for example FROM t SELECT a, SUM(b) GROUP BY a. You can now also omit the SELECT clause, as in FROM t.

Introduced a new function GEN_RANDOM_UUID_TEXT to generate a universally unique identifier (UUID)
The new function GEN_RANDOM_UUID_TEXT accepts no arguments and returns a version 4 UUID as defined by RFC-4122 as a TEXT value.
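
For example:

SELECT GEN_RANDOM_UUID_TEXT(); -- returns a random version 4 UUID as TEXT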

Introduced ~ and !~ operators as aliases for REGEXP_LIKE and NOT REGEXP_LIKE
Added the ~ operator as an alias for REGEXP_LIKE, and the !~ operator, which serves as an alias for NOT REGEXP_LIKE.

Introduced JSON functions JSON_POINTER_EXTRACT_KEYS, JSON_POINTER_EXTRACT_VALUES, JSON_POINTER_EXTRACT_TEXT
The following new JSON functions are now supported:

  • JSON_POINTER_EXTRACT_KEYS extracts keys from a JSON object
  • JSON_POINTER_EXTRACT_VALUES extracts values from a JSON object
  • JSON_POINTER_EXTRACT_TEXT extracts the JSON string value as SQL TEXT

Introduced trigonometric functions RADIANS, SIN, ATAN2
The following trigonometric functions are now supported:

  • RADIANS to convert degrees into radians
  • SIN to compute the sine in radians
  • ATAN2 to calculate the arctangent with two arguments. ATAN2(y,x) is the angle between the positive x-axis and the line from the origin to the point (x,y), expressed in radians.
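
For example, combining these functions:

SELECT SIN(RADIANS(90)); -- returns 1
SELECT ATAN2(1, 1); -- returns pi/4, approximately 0.785398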

Introduced new functions to calculate standard deviation and variance for both samples and populations
New functions that accept REAL and DOUBLE inputs and return standard deviations and variances:

  • STDDEV_SAMP - Returns the sample standard deviation of all non-NULL numeric values produced by an expression, which measures how spread out values are in a sample.
  • STDDEV_POP - Returns the population standard deviation of all non-NULL numeric values produced by an expression, which measures how spread out values are in an entire population.
  • VAR_SAMP - Returns the sample variance of all non-NULL numeric values produced by an expression, which measures the average of the squared differences from the sample mean, indicating how spread out the values are within a sample.
  • VAR_POP - Returns the population variance of all non-NULL numeric values produced by an expression. The population variance measures the average of the squared differences from the population mean, indicating how spread out the values are within the entire population.

Introduced new array functions ARRAY_ALL_MATCH and ARRAY_ANY_MATCH
The new functions ARRAY_ALL_MATCH and ARRAY_ANY_MATCH accept an (optional) lambda function and an array and return TRUE if all elements (ARRAY_ALL_MATCH) or any element (ARRAY_ANY_MATCH) satisfy the lambda condition, and FALSE otherwise. When no lambda is passed, the array has to be of type BOOLEAN, and the identity lambda x -> x is used.
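
For example, a minimal sketch with the lambda passed before the array, as described above:

SELECT ARRAY_ALL_MATCH(x -> x > 0, ARRAY[1, 2, 3]); -- returns TRUE
SELECT ARRAY_ANY_MATCH(ARRAY[false, true]); -- returns TRUE, using the identity lambda on a BOOLEAN array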

Performance Improvements

Improved performance of JSON_EXTRACT, JSON_EXTRACT_ARRAY, and JSON_VALUE functions
Enhanced the performance of the JSON_EXTRACT, JSON_EXTRACT_ARRAY, and JSON_VALUE functions.

Behavior Changes

Updated sorting method for array columns with NULL values to align with PostgreSQL behavior

The sorting method for array columns containing NULL values has been updated to ensure that ASC NULLS FIRST places NULL values before arrays, and DESC NULLS LAST places NULL values after arrays, which aligns with PostgreSQL behavior.

The following code example creates a temporary table tbl which contains three rows: a NULL array, an array with the value 1, and an array with a NULL element. Then, a SELECT statement sorts all rows in ascending order:

WITH tbl(i) AS (
  SELECT NULL::INT[]
  UNION ALL
  SELECT ARRAY[1]::INT[]
  UNION ALL
  SELECT ARRAY[NULL]::INT[]
)
SELECT * FROM tbl ORDER BY i ASC NULLS FIRST;

The query previously returned {NULL}, {1}, NULL, but now returns NULL, {1}, {NULL}.

NULLS FIRST and NULLS LAST apply to the array itself, not to its elements. By default, ascending order (ASC) assumes NULLS LAST, while descending order (DESC) assumes NULLS FIRST when sorting arrays.

Allowed use of the SESSION_USER function without parentheses

The SESSION_USER function can now be used without parentheses, like this: SELECT SESSION_USER. As a result, any column named session_user now needs to be enclosed in double quotes as follows: SELECT 1 AS "session_user" or SELECT "session_user" FROM table.

Bug Fixes

Corrected JSON output format to display NaN values consistently as nan
The JSON output format previously showed some NaN values as -nan. This was corrected to consistently display NaN values as nan in the JSON output.

Resolved an issue with CHECKSUM and HASH_AGG failing when combining literals and table columns
Fixed an issue where the CHECKSUM and HASH_AGG functions failed when used with a combination of literals and table columns.

Fixed a rare inaccuracy that could cause incorrect results on multi-node engines when performing certain UNION ALL operations
Fixed a rare inaccuracy when performing certain UNION ALL operations on subqueries that are the result of aggregations or joins on overlapping but distinct keys, followed by an aggregation or join on the common keys of the subqueries’ aggregations or joins.

Fixed a rare inaccuracy that could cause incorrect results with CTEs using RANDOM() in specific join scenarios
Fixed a rare inaccuracy that caused incorrect results when a common table expression using the RANDOM() function was used multiple times, and at least one of these uses was on the probe side of a join involving a primary index key of the underlying table.

DB version 4.6

September 2024

New Features

COPY TO support for the SNAPPY compression type

COPY TO now supports SNAPPY as a new compression option for Parquet files. This enhancement offers greater flexibility for managing file size and performance, particularly for workloads requiring faster compression. Each file is written in Parquet format, with the specified compression applied to the data pages in the column chunks.

COPY FROM support for filtering by source file metadata

COPY FROM now supports filtering by source file metadata using the WHERE clause.

Added support for vector distance calculations with new functions

Firebolt has added support for vector distance and similarity calculations with the following new functions: VECTOR_COSINE_DISTANCE, VECTOR_MANHATTAN_DISTANCE, VECTOR_EUCLIDEAN_DISTANCE, VECTOR_SQUARED_EUCLIDEAN_DISTANCE, VECTOR_COSINE_SIMILARITY, and VECTOR_INNER_PRODUCT.

Behavior Changes

Introduced SHOW CATALOGS statement and aliased SHOW DATABASES to it while deprecating SHOW DATABASE X

A new statement SHOW CATALOGS now acts as an alias for SHOW DATABASES. The statement SHOW DATABASE X is no longer supported.

COPY FROM now unzips Parquet files with gzip extensions

Before version 4.6, the COPY FROM command did not apply file-level decompression to Parquet files with a .gzip or .gz extension. The command treated these files as standard Parquet files, assuming that any compression existed only within the internal Parquet format structure.

With the release of version 4.6, COPY FROM now processes Parquet files similarly to other formats. When a Parquet file has a .gz or .gzip extension, the command will first decompress the file before reading it as a Parquet format file. Hence, it will now fail while reading internally compressed Parquet files with gzip extensions. Users experiencing issues with loading files after this change should contact the support team at support@firebolt.io for assistance.

Bug Fixes

Fixed a rare bug that caused some query failures from incorrect computation of cacheable subresults

Fixed a rare bug impacting the logic that determined which subresults could be cached and reused. This issue could have caused query failures in certain patterns, but it did not impact the accuracy of the query outcomes.

Updated name of aggregatefunction2 to aggregatefunction in explain output

The name aggregatefunction2 has been updated to aggregatefunction in the EXPLAIN output.

Fixed incorrect results in ARRAY_AGG expressions by excluding NULL values for false conditions in aggregating indexes

Aggregate expressions like ARRAY_AGG(CASE WHEN <cond> THEN <column> ELSE NULL END) previously returned incorrect results by excluding NULL values for rows when the condition was FALSE.

DB version 4.5

September 2024

New Features

Allowed casting from TEXT to DATE with truncation of timestamp-related fields

Casting from TEXT to DATE now supports text values containing fields related to timestamps. These fields are accepted, but truncated during conversion to DATE.

The following code example casts the TEXT representation of the timestamp 2024-08-07 12:34:56.789 to the DATE data type. The conversion truncates the time portion, leaving only the date, as follows:

Example:

SELECT '2024-08-07 12:34:56.789'::DATE

Results in

DATE '2024-08-07'

Added the CONVERT_FROM function

Added the CONVERT_FROM function that converts a BYTEA value with a given encoding to a TEXT value encoded in UTF-8.

Added the BITWISE aggregate functions

Added support for the following functions: BIT_OR (bitwise OR), BIT_XOR (bitwise exclusive OR), and BIT_AND (bitwise AND).
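
For example, a minimal sketch aggregating the values 1, 2, and 4:

WITH tbl(n) AS (SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 4)
SELECT BIT_OR(n), BIT_AND(n), BIT_XOR(n) FROM tbl; -- returns 7, 0, 7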

Added the REGEXP_LIKE_ANY function

Added the REGEXP_LIKE_ANY function that checks whether a given string matches any regular expression pattern from a specified list of patterns.

Bug Fixes

Updated created and last_altered column types in information_schema.views from TIMESTAMP to TIMESTAMPTZ

The data types of the created and last_altered columns in information_schema.views have been changed from TIMESTAMP to TIMESTAMPTZ.

Fixed runtime constant handling in the sort operator

Fixed the handling of runtime constants in the sort operator. The sort operator can now be correctly combined with GENERATE_SERIES. For example, the query SELECT x, GENERATE_SERIES(1,7,3) FROM GENERATE_SERIES(1,3) t(x) now correctly displays values 1 to 3 in the first column, instead of just 1.

DB version 4.4

August 2024

New Features

Extended support for date arithmetic

Now you can subtract two dates to get the number of elapsed days. For example, DATE '2023-03-03' - DATE '1996-09-03' produces 9677.

Role-based permissions for COPY FROM and external tables

Added support for role-based permissions (ARNs) to the COPY FROM command and external table operations.

Added trust_policy_role column to information_schema.accounts view for S3 access

Added a new column trust_policy_role to the information_schema.accounts view. This column shows the role used by Firebolt to access customer S3 buckets.

Enabled selection of external tables’ pseudo columns without adding data columns

Users can now select an external table’s pseudo columns (source file name, timestamp, size, and etag) without adding any data columns. For example, select $source_file_timestamp from t_external returns the file timestamps for each row. The query select count($source_file_timestamp) from t_external returns the total number of rows in the external table, similar to count(*). The query select count(distinct $source_file_name) from t_external returns the number of distinct objects containing at least one row in the source S3 location. Regarding count(*) performance, formats like CSV or JSON still require reading the data fully to determine an external file’s row count. However, Parquet files provide the row count as part of the file header, and this is now used instead of reading the full data.

Extended support for arbitrary join conditions, including multiple inequality predicates

We now support more join conditions. As long as there is one equality predicate comparing a left column to a right column of the join (not part of an OR expression), the remaining join condition can now be an arbitrary expression. The limitation on the number of inequality predicates was removed.
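
For example, the following join is now supported because it contains one plain equality predicate (l.id = r.id), and the rest of the condition may be arbitrary; l and r are hypothetical tables:

SELECT * FROM l JOIN r ON l.id = r.id AND (l.price < r.price OR l.start_ts > r.end_ts);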

New functions URL_ENCODE and URL_DECODE

We added support for the URL_ENCODE and URL_DECODE functions.

New logarithm functions ln, log

We added support for calculating logarithms. The natural logarithm is available using ln(val double precision). The base 10 logarithm is available using log(val double precision). Logarithms with custom bases are available using log(base double precision, val double precision).
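
For example, using the signatures above:

SELECT ln(2.718281828459045); -- returns approximately 1
SELECT log(100); -- returns 2
SELECT log(2, 8); -- returns 3, the base 2 logarithm of 8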

New function SQRT

Added support for the SQRT function to compute the square root.

New functions JSON_VALUE, JSON_VALUE_ARRAY, JSON_EXTRACT_ARRAY

Added support for the functions JSON_VALUE, JSON_VALUE_ARRAY, and JSON_EXTRACT_ARRAY.

New function SESSION_USER

Support has been added for the SESSION_USER function, which retrieves the current user name.

New columns in information_schema.engine_query_history

Added two new columns to information_schema.engine_query_history: query_text_normalized_hash and query_text_normalized.

Breaking Changes

Reserved the keyword GEOGRAPHY, requiring double quotes for use as an identifier

The word GEOGRAPHY is now a reserved keyword and must be quoted using double quotes for use as an identifier. For example, create table geography(geography int); will now fail, but create table "geography" ("geography" int); will succeed.

Deprecated the legacy HTTP ClickHouse headers

We no longer accept or return the legacy HTTP ClickHouse header format X-ClickHouse-*.

Fixed json_value zero-byte handling

The json_value function no longer returns null characters (0x00), as the TEXT datatype does not support them. For example, select json_value('"\u0000"'); now results in an error.

Change default values for NODES and TYPE during CREATE ENGINE

When performing a CREATE ENGINE, the default values for NODES and TYPE parameters have changed. NODES defaults to 2 (previously 1) and TYPE defaults to M (previously S). To create an engine with the previous default values, run the following command:

CREATE ENGINE my_engine WITH NODES=1 TYPE=S

Bug Fixes

Fixed directory structure duplication in the S3 path when using the COPY TO statement with SINGLE_FILE set to FALSE

Fixed an issue in COPY TO when SINGLE_FILE=FALSE. Previously, the specified directory structure in the location was repeated twice in the S3 path. For example, files were output to “s3://my-bucket/out/path/out/path/” instead of “s3://my-bucket/out/path/”.

Fixed the file extension in the S3 path when using the COPY TO statement with GZIP-Parquet format

Fixed an issue in COPY TO when TYPE=PARQUET and COMPRESSION=GZIP, which uses the Parquet file format with internal GZIP compression for the columns. Previously, the output files would have the extension “.parquet.gz”. Now, the extension is “.gz.parquet”.

DB version 4.3

August 2024

New Features

Role-based permissions for COPY FROM and External Table processes

Enabled role-based permissions for COPY FROM and External Table processes.

HLL-based count distinct functions compatible with the Apache DataSketches library

Firebolt now supports count-distinct functions using the HLL (HyperLogLog) algorithm, compatible with the Apache DataSketches library. For details and examples, see documentation on the functions APACHE_DATASKETCHES_HLL_BUILD, APACHE_DATASKETCHES_HLL_MERGE, and APACHE_DATASKETCHES_HLL_ESTIMATE.

Supported additional join conditions and removed the restriction on the number of inequality predicates

Firebolt has added enhanced support for more join conditions. As long as there is one equality predicate comparing a left column to a right column of the join, which is not part of a disjunctive (OR) expression, the remaining join condition can be arbitrary. The previous limitation on the number of inequality predicates has been removed.

Performance Improvements

Multi-node query performance

Firebolt has improved the performance of data transfer between nodes, resulting in faster overall query execution times.

Enhanced Interval Arithmetic Support

Firebolt has enhanced support for interval arithmetic. You can now use expressions of the form date_time + INTERVAL * d, where date_time is an expression of type DATE, TIMESTAMP, or TIMESTAMPTZ, and d is an expression of type DOUBLE PRECISION. The interval is scaled by d before being added to date_time. For example, writing INTERVAL '1 day' * 3 is equivalent to writing INTERVAL '3 days'.
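
For example, a minimal sketch of scaled interval arithmetic:

SELECT DATE '2024-01-01' + INTERVAL '1 day' * 3; -- same as adding INTERVAL '3 days'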

Optimized selective inner and right joins on primary index and partition by columns to reduce rows scanned

Selective inner and right joins on primary index and partition by columns can now benefit from pruning. This reduces the number of rows scanned by filtering out rows that are not part of the join result early in the process. This optimization works best when joining on the first primary index column or a partition by column. The optimization is applied automatically when applicable, and no action is required. Queries that used this optimization will display “Prune:” labels on the table scan in the EXPLAIN (PHYSICAL) or EXPLAIN (ANALYZE) output.

Bug Fixes

Fixed a bug in the combination of cross join and the index_of function

Resolved an issue where the index_of function would fail when applied to the result of a cross join that produced a single row.

Breaking Changes

Temporarily restricted column DEFAULT expressions in CREATE TABLE statements

Column DEFAULT expressions in CREATE TABLE statements have been temporarily restricted to literals and the following functions: CURRENT_DATE(), LOCALTIMESTAMP(), CURRENT_TIMESTAMP(), and NOW(). Existing tables with column DEFAULT expressions are not affected.

Underflow detection while casting from TEXT to floating point data types

Firebolt now detects underflow, a condition where a numeric value becomes smaller than the minimum limit that a data type can represent, when casting from TEXT to floating point data types. For example, the query select '10e-70'::float4; now returns an error, while it previously returned 0.0.

Returning query execution errors in JSON format through the HTTP API

Firebolt’s HTTP API now returns query execution errors in JSON format, allowing for future enhancements like including metadata such as error codes, or the location of a failing expression within the SQL script.

Changed default of case_sensitive_column_mapping parameter in COPY FROM

The default value for the CASE_SENSITIVE_COLUMN_MAPPING parameter in COPY FROM is now FALSE, meaning that if a target table contains column names in uppercase and the source file to ingest has the same columns in lowercase, the ingestion will consider them the same column and ingest the data.

extract function returns Numeric(38,9) for Epoch, second, and millisecond extraction

The result data type of the extract function for epoch, second, and millisecond was changed to return the type Numeric(38,9) instead of a narrower Numeric type. For example, select extract(second from '2024-04-22 07:10:20'::timestamp); now returns Numeric(38,9) instead of Numeric(8,6).

DB version 4.2

July 2024

New features

New ntile window function

Firebolt now supports the ntile window function. Refer to our NTILE documentation for examples and usage.

Enhancements, changes and new integrations

Improved query performance

Queries with “SELECT [project_list] FROM [table] LIMIT [limit]” on large tables are now significantly faster.

Updated table level RBAC

Table level RBAC is now supported by Firebolt. This means that RBAC checks also cover schemas, tables, views, and aggregating indexes. Refer to our RBAC docs for a detailed overview of this new feature. The new Firebolt version introduces the following change:

  • System built-in roles are promoted to contain table level RBAC information. This means that new privileges are added to the account_admin, system_admin, and public roles. The effect is transparent: any user assigned those roles will not be affected.

Removal of Deprecated Columns from INFORMATION_SCHEMA.ENGINES

We removed the following columns from INFORMATION_SCHEMA.ENGINES that were only for FB 1.0 compatibility: region, spec, scale, warmup, and attached_to. These columns were always empty. (These columns are hidden and do not appear in SELECT * queries, but they will still work if referenced explicitly.)

Breaking Changes

Improved rounding precision for floating point to integer casting

Casting from floating point to integers now uses Banker’s Rounding, matching PostgreSQL’s behavior. This means that numbers that are equidistant from the two nearest integers are rounded to the nearest even integer:

Examples:

SELECT 0.5::real::int

This returns 0.

SELECT 1.5::real::int

This returns 2.

Rounding behavior has not changed for numbers that are strictly closer to one integer than to all others.

JSON functions update

Removed support for json_extract_raw, json_extract_array_raw, json_extract_values, and json_extract_keys. Updated json_extract function: the third argument is now path_syntax, which is a JSON pointer expression. See JSON_EXTRACT for examples and usage.

Cluster ordinal update

Replaced engine_cluster with cluster_ordinal in information_schema.engine_metrics_history. The new column is an integer representing the cluster number.

Configurable cancellation behavior on connection drop

Introduced the cancel_query_on_connection_drop setting, allowing clients to control query cancellation on HTTP connection drop. Options include NONE, ALL, and TYPE_DEPENDENT. Refer to system settings for examples and usage.
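
For example, a minimal sketch, assuming the setting can be applied with SET for the session:

SET cancel_query_on_connection_drop = 'NONE'; -- keep all queries running if the connection drops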

JSON format as default for error output

The HTTP API now returns query execution errors in JSON format by default. This change allows for the inclusion of meta information such as error codes and the location of failing expressions in SQL scripts.

STOP ENGINE will drain currently running queries first

STOP ENGINE command now supports graceful drain, meaning any currently running queries will be run to completion. Once all the queries are completed, the engine will be fully stopped and terminated. If you want to stop the engine immediately, you can issue a STOP ENGINE command with the TERMINATE option. For example, to immediately stop an engine named my_engine, run:

 STOP ENGINE my_engine WITH TERMINATE = TRUE

Scaling engines will not terminate currently running queries

ALTER ENGINE command now supports graceful drain, meaning when you scale an engine (vertically or horizontally), any currently running queries will not be terminated. New queries after the scaling operation will be directed to a new cluster, while queries running on the old cluster will be run to completion.

Updated RBAC ownership management

We have introduced several updates to role and privilege management:

  • The security_admin role will be removed temporarily and re-introduced in a later release.
  • The information_schema.object_privileges view includes more privileges. Switching to a specific user database (for example, by executing USE DATABASE db) will only show privileges relevant for that database. Account-level privileges no longer show up when attached to a specific database.
  • Every newly created user is granted the public role. This grant can be revoked.

DB version 4.1

June 2024

Resolved issues

  • Fixed an issue causing errors when using WHERE column IN (...) filters on external table scans.

DB version 4.0

June 2024

Enhancements, changes and new integrations

Query Cancellation on HTTP Connection Drop

Going forward, when the network connection between the client and Firebolt is dropped (for example because the Firebolt UI tab was closed or due to network issues), DML queries (INSERT, UPDATE, DELETE, etc) are no longer canceled automatically, but will keep running in the background. You can continue to monitor their progress in information_schema.engine_running_queries or cancel them manually using the cancel query statement if desired. DQL queries (SELECT) are still canceled automatically on connection drop.

New Aggregate Functions: CHECKSUM and hash_agg

CHECKSUM and hash_agg functions are now supported for aggregating indexes. Note that when the hash_agg function doesn’t receive rows, the result is 0.

Breaking Changes

Array Casting Nullability Update

Cast to array will no longer support specifying nullability of the inner type. For example:

a::array(int null)

or

cast(a as array(int not null))

will now fail, and need to be rewritten as:

a::array(int)

or

cast(a as array(int))

Postgres-compliant Cast

Casts now behave the same across the product and adhere to the list of supported casts. Some usages of casts (explicit, implicit, or assignment) that were previously allowed are no longer supported and now result in errors. For more details, see the documentation on supported casts.

DB version 3.34

May 2024

Enhancements, changes and new integrations

Removed MATCH function

The match function has been removed and replaced with regexp_like.

Producing an error for array function failure instead of NULL

Array functions that accept two or more array arguments now produce an error when the arrays differ in size. If you call an array function such as array_transform(..) or array_sort(..) with multiple array arguments, the arrays must have the same size. For example:

array_transform(x, y -> x + y, arr1, arr2)

This raises an error if array_length(arr1) != array_length(arr2). We now also perform this check for NULL literals. If you previously used array_transform(x, y -> x + y, NULL::INT[], Array[5, 6]), you got back NULL. Now, the query using that expression will raise an error.

Added ARRAY_FIRST function

The array_first function has been added. It returns the first element in the given array for which the given function returns true.
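
For example, a minimal sketch, assuming the lambda is passed before the array as with other Firebolt array functions:

SELECT ARRAY_FIRST(x -> x > 2, ARRAY[1, 2, 3, 4]); -- returns 3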

New name for any_match

A new name for any_match has been added: array_any_match. any_match will be kept as an alias.

Updated ARRAY_SUM return types

The array_sum function of bigint[] now returns a numeric value and array_sum of real[] returns a real value.

Precedence of operators

A breaking change in operator precedence between comparison operators such as =, <, and >, and the IS operator. The new behavior is compatible with PostgreSQL.

Examples of queries whose behavior changed:

select 1 is null = 2 is null

This used to be true, because it was interpreted as select (1 is null) = (2 is null). It now fails with an incompatible-types error for =.

select false = false is not null

The result used to be false, interpreted as select false = (false is not null); it is now true, interpreted as select (false = false) is not null.

Dropping the role

A role can no longer be dropped if there are permissions granted to it. An error message is displayed indicating that you need to manually drop the permissions associated with the role.

Coalesce Short-Circuiting

COALESCE now supports short-circuiting in Firebolt. Queries such as SELECT COALESCE(a, 1 / 0) FROM t could fail before, even when there were no NULLs in t; previously only CASE WHEN supported short-circuiting. Firebolt is now aligned with PostgreSQL and supports short-circuiting in COALESCE as well.

Restricted CREATE TABLE to the public schema

You can now execute CREATE TABLE/VIEW/AGGREGATING INDEX only under the public schema.

Improved error message for JSON PARSE_AS_TEXT format

The error message for external tables created with the JSON PARSE_AS_TEXT format has been revised. This format reads specifically into a single column of type either TEXT or TEXT NOT NULL. (External table partition columns may be defined after the single TEXT column; these are fine.) Now, the error message regarding the CREATE EXTERNAL TABLE statement is shown on a user’s first attempt to use SELECT. Support for reading format JSON PARSE_AS_TEXT=TRUE into a TEXT NOT NULL column has been added.

Implemented column_mismatch

Support for ALLOW_COLUMN_MISMATCH in COPY INTO has been added.

Corrected NULL behavior of STRING_TO_ARRAY

The behavior of string_to_array now matches its behavior in PostgreSQL. The change affects NULL delimiters, where the string is now split into individual characters, and empty strings, where the output is now an empty array.

Changed city_hash behavior for nullable inputs

The behavior for city_hash has changed for nullable inputs. For example:

SELECT CITY_HASH([null]) = CITY_HASH([''])

This is now false.

Function ARRAY_AGG now preserves NULLS

The array_agg function has been changed to return PostgreSQL-compliant results:

  • array_agg now preserves NULL values in its input, e.g. select array_agg(x) from unnest(array[1,NULL,2]) t(x) returns {1,NULL,2}
  • array_agg now returns NULL instead of an empty array if there are no input values

Lambda parameters are no longer supported by array_sum

Array aggregate functions no longer support lambda parameters. To get the old behavior for conditional lambda functions, use transform instead. For example:

array_sum(transform(...))

Explicit Parquet conversion from DATE to INT is now needed

A breaking change has been implemented in raising an error on reading a Parquet/ORC DATE/TIMESTAMP column if the EXTERNAL TABLE expects the column to have type INT/BIGINT. DATE/TIMESTAMP cannot be cast to INT/BIGINT, and external table scans will no longer allow this cast either. You need to explicitly transform the Parquet/ORC DATE/TIMESTAMP column with EXTRACT(EPOCH FROM col) to insert it into an INT/BIGINT column.

Resolved issues

  • Fixed a bug where negation did not check for overflows correctly.

DB version 3.33

April 2024

Enhancements, changes and new integrations

Removed ‘element_at’ Function

The element_at function for arrays has been removed and replaced with the [] operator.

Change of return type from BIGINT to INTEGER

The index_of/array_position function now returns INTEGER instead of BIGINT.

Removed LIMIT DISTINCT syntax

The LIMIT_DISTINCT syntax is no longer supported by Firebolt.

Updated CAST function behavior

All cast logic has been moved to runtime in Firebolt. The castColumn function is now replaced by fbCastColumn, ensuring consistent casting behavior and resolving issues with the COPY FROM operation and other cast calls. Uses of implicit or explicit CAST may result in errors due to this fix; this is a new breaking change.

Resolved issues

  • Fixed a bug in array_position where searching for NULL in an array with non-null elements incorrectly returned a match in some cases.

DB version 3.32

April 2024

New features

Expose and document ‘typeof’ as a toTypeName function

The typeof function has been added, which returns the data type of a SQL expression as a string.

Enhancements, changes and new integrations

Spilling Aggregations

Firebolt can now process most aggregations that exceed the available main memory of the engine by spilling to the SSD cache when needed. This happens transparently to the user. A query that made use of this capability will populate the spilled_bytes column in information_schema.query_history. Spilling does not support aggregations where a single group exceeds the available memory (e.g., select count(distinct high_cardinality_column) from huge_table) and may not yet work reliably for all aggregate functions or engine specs. We will continue improving the feature in upcoming releases.

No overflow detected in cast from FLOAT to DECIMAL

Fixed the results of casting from float32 to decimals with precision > 18. In addition to this correctness fix, which is a breaking change, certain queries that previously worked will now fail with an overflow error.

Example query:

  • SELECT 17014118346046923173168730371588410572::REAL::DECIMAL(37,0)

Previously, this query worked and returned a wrong result; now it fails with an overflow error.

ARRAY_COUNT returns 0 instead of NULL

ARRAY_COUNT on NULL array now returns 0 instead of NULL.

No overflow check in arithmetic operations

Arithmetic operators (i.e. multiplication, addition, subtraction, and division) now perform correct overflow checking. This means that queries that used to return wrong results in the past now throw runtime errors.

Example queries:

  • SELECT 4294967296 * 4294967296 now throws an error; previously it returned 0.

  • SELECT 9223372036854775807 + 9223372036854775807 now throws an error; previously it returned -2.

  • SELECT (a + b) * c might now throw runtime errors if there are large values in the columns; this is highly data dependent.

Implement bool_or/bool_and aggregation functions

New aggregate functions bool_or and bool_and have been added.

Removed deprecated REGENERATE AGGREGATING INDEX

‘REGENERATE AGGREGATING INDEX’ syntax has now been removed.

Align the syntax of our “escape” string literals with PostgreSQL

Escape string literals now support octal and Unicode escape sequences. As a result, escape string literals now behave exactly like PostgreSQL. Example: SELECT E'\U0001F525b\x6F\154t'; returns 🔥bolt. If the setting standard_conforming_strings is not enabled for you, regular string literals (e.g., SELECT 'foo';) will also recognize the new escape sequences. However, we recommend exclusively using escape string literals for using escape sequences. Please be aware that you will get different results if you previously used (escape) string literals containing the syntax we now use for Unicode and octal escape sequences.

Change return value of length and octet_length to INT

Length and array_length now return INTEGER instead of BIGINT.

Subqueries in the GROUP BY/HAVING/ORDER BY clauses change

Subqueries in GROUP BY/HAVING/ORDER BY clauses can no longer reference columns from the selection list of the outer query via their aliases, for PostgreSQL compliance. For example, select 1 + 1 as a order by (select a); used to work, but now fails with an unresolved name a error.

Bytea serialization to CSV fix

Changed BYTEA-to-CSV export from escaped to non-escaped format.

Example:

  • COPY (select 'a'::bytea) to 's3…'; the result will now be \x61 instead of \\x61.

Resolved issues

  • Fixed the results of casting a literal float to NUMERIC. In the past, the float literal was cast to FLOAT first and then to NUMERIC, which caused a loss of precision.

Examples:

  • SELECT 5000000000000000000000000000000000000.0::DECIMAL(38,1); -> 5000000000000000000000000000000000000.0
  • SELECT (5000000000000000000000000000000000000.0::DECIMAL(38,1)+5000000000000000000000000000000000000.0::DECIMAL(38,1)); -> ERROR: overflow.

Note that before, it was not an error and resulted in: 9999999999999999775261218463046128332.8.

  • Fixed a longstanding bug with >= comparison on the external table source_file_name. Previously this could scrape fewer files than expected off the remote S3 bucket; now all files are properly (lexicographically) compared against the input predicate.
  • Fixed a bug where USAGE ANY ENGINE (and similar) privileges were shown for the * account. They are now shown for the current account.
  • Fixed a bug involving trim characters, where invoking btrim, ltrim, rtrim, or trim with a literal string but non-literal trim characters could result in an error.

DB version 3.31

March 2024

New features

PG compliant division

LQP2 has a new division operator that is PostgreSQL compliant by default.

Prevented usage of a non-default delimiter for schema inference

An error will now occur if schema inference is used with the option “delimiter” set to something other than the default.

Enhancements, changes and new integrations

Simplified table protobuf representation

Unique constraints in tables will be blocked for new accounts.

Support for nullable arrays

Support has been added to allow the ARRAY_ANY_MATCH lambda function to work with nullable arrays.

Updated AWS billing error message

The error message for an AWS billing issue on engine start was changed to add more information and clarity.

New requirements updated for EXPLAIN

For EXPLAIN queries, we now allow only one of the following options at the same time: ALL, LOGICAL, PHYSICAL, ANALYZE. EXPLAIN (ALL) now returns the plans in multiple rows instead of multiple columns.

Disabled Unix Time Functions

The following functions are no longer supported: ‘from_unixtime’, ‘to_unix_timestamp’, and ‘to_unix_time’.

Renamed spilled metrics columns

The columns spilled_bytes_uncompressed and spilled_bytes_compressed of information_schema.query_history have been replaced by a single column spilled_bytes. It contains the amount of data that was spilled to disk temporarily while executing the query.

Aggregating index placement

Aggregating indexes are now placed in the same namespace as tables and views.

Syntax and planner support for LATERAL scoping

LATERAL is now a reserved keyword. It must now be used within double-quotes when using it as an object identifier.

Resolved issues

  • Changed division by 0 to fail instead of returning NULL.

  • Updated the error log for upload failure for clarity.

  • Fixed a bug in the ‘unnest’ table function that occurred when not all of the ‘unnest’ columns were projected.

  • Changed the behavior of split_part when an empty string is used as the delimiter.

  • Fixed a bug where the floating point values -0.0 and +0.0, as well as -nan and +nan, were not considered equal in distributed queries.

  • TRY_CAST from TEXT to NUMERIC now works as expected: if the value cannot be parsed as NUMERIC, it produces NULL.

DB version 3.30

November 2023

New features

New comparison operators

New comparison operators IS DISTINCT FROM and IS NOT DISTINCT FROM have been added.
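
These operators treat NULL as a comparable value rather than propagating it. For example:

SELECT NULL IS DISTINCT FROM NULL; -- returns FALSE
SELECT 1 IS DISTINCT FROM NULL; -- returns TRUE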

Enhancements, changes and new integrations

Support for nullable arrays

Support has been added to allow the ANY_MATCH lambda function to work with nullable arrays.

Resolved issues

  • Indirectly granted privileges have been removed from the information_schema.object_privileges view.

  • Fixed an issue where ARRAY_FIRST and ARRAY_FIRST_INDEX returned an error if the given input was nullable.

DB version 3.29

October 2023

New features

EXPLAIN ANALYZE now available for detailed query metrics

You can now use the EXPLAIN command to execute EXPLAIN (ANALYZE) <select statement> and get detailed metrics about how much time is spent on each operator in the query plan, and how much data is processed. The query plan shown there is the physical query plan, which you can inspect using EXPLAIN (PHYSICAL) <select statement> without executing the query. It shows how query processing is distributed over the nodes of an engine.
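
For example, a minimal sketch over a hypothetical table t:

EXPLAIN (ANALYZE) SELECT count(*) FROM t; -- runs the query and reports per-operator time and data processed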

Enhancements, changes and new integrations

Virtual column ‘source_file_timestamp’ uses new data type

The virtual column source_file_timestamp has been migrated from the data type TIMESTAMP (legacy timestamp type without time zone) to the type TIMESTAMPTZ (new timestamp type with time zone).

Despite the increased resolution, the data is still in second precision as AWS S3 provides them only as unix seconds.

Use source_file_timestamp - NOW() instead of DATE_DIFF('second', source_file_timestamp, NOW())

New function added

A new alias ARRAY_TO_STRING has been added to function ARRAY_JOIN.

DB version 3.28

September 2023

Resolved issues

  • IN expressions with scalar arguments now return Postgres-compliant results if there are NULLs in the IN list.

  • information_schema.running_queries returns the ID of the user who issued the running query, not the current user.

  • Updated an error message to explain uppercase behavior.