Overview

Firebolt allows users to specify a custom compression algorithm at both the table level and the column level when creating a table. Compression helps you reduce storage costs and improve query performance by decreasing disk usage and I/O.

Supported compression algorithms

Firebolt supports the following compression algorithms:
  • lz4– Default option; fast compression and decompression, ideal for general workloads.
  • zstd– Higher compression ratios at the cost of increased CPU usage; useful for large datasets.
Disabling compression with NONE is not supported. Firebolt always compresses data using one of the supported algorithms.

Specify compression settings

You can specify compression settings for a table, columns, or a combination of both when creating tables:

Table-level compression

You specify table-level compression using the WITH clause in the CREATE TABLE statement. This compression applies to all columns unless explicitly overriden at the column level.
Syntax:
Use the following syntax to set table-level and/or column-level compression.
CREATE TABLE table_name (
    column1 data_type COMPRESSION algorithm COMPRESSION_LEVEL level,
    column2 data_type COMPRESSION algorithm,
    column3 data_type
)
WITH (COMPRESSION = algorithm, COMPRESSION_LEVEL =level);
Compression level
  • The optional COMPRESSION_LEVEL parameter lets you fine-tune the balance between compression efficiency and CPU usage.
  • Higher values provide greater comparison but increase CPU overhead.
  • Firebolt uses algorithm-specific default levels if no level is specified.
Default behavior
  • If you don’t specify column-level compression. Firebolt applies the table-level setting.
  • If no table-level setting is provided, Firebolt defaults to lz4 .
Example:
The following code creates a table with ZTSD compression:
CREATE TABLE sales (
    a INT
) WITH (compression=zstd, compression_level=15);

Column-level compression

Column-level compression allows you to override table-level compression settings for specific columns. This approach is useful for columns that have different compression requirements, such as large text fields or numerical columns.
Syntax:
CREATE TABLE table_name (
    column1 data_type COMPRESSION algorithm COMPRESSION_LEVEL level,
    column2 data_type COMPRESSION algorithm,
    column3 data_type
);
Compression level
  • The compression_level parameter fine-tunes compression efficiency versus CPU usage.
  • Higher compresson levels reduce storage but consume more CPU resources during compression and decompression.
  • If not specified, Firebolt applies a default compression level appropriate for the selected algorithm.
Default behavior
  • If you do not specify compression explicitly, Firebolt applies the default lz4 compression algorithm.
  • When you define table-level compression, all columns inherit this setting unless overriden with column-level compression.

Step-by-step examples of compression

The following examples show how to use table-level and column-level compression in Firebolt. Step 1: Create and populate an uncompressed table Create a base table without custom compression and populate it with data using GENERATE_SERIES
-- create the target table
CREATE TABLE origin_no_compression (a int, b int, c string, d string);
-- insert 10000 rows
INSERT INTO origin_no_compression SELECT 1, 2, 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.', 'In dapibus metus quis velit mattis dignissim.' FROM GENERATE_SERIES(1,10000) ;
Step 2: Apply table-level compression Create a new table using ZTSD compression and compression level 15, applying the setting to all columns:
CREATE TABLE table_a_column_level_zstd(a int COMPRESSION zstd COMPRESSION_LEVEL 15, b int, c string, d string);
CREATE TABLE table_a_b_column_level_zstd(a int COMPRESSION zstd COMPRESSION_LEVEL 15, b int COMPRESSION zstd COMPRESSION_LEVEL 15, c string, d string);
CREATE TABLE table_c_column_level_zstd(a int , b int, c string COMPRESSION zstd COMPRESSION_LEVEL 15, d string);
CREATE TABLE table_c_d_column_level_zstd(a int , b int, c string COMPRESSION zstd COMPRESSION_LEVEL 15, d string COMPRESSION zstd COMPRESSION_LEVEL 15);
Insert data into these tables from the origin_no_compression table:
INSERT INTO table_a_column_level_zstd SELECT * FROM origin_no_compression;
INSERT INTO table_a_b_column_level_zstd SELECT * FROM origin_no_compression;
INSERT INTO table_c_column_level_zstd SELECT * FROM origin_no_compression;
INSERT INTO table_c_d_column_level_zstd SELECT * FROM origin_no_compression;
Step 3: Apply column-level compression Next, create tables demonstrating column-level compression configuration:
CREATE TABLE table_a_b_column_level_zstd (
    a INT COMPRESSION zstd COMPRESSION_LEVEL 15,
    b INT COMPRESSION zstd COMPRESSION_LEVEL 15,
    c STRING,
    d STRING
);
Step 4: Compare compression results Check the compressed_byte value of the tables to evaluate the effectiveness of compression settings:
SELECT compressed_bytes,table_name
FROM information_schema.tables
WHERE table_name = 'origin_no_compression'
   OR table_name = 'table_a_column_level_zstd'
   OR table_name = 'table_a_b_column_level_zstd'
   OR table_name = 'table_c_column_level_zstd'
   OR table_name = 'table_c_d_column_level_zstd'
ORDER BY table_name;

Example results

These are typical results from a single-node engine, showing how different compression methods and column-level settings affect table sizes:
compressed_bytestable_name
4772origin_no_compression
4396table_a_b_column_level_zstd
4584table_a_column_level_zstd
2558table_c_column_level_zstd
763table_c_d_column_level_zstd
Column-level compression significantly reduces storage for specific columns, as demonstrated in these examples, and ZSTD typically offers higher compression efficiency compared to the default LZ4.