Skip to main content

Overview

Firebolt allows users to specify a custom compression algorithm at both the table level and the column level when creating a table. Compression helps you reduce storage costs and improve query performance by decreasing disk usage and I/O.

Supported compression algorithms

Firebolt supports the following compression algorithms:
  • lz4– Default option; fast compression and decompression, ideal for general workloads.
  • zstd– Higher compression ratios at the cost of increased CPU usage; useful for large datasets.
Disabling compression with NONE is not supported. Firebolt always compresses data using one of the supported algorithms.

Specify compression settings

You can specify compression settings for a table, columns, or a combination of both when creating tables:

Table-level compression

You specify table-level compression using the WITH clause in the CREATE TABLE statement. This compression applies to all columns unless explicitly overridden at the column level.

Syntax:

Use the following syntax to set table-level and/or column-level compression.
CREATE TABLE table_name (
    column1 data_type COMPRESSION algorithm COMPRESSION_LEVEL level,
    column2 data_type COMPRESSION algorithm,
    column3 data_type
)
WITH (COMPRESSION = algorithm, COMPRESSION_LEVEL = level);

Compression level

  • The optional COMPRESSION_LEVEL parameter lets you fine-tune the balance between compression ratio and CPU usage.
  • Higher compression levels provide greater compression but increase CPU overhead.
  • Firebolt applies algorithm-specific default levels if no level is specified.

Default behavior

  • If you don’t specify column-level compression, Firebolt applies the table-level setting.
  • If no table-level setting is provided, Firebolt defaults to lz4.

Example

The following code creates a table with ZTSD compression:
CREATE TABLE sales (a INT) WITH (COMPRESSION=zstd, COMPRESSION_LEVEL=1);

Column-level compression

Column-level compression allows you to override table-level compression settings for specific columns. This approach is useful for columns that have different compression requirements, such as large text fields or numerical columns.

Syntax:

CREATE TABLE table_name (
    column1 data_type COMPRESSION algorithm COMPRESSION_LEVEL level,
    column2 data_type COMPRESSION algorithm,
    column3 data_type
);
Compression level
  • The compression_level parameter fine-tunes compression efficiency versus CPU usage.
  • Higher compresson levels reduce storage but consume more CPU resources during compression and decompression.
  • If not specified, Firebolt applies a default compression level appropriate for the selected algorithm.
Default behavior
  • If you do not specify compression explicitly, Firebolt applies the default lz4 compression algorithm.
  • When you define table-level compression, all columns inherit this setting unless overridden with column-level compression.

Step-by-step example

The following example shows how to use table-level and column-level compression in Firebolt. Step 1: Create and populate table with default compression Create a base table without custom compression and insert 10000 identical rows into it.
CREATE TABLE table_default_compression (a int, b int, c string, d string);
-- insert 10000 rows
INSERT INTO table_default_compression SELECT 1, 2, 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.', 'In dapibus metus quis velit mattis dignissim.' FROM GENERATE_SERIES(1,10000) ;
Step 2: Apply table-level compression Create a table that applies ZSTD compression at level 1 to all columns via table-level setting.
This example uses ZSTD with compression level 1. You can use a higher compression level to achieve better compression at the cost of increased CPU usage. However, ZSTD compression levels higher than 3 typically reach the point of diminishing returns.
CREATE TABLE table_zstd1(a int, b int, c string, d string) WITH (COMPRESSION = zstd, COMPRESSION_LEVEL = 1);
Step 3: Apply column-level compression Create tables that apply ZSTD compression at level 1 to different sets of columns:
CREATE TABLE table_c_column_zstd1(a int, b int, c string COMPRESSION zstd COMPRESSION_LEVEL 1, d string);
CREATE TABLE table_c_d_column_zstd1(a int, b int, c string COMPRESSION zstd COMPRESSION_LEVEL 1, d string COMPRESSION zstd COMPRESSION_LEVEL 1);
Step 4: Insert the data Insert data into these tables from the table_default_compression table:
INSERT INTO table_zstd1 SELECT * FROM table_default_compression;
INSERT INTO table_c_column_zstd1 SELECT * FROM table_default_compression;
INSERT INTO table_c_d_column_zstd1 SELECT * FROM table_default_compression;
Step 5: Compare compression results Check the compression ratios of the tables:
SELECT
	table_name, ROUND(uncompressed_bytes::DOUBLE / compressed_bytes, 1) as compression_ratio, compressed_bytes
FROM information_schema.tables
WHERE
	table_name IN ('table_default_compression', 'table_zstd1', 'table_c_column_zstd1', 'table_c_d_column_zstd1')
ORDER BY table_name;
table_namecompression_ratiocompressed_bytes
table_c_column_zstd1449.42559
table_c_d_column_zstd11505.2764
table_default_compression2414772
table_zstd12948.7780
As demonstrated in these examples, column-level compression significantly reduces storage for specific columns. ZSTD typically offers higher compression efficiency compared to the default LZ4.