This guide describes each compression codec available in Firebolt: what it does, when to use it, and how it performs on different types of data.
For practical examples, getting started guidance, and compression best practices, see the Table and Column Compression overview.

General-purpose codecs

LZ4

Syntax: lz4 or lz4()
Description: Fast lossless compression algorithm optimized for speed over compression ratio. Default compression codec in Firebolt.
Parameters: None
Performance characteristics:
  • Compression speed: Very fast
  • Decompression speed: Very fast
  • Typical compression ratio: Good
  • CPU usage: Very low
Best for:
  • High-throughput data ingestion
  • Frequently accessed data requiring fast queries
  • Real-time analytics workloads
  • When CPU resources are limited
Data type compatibility: All types

LZ4HC (LZ4 High Compression)

Syntax: lz4hc(level) where level is 1-12
Description: High-compression variant of LZ4 that trades slower compression for better compression ratios while maintaining fast decompression.
Parameters:
  • level: Compression level (1-12, default: 9)
    • Levels 1-6: Faster compression, lower ratio
    • Levels 7-9: Balanced performance (recommended)
    • Levels 10-12: Maximum compression, slower writes
Performance characteristics:
  • Compression speed: Moderate
  • Decompression speed: Very fast
  • Typical compression ratio: Better than LZ4
  • CPU usage: Low to moderate
Best for:
  • Write-once, read-often workloads where read performance is critical
  • Balanced storage and performance requirements
  • Data with moderate access patterns
Data type compatibility: All types
Examples:
-- Faster compression (upper end of the 1-6 range)
column_name TYPE COMPRESSION lz4hc(6)

-- Maximum LZ4HC compression  
column_name TYPE COMPRESSION lz4hc(12)

ZSTD (Zstandard)

Syntax: zstd(level) where level is 1-22
Description: Modern compression algorithm that balances compression ratio and speed, with highly configurable compression levels.
Parameters:
  • level: Compression level (1-22, default: 1)
    • Levels 1-5: Fast compression, good for online workloads
    • Levels 6-10: Balanced performance
    • Levels 11-15: High compression for archival data
    • Levels 16-22: Maximum compression, very slow
Performance characteristics by level:
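| Level | Compression speed | Compression ratio | Use case |
|-------|-------------------|-------------------|----------|
| 1 | Fast | Good | Real-time data |
| 3 | Fast | Better | General workloads |
| 7 | Moderate | Better | Balanced storage/speed |
| 15 | Slow | Excellent | Archival data |
| 22 | Very slow | Excellent | Maximum compression |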
Best for:
  • General analytics workloads (levels 1-5)
  • Archival data storage (levels 11+)
  • Data with good compressibility (JSON, text, logs)
Data type compatibility: All types
Examples:
-- Fast ZSTD for hot data
log_data TEXT COMPRESSION zstd(3)

-- High compression for cold data
archived_events TEXT COMPRESSION zstd(15)
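
The level trade-off described above can be measured before you commit to a setting. The sketch below is a minimal illustration, not a Firebolt benchmark: it uses Python's standard-library zlib (levels 1-9) as a stand-in for zstd, and the `measure` helper and the synthetic log payload are hypothetical. Absolute numbers will differ from zstd, but the method (compress a representative sample at several levels, then compare size and time) carries over.

```python
import time
import zlib


def measure(data: bytes, level: int):
    """Compress data at the given level; return (compressed size, seconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Lossless codecs must round-trip exactly.
    assert zlib.decompress(compressed) == data
    return len(compressed), elapsed


# Synthetic JSON-like log lines, standing in for a sample of a real column.
log_lines = "".join(
    f'{{"ts": {1700000000 + i}, "level": "INFO", "msg": "request {i % 50} ok"}}\n'
    for i in range(20_000)
).encode()

# zlib stand-in levels: low, default, and maximum.
for level in (1, 6, 9):
    size, seconds = measure(log_lines, level)
    ratio = len(log_lines) / size
    print(f"level {level}: {ratio:.1f}x in {seconds * 1000:.1f} ms")
```

Running the same comparison on a sample of your own data is usually more informative than relying on the general guidance for each level range.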

NONE

Syntax: none
Description: Disables compression entirely. Data is stored uncompressed.
Parameters: None
Performance characteristics:
  • Compression speed: N/A (no compression)
  • Decompression speed: N/A (no decompression)
  • Compression ratio: None (no compression)
  • CPU usage: None
Best for:
  • Already compressed data (images, videos, compressed files)
  • Data that compresses poorly
  • Debugging compression issues
  • Maximum query performance at the cost of storage
Data type compatibility: All types
Limitations: Cannot be chained with other codecs
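
To see why already compressed data is a candidate for none, the following sketch compresses two payloads of equal size. It is an illustration under stated assumptions: zlib from the Python standard library stands in for a general-purpose codec, and seeded pseudo-random bytes stand in for already compressed content such as images or archives, which looks statistically random.

```python
import random
import zlib

# Repetitive text: typical of data that compresses well.
text = b"status=ok region=eu-west latency_ms=12\n" * 2_500

# Seeded pseudo-random bytes stand in for already-compressed content.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(text)))

for label, payload in (("text", text), ("noise", noise)):
    packed = zlib.compress(payload)
    print(f"{label}: {len(payload)} -> {len(packed)} bytes")
```

The random payload does not shrink, so compressing it spends CPU time without saving storage.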

DEFAULT

Syntax: default
Description: Uses system default compression, which is equivalent to lz4.
Parameters: None
Best for:
  • Inheriting table-level compression settings
  • Removing column-specific compression overrides

Specialized (preprocessing) codecs

Delta

Syntax: delta
Description: Stores differences between consecutive values instead of absolute values. Highly effective for sequential or slowly-changing numeric data.
Parameters: None
Performance characteristics:
  • Compression speed: Fast
  • Decompression speed: Fast
  • Compression ratio: Highly variable, excellent for sequential data
  • CPU usage: Very low
Data type compatibility: Integers, dates, timestamps (NOT NULL only)
Best for:
  • Sequential IDs (user_id, order_id, event_id)
  • Monotonic counters
  • Timestamps in time-series data
  • Any slowly changing numeric values
Must be chained with a general-purpose codec.
Examples:
-- Sequential user IDs
user_id INTEGER NOT NULL COMPRESSION (delta, lz4)

-- Timestamp optimization
created_at TIMESTAMPTZ NOT NULL COMPRESSION (delta, zstd(3))
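
The idea behind delta encoding can be sketched in a few lines of Python. This is a conceptual illustration only, not Firebolt's implementation; the function names and sample IDs are hypothetical.

```python
def delta_encode(values):
    """Keep the first value, then store each value minus its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values with a running sum."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


order_ids = [100000, 100001, 100002, 100004, 100005]
encoded = delta_encode(order_ids)
print(encoded)  # [100000, 1, 1, 2, 1]
assert delta_decode(encoded) == order_ids
```

After the first entry, the encoded values are small and repetitive, which is the kind of input a general-purpose codec compresses well.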

DoubleDelta

Syntax: doubledelta
Description: Stores differences of differences, optimized for monotonic sequences with relatively constant intervals (like regular time series).
Parameters: None
Performance characteristics:
  • Compression speed: Fast
  • Decompression speed: Fast
  • Compression ratio: Excellent for regular time series
  • CPU usage: Low
Data type compatibility: Integers, dates, timestamps (NOT NULL only)
Best for:
  • Regular time-series timestamps (every minute, hour, day)
  • Monotonic sequences with consistent intervals
  • Event timestamps in ordered data
Must be chained with a general-purpose codec.
Examples:
-- Regular time series timestamps
timestamp TIMESTAMPTZ NOT NULL COMPRESSION (doubledelta, lz4)

-- Sequential event IDs with consistent gaps
event_sequence INTEGER NOT NULL COMPRESSION (doubledelta, lz4hc(6))
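
A minimal Python sketch of "differences of differences" follows. It illustrates the concept only and makes no claim about Firebolt's on-disk format; the helper names and the sample timestamps are hypothetical.

```python
def diffs(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def double_delta_encode(values):
    """Return (first value, first delta, list of deltas-of-deltas)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    first_order = diffs(values)
    return values[0], first_order[0], diffs(first_order)


def double_delta_decode(first, first_delta, dods):
    """Rebuild values by accumulating the delta, then the value."""
    values = [first, first + first_delta]
    delta = first_delta
    for dod in dods:
        delta += dod
        values.append(values[-1] + delta)
    return values


# One reading per minute (epoch seconds); the fourth arrives 2 seconds late.
timestamps = [1700000000, 1700000060, 1700000120, 1700000182, 1700000240]
first, first_delta, dods = double_delta_encode(timestamps)
print(first_delta, dods)  # 60 [0, 2, -4]
assert double_delta_decode(first, first_delta, dods) == timestamps
```

With perfectly regular intervals every entry in the last list is zero, which is why this codec suits fixed-interval time series.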

Gorilla

Syntax: gorilla
Description: Optimized for floating-point values that change slowly over time. Uses XOR-based encoding to efficiently store small changes between consecutive values.
Parameters: None
Performance characteristics:
  • Compression speed: Moderate
  • Decompression speed: Fast
  • Compression ratio: Excellent for time-series floats
  • CPU usage: Low to moderate
Data type compatibility: FLOAT, DOUBLE, dates, timestamps
Best for:
  • Sensor readings that change gradually
  • Financial time series (prices, rates)
  • IoT device metrics
  • Any slowly-changing floating-point data
Must be chained with a general-purpose codec.
Examples:
-- Temperature sensor readings
temperature FLOAT NOT NULL COMPRESSION (gorilla, lz4)

-- Stock prices  
price DOUBLE NOT NULL COMPRESSION (gorilla, zstd(3))

-- Time series with timestamps
timestamp TIMESTAMPTZ NOT NULL COMPRESSION (gorilla, lz4hc(6))
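
The XOR step mentioned in the description can be illustrated in Python. This sketch covers only that step, under the assumption of 64-bit IEEE 754 values; a full Gorilla-style encoder also packs the XOR results into a compact bit stream, which is omitted here. Function names and sample readings are hypothetical.

```python
import struct


def float_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of a float as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def bits_to_float(bits: int) -> float:
    """Inverse of float_bits."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def xor_encode(values):
    """First raw bit pattern, then each pattern XORed with its predecessor."""
    if not values:
        return []
    patterns = [float_bits(v) for v in values]
    return [patterns[0]] + [b ^ a for a, b in zip(patterns, patterns[1:])]


def xor_decode(encoded):
    """Undo xor_encode by XORing each entry with the previous pattern."""
    if not encoded:
        return []
    patterns = [encoded[0]]
    for x in encoded[1:]:
        patterns.append(patterns[-1] ^ x)
    return [bits_to_float(p) for p in patterns]


readings = [21.5, 21.5, 21.625, 21.75, 21.75]
encoded = xor_encode(readings)
for x in encoded[1:]:
    print(f"{x:064b}".count("1"), "set bits out of 64")
assert xor_decode(encoded) == readings
```

Repeated readings XOR to zero, and nearby readings differ in only a few bits, so the XOR results are mostly zeros.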

Chaining strategies

Most effective combinations for common use cases:
-- Sequential data with fast access
id INTEGER NOT NULL COMPRESSION (delta, lz4)

-- Time series with balanced compression
sensor_value FLOAT NOT NULL COMPRESSION (gorilla, zstd(3))

-- Regular timestamps  
recorded_at TIMESTAMPTZ NOT NULL COMPRESSION (doubledelta, lz4hc(6))
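
The value of chaining a preprocessing codec with a general-purpose codec can be sketched as follows. This is an illustration under stated assumptions, not a measurement of Firebolt: zlib from the Python standard library stands in for lz4 or zstd, values are packed as 64-bit little-endian integers, and all helper names are hypothetical.

```python
import struct
import zlib


def delta_encode(values):
    """Preprocessing stage: first value, then consecutive differences."""
    return values[:1] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Inverse of delta_encode (running sum)."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def pack(values):
    """Serialize integers as 64-bit little-endian values."""
    return struct.pack(f"<{len(values)}q", *values)


def unpack(blob):
    """Inverse of pack."""
    return list(struct.unpack(f"<{len(blob) // 8}q", blob))


ids = list(range(1_000_000, 1_010_000))  # 10,000 sequential IDs

general_only = zlib.compress(pack(ids))            # general-purpose stage only
chained = zlib.compress(pack(delta_encode(ids)))   # preprocessing, then general-purpose

# Reading applies the stages in reverse order.
restored = delta_decode(unpack(zlib.decompress(chained)))
assert restored == ids

print(f"raw: {len(pack(ids))} bytes")
print(f"general-purpose only: {len(general_only)} bytes")
print(f"delta + general-purpose: {len(chained)} bytes")
```

The preprocessing stage does not shrink the data by itself. It reshapes the values so that the general-purpose stage finds far more repetition.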

Three-stage chains (Advanced)

For maximum compression on specific data patterns:
-- Complex time series optimization
timestamp TIMESTAMPTZ NOT NULL COMPRESSION (doubledelta, gorilla, zstd(7))

-- Highly sequential float data
sequence_value DOUBLE NOT NULL COMPRESSION (delta, gorilla, lz4hc(9))
Note: Three-stage chains increase CPU overhead. Test performance before production use.

Backward compatibility

Legacy syntax support

For backward compatibility, the older compression syntax remains supported for lz4 and zstd codecs only:
-- Legacy syntax (supported for lz4 and zstd only)  
CREATE TABLE old_style (
    id INTEGER COMPRESSION zstd COMPRESSION_LEVEL 5,
    name TEXT COMPRESSION lz4
);

-- New syntax (recommended for all features)
CREATE TABLE new_style (
    id INTEGER COMPRESSION (zstd(5)),
    name TEXT COMPRESSION (lz4)
);
Legacy syntax limitations:
  • Only lz4 and zstd codecs supported
  • No specialized codecs (delta, gorilla, doubledelta)
  • No compression chaining capabilities
  • No access to advanced parameters
New syntax benefits:
  • Support for all compression codecs
  • Compression chaining capabilities
  • Access to specialized codecs
  • Consistent parameter syntax
  • Future feature compatibility

See also

I