Firebolt is one engine that runs the same way everywhere. The query engine (planner, runtime, storage) is a single binary. What changes between a laptop and a 100-node cluster is where metadata lives and how many clusters share the same data on object storage.

Philosophy

  • Minimal complexity. Runs as a single binary on your laptop, or as a 100-node cluster in the cloud. Same binary, same behavior.
  • Universal deployment. Run Firebolt anywhere: bare metal, VMs, Kubernetes, any cloud, any data center.
  • Fewest moving parts. On a laptop, metadata lives in SQLite. On a cluster, it lives in PostgreSQL. Beyond the metadata store, a cluster depends only on object storage, which is durable and fault tolerant.
  • Cloud native when you need it. Kubernetes plus object storage is a full cloud data warehouse: workload isolation, dynamic scaling, self-hosted anywhere.

Separation of storage and compute

Tablets (Firebolt’s storage unit) live in object storage such as S3, GCS, or Azure Blob Storage, and are cached on local SSD with LRU eviction and prefetching. Storage scales independently of compute, and a cluster holds no durable state beyond its cache, so clusters can start, stop, and resize on demand without moving data. Because every cluster reads the same tablets from object storage, the deployment modes below differ only in where metadata lives and how many clusters you run.
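The caching behavior described above can be sketched as a read-through LRU cache. This is a minimal illustration, not Firebolt's implementation: the class name `TabletCache`, the `fetch` callback, and the in-memory `OBJECT_STORE` dictionary are all assumptions made for the example, and prefetching is left out.

```python
from collections import OrderedDict
from typing import Callable, Dict


class TabletCache:
    """Read-through LRU cache keyed by tablet id (hypothetical names)."""

    def __init__(self, capacity: int, fetch: Callable[[str], bytes]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.fetch = fetch  # stand-in for a GET against object storage
        self.entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, tablet_id: str) -> bytes:
        if tablet_id in self.entries:
            self.hits += 1
            self.entries.move_to_end(tablet_id)  # mark as most recently used
            return self.entries[tablet_id]
        self.misses += 1
        data = self.fetch(tablet_id)  # cache miss: read from object storage
        self.entries[tablet_id] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return data


# Simulated object store: the durable copy of every tablet.
OBJECT_STORE: Dict[str, bytes] = {"t1": b"a", "t2": b"b", "t3": b"c"}

cache = TabletCache(capacity=2, fetch=OBJECT_STORE.__getitem__)
cache.get("t1")
cache.get("t2")
cache.get("t1")  # hit; t2 becomes the least recently used entry
cache.get("t3")  # miss; evicts t2
```

Because the cache is only a copy, dropping it loses nothing: a restarted or resized cluster starts with an empty cache and refills it from object storage on demand.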

Deployment modes

Single binary: laptop, dev, embedded

One process holds the query engine and stores metadata in SQLite. Data and cache live on local disk. Zero dependencies.
+-----------------------------------+
| FIREBOLT                          |
| +-------------------------------+ |
| | QUERY ENGINE                  | |
| | planner --> runtime --> store | |
| +-------------------------------+ |
| +-------------------------------+ |
| | SQLITE   (metadata)           | |
| +-------------------------------+ |
+-----------------+-----------------+
                  |
                  v
         +-----------------+
         | LOCAL SSD/DISK  |
         +-----------------+
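
To make the role of the embedded metadata store concrete, here is a small sketch of a tablet catalog kept in SQLite through Python's standard `sqlite3` module. The schema, the function names, and the tablet paths are invented for illustration; Firebolt's actual metadata layout is not described here.

```python
import sqlite3
from typing import List

# Hypothetical catalog schema: one row per tablet, keyed by table name.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tablets (
    table_name  TEXT    NOT NULL,
    tablet_path TEXT    NOT NULL,
    row_count   INTEGER NOT NULL
)
"""


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the catalog; a file path would persist it on local disk."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def register_tablet(conn: sqlite3.Connection, table_name: str,
                    tablet_path: str, row_count: int) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO tablets VALUES (?, ?, ?)",
            (table_name, tablet_path, row_count),
        )


def tablets_for(conn: sqlite3.Connection, table_name: str) -> List[str]:
    rows = conn.execute(
        "SELECT tablet_path FROM tablets WHERE table_name = ? ORDER BY tablet_path",
        (table_name,),
    )
    return [row[0] for row in rows]


catalog = open_catalog()
register_tablet(catalog, "events", "data/events/0001.tablet", 1000)
register_tablet(catalog, "events", "data/events/0002.tablet", 500)
register_tablet(catalog, "users", "data/users/0001.tablet", 42)
print(tablets_for(catalog, "events"))
```

The point of the sketch is that the catalog needs no separate service: the metadata store is a file opened by the same process that runs queries.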

Production: shared metadata, independent clusters

A shared PostgreSQL metadata store coordinates independent clusters. Each cluster keeps its own SSD cache and reads the same tablets from object storage, so ingest, batch, and real-time serving workloads each run in isolation without contending for one another's compute.
                   +-------------------+
                   | POSTGRESQL        |
                   | metadata store    |
                   +---------+---------+
                             |
        +--------------------+--------------------+
        |                    |                    |
 +------+------+      +------+------+      +-------+------+
 | INGEST      |      | BATCH       |      | REAL-TIME    |
 | write       |      | hourly      |      | serving      |
 | SSD cache   |      | SSD cache   |      | SSD cache    |
 +------+------+      +------+------+      +-------+------+
        |                    |                    |
        +--------------------+--------------------+
                             |
                             v
                   +-------------------+
                   | S3 / GCS / ABS    |
                   | object store      |
                   +-------------------+
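
The topology in the diagram can be modeled in a few lines. This is a toy sketch under stated assumptions: `SharedState` stands in for both the PostgreSQL metadata store and the object store, `Cluster` stands in for a compute cluster, and none of these names come from Firebolt.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SharedState:
    """What all clusters share (hypothetical model)."""
    metadata: Dict[str, List[str]] = field(default_factory=dict)  # table -> tablet ids
    object_store: Dict[str, bytes] = field(default_factory=dict)  # tablet id -> bytes


@dataclass
class Cluster:
    """A compute cluster: shared state plus a private, disposable cache."""
    name: str
    shared: SharedState
    cache: Dict[str, bytes] = field(default_factory=dict)

    def write_tablet(self, table: str, tablet_id: str, data: bytes) -> None:
        # Durable write goes to object storage, then is registered in metadata.
        self.shared.object_store[tablet_id] = data
        self.shared.metadata.setdefault(table, []).append(tablet_id)

    def scan(self, table: str) -> List[bytes]:
        out = []
        for tablet_id in self.shared.metadata.get(table, []):
            if tablet_id not in self.cache:  # fill the local cache on a miss
                self.cache[tablet_id] = self.shared.object_store[tablet_id]
            out.append(self.cache[tablet_id])
        return out


shared = SharedState()
ingest = Cluster("ingest", shared)
batch = Cluster("batch", shared)
serving = Cluster("serving", shared)

ingest.write_tablet("events", "events-0001", b"rows-1")
ingest.write_tablet("events", "events-0002", b"rows-2")

result = serving.scan("events")  # serving sees data written by ingest
```

After this run, `serving` has both tablets cached while `batch` has cached nothing, which mirrors the isolation described above: each cluster warms only its own cache, and no data is copied between clusters.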

Kubernetes operator: managed cluster lifecycle

The Firebolt Operator owns cluster lifecycle: autoscaling, zero-downtime upgrades, and cluster creation. Metadata runs in PostgreSQL, managed by the operator or hosted externally. Object storage stays external and durable.
+---------------------------------------------------------------+
| KUBERNETES                                                    |
|  +---------------------------------------------------------+  |
|  | FIREBOLT OPERATOR                                       |  |
|  | autoscaling . upgrades . cluster lifecycle              |  |
|  +----+-----------------+------------------+---------------+  |
|       |                 |                  |                  |
|  +----+-----+     +-----+----+      +------+-----+            |
|  | INGEST   |     | BATCH    |      | SERVING    |            |
|  | SSD cache|     | SSD cache|      | SSD cache  |            |
|  +----------+     +----------+      +------------+            |
|                                                               |
|  +---------------------------------------------------------+  |
|  | POSTGRESQL . metadata (operator-managed or external)    |  |
|  +---------------------------------------------------------+  |
+-------------------------------+-------------------------------+
                                |
                                v
                      +-------------------+
                      | S3 / GCS / ABS    |
                      | object store      |
                      +-------------------+
For setup, see the Helm chart and Firebolt Operator docs.

Managed service

The managed service runs the same engine, operator, and object storage, with Firebolt operating the control plane so you do not run PostgreSQL, the operator, or upgrades yourself. On top of that, it adds an organization-wide layer that single-cluster deployments do not need:
  • Organizations and accounts. An organization contains multiple accounts that isolate environments and teams (for example, dev, staging, and production) while sharing one administrative boundary.
  • Consolidated billing. Usage across all accounts rolls up to the organization, with per-account visibility.
  • Unified identity. One set of users, single sign-on, and multi-factor authentication across accounts, with role-based access control.
  • Regions. Accounts are provisioned into specific cloud regions, and global objects (such as users and roles) span them.
See Managed service for the full object model.
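
As a rough illustration of the organization and account hierarchy with consolidated billing, the sketch below rolls per-account usage up to the organization. The class names, the `usage_units` field, and the region strings are hypothetical and do not reflect the managed service's actual object model or API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Account:
    name: str
    region: str           # each account is provisioned into one region
    usage_units: int = 0  # hypothetical billing unit


@dataclass
class Organization:
    name: str
    accounts: Dict[str, Account] = field(default_factory=dict)

    def add_account(self, name: str, region: str) -> Account:
        if name in self.accounts:
            raise ValueError(f"account {name!r} already exists")
        account = Account(name=name, region=region)
        self.accounts[name] = account
        return account

    def record_usage(self, account_name: str, units: int) -> None:
        self.accounts[account_name].usage_units += units

    def usage_by_account(self) -> Dict[str, int]:
        """Per-account visibility."""
        return {name: acct.usage_units for name, acct in self.accounts.items()}

    def total_usage(self) -> int:
        """Consolidated view: usage rolls up to the organization."""
        return sum(acct.usage_units for acct in self.accounts.values())


org = Organization("example-org")
org.add_account("dev", region="region-a")
org.add_account("production", region="region-b")
org.record_usage("dev", 10)
org.record_usage("production", 5)
org.record_usage("production", 20)
```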

Workload isolation

Independent clusters operate on the same tablets in object storage, so resource-intensive maintenance such as a vacuum or an index backfill can run on a dedicated cluster without drawing compute from the cluster serving queries. This is the practical benefit of separating storage and compute: scale, isolate, and tune each workload without copying data.