Run Firebolt from the standalone binaries

For production, run Firebolt on Kubernetes with the Firebolt Operator or the Helm chart. They handle node startup, health checks, upgrades, and metadata for you. Run the raw binaries only when you cannot use Kubernetes or Docker, or for local testing.

This guide shows how to run Firebolt directly from its two binaries, with no Docker and no Kubernetes:

firebolt, the database binary that executes queries. It bundles the planner, runtime, and storage engine.
dedicated-pensieve, an optional standalone metadata service, for running multiple independent Firebolt clusters with decoupled metadata.

Concepts

An Engine is a cluster of one or more nodes that execute queries together. Every node runs the same firebolt binary and is given its position with --node. A query submitted to any node is planned there and its stages are distributed across that Engine’s nodes.

These pages assume one node per machine: one firebolt process on its own host, reached at that host’s address. That is the only model a real deployment should use. You can also place several nodes on a single machine, but only for testing, and it needs extra care to avoid port collisions. That case is covered separately in Colocate multiple nodes on one host.

Table data lives in object storage (S3, GCS, or Azure Blob Storage) as immutable tablets. Nodes read tablets directly from object storage and cache them on local SSD, so adding a node adds compute without moving data. The directory you pass with --data-dir holds only this cache and the node’s configuration; object storage is the single source of truth.

Metadata, the catalog of tables, columns, and tablet locations, is served in one of two modes, set by instance.type in the engine configuration:

Embedded metadata (the default): the Engine hosts its own metadata service, backed by a local SQLite database. No separate process and no Postgres. The metadata belongs to that one Engine and cannot be shared. This is the simplest deployment.
Standalone metadata: the Engine connects to a separate dedicated-pensieve process backed by Postgres. Because the metadata lives outside any single Engine, several Engines can share one catalog and one bucket, and each reads the latest snapshot written by the others. This is the basis of workload isolation: two Engines operate on the same tablets without drawing compute from each other.

Recommended topology

Run one firebolt node per machine. Every node reads and writes the same object storage bucket. The metadata mode determines whether a separate metadata process runs alongside the nodes.

Embedded metadata

    Machine A           Machine B           Machine C
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│   firebolt    │   │   firebolt    │   │   firebolt    │
│   (node 0)    │   │   (node 1)    │   │   (node 2)    │
│  + metadata   │   │               │   │               │
└───────┬───────┘   └───────┬───────┘   └───────┬───────┘
        │                   │                   │
        └───────────────────┼───────────────────┘
                            ▼
                ┌───────────────────────┐
                │    Object storage     │
                │   (S3 / GCS / ...)    │
                └───────────────────────┘

Node 0 hosts the embedded metadata in a local SQLite database. No separate process and no Postgres.

Standalone metadata

    Machine A           Machine B           Machine C
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│   firebolt    │   │   firebolt    │   │   firebolt    │
│   (node 0)    │   │   (node 1)    │   │   (node 2)    │
└───────┬───────┘   └───────┬───────┘   └───────┬───────┘
        │                   │                   │
        └───────────────────┼───────────────────┘
                            │
             ┌──────────────┴──────────────┐
             ▼                             ▼
 ┌───────────────────────┐     ┌───────────────────────┐
 │    Object storage     │     │   Metadata service    │
 │   (S3 / GCS / ...)    │     │ (dedicated-pensieve)  │
 └───────────────────────┘     └───────────────────────┘

Run one dedicated-pensieve process (on one of the machines or a separate host), backed by Postgres. Every node connects to it and to object storage.

Choose a deployment

Deploy with embedded metadata: one Engine, no separate metadata process. Start here.
Deploy with standalone metadata: one or more Engines sharing a Postgres-backed metadata service and one bucket.
Colocate multiple nodes on one host: run several nodes or Engines on one machine for testing only. Most deployments never need this.

How the nodes communicate

A node serves clients over HTTP, exchanges work with its peers over two channels, reaches the metadata service over gRPC, and reads and writes tablets directly to object storage. Each port below is per node. The engine binds them on all interfaces (0.0.0.0) by default, so with one node per machine the defaults never collide and you do not have to change them.

Setting	Default port	Protocol	Direction	Purpose
`--http-port`	3473	HTTP	Client to node	Submit SQL and read results. `/ping` returns `Ok.`
`aragog_port`	5678	gRPC	Node to node	Distributed execution control: schedule, cancel, and discard query stages
`shufflepuff_port`	16000	TCP	Node to node	Data exchange (shuffle) between stages of a distributed query
`storage_manager_port`	1717	gRPC	Node to node	Cluster storage coordination (tablet assignment, statistics). Bound only by the leader
`storage_agent_port`	3434	gRPC	Node to node	Per-node storage and cache agent
`health_check_port`	8122	HTTP	Local	Liveness and readiness probes. Not part of query execution
`prometheus_port`	9090	HTTP	Local	Metrics scrape endpoint. Not part of query execution

The metadata service uses one more port, depending on the mode:

Service	Default port	Protocol	Direction	Purpose
Embedded metadata	6500	gRPC	Node to node	In embedded mode node 0 hosts it; other nodes connect to `<node 0 host>:6500`
Standalone metadata	7000	gRPC	Node to service	In standalone mode every node connects to the `dedicated-pensieve` process
Postgres	5432	Postgres	Service to database	`dedicated-pensieve` stores metadata here
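
Once the node processes are running, the node-to-node ports in the tables above can be probed from a peer machine to confirm they are reachable. The following is a minimal sketch, not a Firebolt tool. It assumes the default ports listed above and an nc (netcat) that supports -z and -w; the function names and the host name in the example are placeholders.

```shell
#!/bin/sh
# Default node-to-node ports from the table above:
# aragog_port, shufflepuff_port, storage_manager_port, storage_agent_port.
NODE_PORTS="5678 16000 1717 3434"

# probe HOST PORT: succeed if a TCP connection opens within 2 seconds.
# Assumption: the installed nc supports -z (connect only) and -w (timeout).
probe() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# check_peer HOST: probe every node-to-node port on HOST and print each result.
# Returns non-zero if at least one port could not be reached.
check_peer() {
  host="$1"
  failed=0
  for port in $NODE_PORTS; do
    if probe "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port UNREACHABLE"
      failed=1
    fi
  done
  return "$failed"
}

# Example with a placeholder host name:
#   check_peer node1.example.internal
```

Because storage_manager_port is bound only by the leader, an unreachable 1717 on a non-leader node does not by itself indicate a firewall problem. In embedded mode, the same probe function can be pointed at port 6500 on the node 0 host.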

Across machines, the firewall must allow the node-to-node ports between Engine nodes, the metadata port from every node to the metadata host, and Postgres from the standalone metadata host to its database. Use the address each node’s peers reach it on, not localhost.

All nodes of one Engine must start concurrently. Each node’s readiness check runs a distributed query that needs its peers to be reachable, so starting one node and waiting for it to become ready before starting the next deadlocks. Start every node of an Engine, then poll each node’s /ping until every node returns the Ok. response.

Stop a node by sending SIGTERM to the process ID that firebolt server start prints (or that you captured when launching it in the background).
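
The polling and shutdown steps can be sketched in shell as follows. This is an illustration under stated assumptions, not part of Firebolt: the node host names, the function names, and the MAX_TRIES and POLL_INTERVAL variables are placeholders, and the check assumes that /ping returns exactly Ok. as its body. The commands that start each node are omitted because their arguments depend on the deployment you choose.

```shell
#!/bin/sh
# Placeholder addresses: use the addresses peers reach each node on, not localhost.
NODES="${NODES:-node0.example.internal node1.example.internal node2.example.internal}"
HTTP_PORT="${HTTP_PORT:-3473}"   # default --http-port

# ping_node HOST: succeed when /ping answers with the body "Ok."
ping_node() {
  [ "$(curl -fsS --max-time 2 "http://$1:$HTTP_PORT/ping" 2>/dev/null)" = "Ok." ]
}

# wait_for_all: poll every node in NODES until all answer.
# Gives up after MAX_TRIES polling rounds (hypothetical knob, default 60).
wait_for_all() {
  pending="$NODES"
  tries=0
  while [ -n "$pending" ]; do
    still=""
    for host in $pending; do
      ping_node "$host" || still="$still $host"
    done
    pending="${still# }"
    [ -z "$pending" ] && break
    tries=$((tries + 1))
    if [ "$tries" -ge "${MAX_TRIES:-60}" ]; then
      echo "not ready: $pending" >&2
      return 1
    fi
    sleep "${POLL_INTERVAL:-2}"
  done
  echo "all nodes ready"
}

# stop_node PID: send SIGTERM to a node process.
stop_node() {
  kill -TERM "$1" || return 1
  # wait only works for children of this shell; ignore its status otherwise.
  wait "$1" 2>/dev/null || true
}
```

Run wait_for_all only after every node of the Engine has been launched. Polling a single node before its peers are started waits on a readiness check that cannot succeed.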

Get the binaries

Put these binaries on your PATH:

firebolt, the engine.
dedicated-pensieve, the standalone metadata service (only needed for standalone metadata).

If Firebolt provides prebuilt standalone archives for your environment, extract each archive and move the binary onto your PATH. Otherwise, build the binaries yourself. The examples in this guide invoke them as firebolt and dedicated-pensieve. The firebolt binary bundles the server and a CLI; every mode here starts a server with firebolt server start. See engine arguments for --data-dir and --server-config, and engine configuration for the YAML file.

Build the binaries yourself

Build on Ubuntu 22.04 or newer.

Tool	Version	Install
Clang	18	`sudo apt-get install clang-18`
Ninja	any	`sudo apt-get install ninja-build`
CMake	3.20+	`sudo apt-get install cmake`
Rust	stable	`curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`

Clone the repository and its submodules, configure once, then build both targets:

git clone --recursive git@github.com:firebolt-db/packdb.git
cd packdb

# One-time: configure the build (creates the build/ directory)
cmake --preset dev

# Build the engine and the standalone metadata service
ninja -C build firebolt dedicated-pensieve

The binaries are written to build/programs/firebolt/firebolt and build/programs/dedicated-pensieve/dedicated-pensieve. Add both directories to your PATH (or substitute the full path in each command):

export PATH="$PWD/build/programs/firebolt:$PWD/build/programs/dedicated-pensieve:$PATH"
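
To confirm that the shell now resolves both binaries, a small check like the one below can help. It is a sketch only: check_bins is a hypothetical helper name, and it relies on nothing beyond the POSIX command -v builtin.

```shell
#!/bin/sh
# check_bins NAME...: print where each binary resolves on PATH.
# Returns non-zero if any of them cannot be found.
check_bins() {
  missing=0
  for bin in "$@"; do
    if path="$(command -v "$bin")"; then
      echo "$bin -> $path"
    else
      echo "$bin not found on PATH" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: dedicated-pensieve is only needed for standalone metadata.
#   check_bins firebolt dedicated-pensieve
```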
