For production, run Firebolt on Kubernetes with the Firebolt Operator or the Helm chart. They handle node startup, health checks, upgrades, and metadata for you. Run the raw binaries only when you cannot use Kubernetes or Docker, or for local testing.
This guide runs Firebolt directly from binaries, with no Docker and no Kubernetes:
  • firebolt, the database binary that executes queries. It bundles the planner, runtime, and storage engine.
  • dedicated-pensieve, an optional standalone metadata service, for running multiple independent Firebolt clusters with decoupled metadata.

Concepts

An Engine is a cluster of one or more nodes that execute queries together. Every node runs the same firebolt binary and is given its position with --node. A query submitted to any node is planned there and its stages are distributed across that Engine’s nodes.
These pages assume one node per machine: one firebolt process on its own host, reached at that host’s address. That is the only model a real deployment should use. You can also place several nodes on a single machine, but only for testing, and it needs extra care to avoid port collisions. That case is covered separately in Colocate multiple nodes on one host.
Table data lives in object storage (S3, GCS, or Azure Blob Storage) as immutable tablets. Nodes read tablets directly from object storage and cache them on local SSD, so adding a node adds compute without moving data. The directory you pass with --data-dir holds only this cache and the node’s configuration; object storage is the single source of truth.
Metadata, the catalog of tables, columns, and tablet locations, is served in one of two modes, set by instance.type in the engine configuration:
  • Embedded metadata (the default): the Engine hosts its own metadata service, backed by a local SQLite database. No separate process and no Postgres. The metadata belongs to that one Engine and cannot be shared. This is the simplest deployment.
  • Standalone metadata: the Engine connects to a separate dedicated-pensieve process backed by Postgres. Because the metadata lives outside any single Engine, several Engines can share one catalog and one bucket, and each reads the latest snapshot written by the others. This is the basis of workload isolation: two Engines operate on the same tablets without drawing compute from each other.
Run one firebolt node per machine. Every node reads and writes the same object storage bucket. The metadata mode decides whether there is a separate process.
    Machine A           Machine B           Machine C
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│   firebolt    │   │   firebolt    │   │   firebolt    │
│   (node 0)    │   │   (node 1)    │   │   (node 2)    │
│  + metadata   │   │               │   │               │
└───────┬───────┘   └───────┬───────┘   └───────┬───────┘
        │                   │                   │
        └───────────────────┼───────────────────┘

                ┌───────────────────────┐
                │    Object storage     │
                │   (S3 / GCS / ...)    │
                └───────────────────────┘
Node 0 hosts the embedded metadata in a local SQLite database. No separate process and no Postgres.

Choose a deployment

How the nodes communicate

A node serves clients over HTTP, exchanges work with its peers over two channels, reaches the metadata service over gRPC, and reads and writes tablets directly in object storage. Each port below is per node. The engine binds them on all interfaces (0.0.0.0) by default, so with one node per machine the defaults never collide and you do not have to change them.
| Port | Default | Protocol | Direction | Purpose |
| --- | --- | --- | --- | --- |
| --http-port | 3473 | HTTP | Client to node | Submit SQL and read results. /ping returns Ok. |
| aragog_port | 5678 | gRPC | Node to node | Distributed execution control: schedule, cancel, and discard query stages |
| shufflepuff_port | 16000 | TCP | Node to node | Data exchange (shuffle) between stages of a distributed query |
| storage_manager_port | 1717 | gRPC | Node to node | Cluster storage coordination (tablet assignment, statistics). Bound only by the leader |
| storage_agent_port | 3434 | gRPC | Node to node | Per-node storage and cache agent |
| health_check_port | 8122 | HTTP | Local | Liveness and readiness probes. Not part of query execution |
| prometheus_port | 9090 | HTTP | Local | Metrics scrape endpoint. Not part of query execution |
The metadata service uses one more port, depending on the mode:
| Port | Default | Protocol | Direction | Purpose |
| --- | --- | --- | --- | --- |
| Embedded metadata | 6500 | gRPC | Node to node | In embedded mode node 0 hosts it; other nodes connect to <node 0 host>:6500 |
| Standalone metadata | 7000 | gRPC | Node to service | In standalone mode every node connects to the dedicated-pensieve process |
| Postgres | 5432 | Postgres | Service to database | dedicated-pensieve stores metadata here |
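Once the nodes are running, reachability of the node-to-node ports can be checked from a peer machine. The following is a minimal sketch, not part of Firebolt: the function names are hypothetical, it assumes a netcat (nc) that supports -z and -w, and it uses the default port numbers listed above. It skips storage_manager_port (1717) because only the leader binds it.

```shell
#!/bin/sh
# Default node-to-node ports: aragog_port, shufflepuff_port, storage_agent_port.
# Adjust this list if you changed the defaults in the engine configuration.
NODE_PORTS="5678 16000 3434"

# probe HOST PORT: succeed if a TCP connection can be opened.
# Assumption: nc supports -z (connect only) and -w (timeout in seconds).
probe() {
    nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# check_peer_ports HOST: print one line per port; return 1 if any is unreachable.
check_peer_ports() {
    host=$1
    failed=0
    for port in $NODE_PORTS; do
        if probe "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port unreachable"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, check_peer_ports 10.0.0.2 (a placeholder address) run on one node reports which ports of the peer at that address accept connections. A port only shows as open while the peer's firebolt process is running.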
Across machines, the firewall must allow the node-to-node ports between Engine nodes, the metadata port from every node to the metadata host, and Postgres from the standalone metadata host to its database. Use the address each node’s peers reach it on, not localhost.
All nodes of one Engine must start concurrently. Each node’s readiness check runs a distributed query that needs its peers to be reachable, so starting one node and waiting for it before starting the next deadlocks. Start every node of an Engine, then poll each node’s /ping until all return Ok.
Stop a node by sending SIGTERM to the process ID that firebolt server start prints (or that you captured when launching it in the background).
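The polling step of the start-then-poll sequence described above could be scripted along these lines. This is a sketch under stated assumptions rather than a supported tool: the function names and the POLL_INTERVAL variable are hypothetical, curl is assumed to be installed, the default --http-port 3473 is used, and the /ping response body is assumed to be exactly Ok.

```shell
#!/bin/sh
# ping_ok HOST: succeed if the node's /ping endpoint answers "Ok".
# Assumptions: curl is available, the node listens on the default
# HTTP port 3473, and the response body is exactly "Ok".
ping_ok() {
    [ "$(curl -fsS --max-time 2 "http://$1:3473/ping" 2>/dev/null)" = "Ok" ]
}

# wait_for_engine TRIES HOST...: poll every node in each round and
# return 0 once all of them answer, or 1 after TRIES rounds.
wait_for_engine() {
    tries=$1
    shift
    pending=$#
    while [ "$tries" -gt 0 ]; do
        pending=0
        for host in "$@"; do
            ping_ok "$host" || pending=$((pending + 1))
        done
        if [ "$pending" -eq 0 ]; then
            echo "all nodes ready"
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-2}"
    done
    echo "timed out: $pending node(s) not ready" >&2
    return 1
}
```

After launching firebolt server start on every machine, a call such as wait_for_engine 30 10.0.0.1 10.0.0.2 10.0.0.3 (placeholder addresses) polls all three nodes in each round. It checks every node on every pass instead of waiting for one node before moving to the next, which matches the requirement that no node be waited on in isolation.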

Get the binaries

Download the prebuilt binaries for your platform and put them on your PATH:
  • firebolt, the engine.
  • dedicated-pensieve, the standalone metadata service (only needed for standalone metadata).
Latest prebuilt binaries (from the firebolt-db/packdb releases):
| Platform | Engine (firebolt) | Standalone metadata (dedicated-pensieve) |
| --- | --- | --- |
| Linux x86-64 (amd64) | firebolt-core-amd64.tar.gz | dedicated-pensieve-amd64.tar.gz |
| Linux ARM64 (aarch64) | firebolt-core-arm64.tar.gz | dedicated-pensieve-arm64.tar.gz |
Extract each archive and move the binary onto your PATH. The engine archive unpacks to a firebolt-core-<arch>/ directory containing firebolt. Alternatively, build the binaries yourself (see below).
The examples in this guide invoke the binaries as firebolt and dedicated-pensieve. The firebolt binary bundles the server and a CLI; every mode here starts a server with firebolt server start. See engine arguments for --data-dir and --server-config, and engine configuration for the YAML file.
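Extracting the engine archive and installing the binary might be scripted as follows. This is a sketch, not an official installer: the function names are hypothetical, and it assumes that <arch> in the unpacked directory name matches the suffix in the archive name (amd64 or arm64). The layout of the dedicated-pensieve archive is not described here, so the sketch covers only the engine.

```shell
#!/bin/sh
# arch_suffix MACHINE: map `uname -m` output to the archive name suffix.
arch_suffix() {
    case "$1" in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# install_engine MACHINE ARCHIVE BIN_DIR: unpack the engine archive into a
# temporary directory and copy the firebolt binary into BIN_DIR.
# Assumption: the archive unpacks to firebolt-core-<suffix>/firebolt.
install_engine() {
    suffix=$(arch_suffix "$1") || return 1
    archive=$2
    bin_dir=$3
    workdir=$(mktemp -d) || return 1
    tar -xzf "$archive" -C "$workdir" &&
        install -m 0755 "$workdir/firebolt-core-$suffix/firebolt" "$bin_dir/firebolt"
    status=$?
    rm -rf "$workdir"
    return "$status"
}
```

A typical call would be install_engine "$(uname -m)" firebolt-core-amd64.tar.gz "$HOME/bin", where the target directory is a placeholder that must already exist and be on your PATH.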

Build the binaries yourself

Build on Ubuntu 22.04 or newer.
Required tools (tool, version, install command):
  • Clang, 18: sudo apt-get install clang-18
  • Ninja, any version: sudo apt-get install ninja-build
  • CMake, 3.20+: sudo apt-get install cmake
  • Rust, stable: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Clone the repository and its submodules, configure once, then build both targets:
git clone --recursive git@github.com:firebolt-db/packdb.git
cd packdb

# One-time: configure the build (creates the build/ directory)
cmake --preset dev

# Build the engine and the standalone metadata service
ninja -C build firebolt dedicated-pensieve
The binaries are written to build/programs/firebolt/firebolt and build/programs/dedicated-pensieve/dedicated-pensieve. Add both directories to your PATH (or substitute the full path in each command):
export PATH="$PWD/build/programs/firebolt:$PWD/build/programs/dedicated-pensieve:$PATH"
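Before starting a server, it can help to confirm that both binaries resolve on your PATH. The helper below is a small sketch with a hypothetical function name; it relies only on the POSIX command -v builtin.

```shell
#!/bin/sh
# missing_binaries NAME...: print each name that does not resolve on PATH.
# Returns 0 when every name is found, 1 otherwise.
missing_binaries() {
    missing=0
    for name in "$@"; do
        if ! command -v "$name" >/dev/null 2>&1; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

Running missing_binaries firebolt dedicated-pensieve prints nothing and succeeds when both are found. If you only use embedded metadata, checking firebolt alone is enough.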