State Management
Firebolt Core supports DDL and DML commands likeCREATE TABLE or INSERT which modify the persistent database state associated with a cluster. More specifically, this persistent database state consists of the following two complementary types of data that are managed by a Firebolt Core cluster.
- The database catalog containing information about schema objects like databases, namespaces, tables, views, indexes, etc.
- The contents of these schema objects such as the actual data inserted into a table.
/firebolt-core/volume/persistent_data/metadata_store directory.
The contents of schema objects (2) are sharded transparently across all nodes of a Firebolt Core cluster. When a query accesses such a schema object, the query execution engine automatically generates a distributed query plan which ensures that all of the corresponding shards are correctly read and processed. When writing to such a schema object, the query execution engine does not actively distribute data across nodes. For example, this means that a plain INSERT statement into a table will usually only insert data to the shard stored on the node to which the query was sent. Users should make sure to route such statements to different nodes in order to ensure an even data distribution (see also Connect).
Since the contents of schema objects are sharded across nodes, changing the number of nodes requires bringing up a new cluster with the updated topology rather than hot-resizing an existing one. When object storage is configured, data is not lost in this process — the new cluster rehydrates from object storage. Without object storage, the shards stored on any removed node are lost permanently. Adding a node results in data skew since there will initially be no shards stored on the new node. Note that this restriction does not apply if the workload does not involve any locally managed tables — for example, a workload that only operates on external or Iceberg tables only persists the database catalog on node 0, with no sharded data on the remaining nodes.
By default, all of the above database state would be written to the writeable container layer of the Firebolt Core Docker containers, which means that it is tied to a specific Docker container that cannot easily be migrated between hosts. In the supported Docker Compose and Kubernetes deployments volumes are mounted to ease preservation of the state; see Deployment and Operational Guide for further details.