CREATE TABLE
or INSERT
which modify the persistent database state associated with a cluster. More specifically, this persistent database state consists of the following two complementary types of data that are managed by a Firebolt Core cluster.
/firebolt-core/volume/persistent_data/metadata_store
directory.
The contents of schema objects (2) are sharded transparently across all nodes of a Firebolt Core cluster. When a query accesses such a schema object, the query execution engine automatically generates a distributed query plan which ensures that all of the corresponding shards are correctly read and processed. When writing to such a schema object, the query execution engine does not actively distribute data across nodes. For example, this means that a plain INSERT
statement into a table will usually only insert data to the shard stored on the node to which the query was sent. Users should make sure to route such statements to different nodes in order to ensure an even data distribution (see also Connect).
Since the contents of schema objects are sharded across nodes, it is generally not possible to change the number of nodes within a Firebolt Core cluster. Attempting to remove a node would result in all shards of data stored on that node to be lost, leading to incorrect query results. Attempting to add a node would result in data skew since there will initially be no shards stored on the added node. Note that this restriction does not apply if the workload running on a Firebolt Core cluster does not involve any database objects that are stored locally on the Firebolt Core nodes. For example, a workload that only operates on external tables or Iceberg tables only persists the database catalog on node 0, but no sharded data on the remaining nodes.
By default, all of the above database state would be written to the writeable container layer of the Firebolt Core Docker containers, which means that it is tied to a specific Docker container that cannot easily be migrated between hosts. In the supported Docker Compose and Kubernetes deployments volumes are mounted to ease preservation of the state; see Deployment and Operational Guide for further details.