Firebolt Core is all about high-throughput, low-latency data processing. As outlined in Connect, all communication with Firebolt Core ultimately occurs through SQL, and data processing is no exception. Firebolt Core generally supports the full SQL dialect documented in our SQL reference, with the exception of a few features that are specific to our managed Cloud data warehouse (e.g. RBAC or engine management commands). A complete list of differences between Firebolt Core and the managed Firebolt Cloud data warehouse can be found in Differences between Firebolt Core and managed Firebolt. The remainder of this page focuses specifically on importing, managing, and exporting data in Firebolt Core.

Importing External Data

External data can be imported into Firebolt Core from different object storage providers. In the default configuration, you can import data from Amazon S3 and Google Cloud Storage. In addition, Firebolt Core can work with any S3-compatible object store like MinIO or Cloudflare R2. To enable this, you can use the default_s3_endpoint_override config property in the Firebolt Core Configuration File.
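For reference, a minimal sketch of such an override is shown below, assuming a JSON configuration file with the property at the top level; the endpoint URL is a placeholder, and the authoritative file location and structure are described in the Firebolt Core Configuration File page.

    {
      "default_s3_endpoint_override": "https://minio.example.internal:9000"
    }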
  • Raw data files in many different formats can be imported directly from object storage.
    • The easiest way to access such data is to import it into a SQL table using COPY FROM. COPY FROM supports many convenience features, such as schema discovery and metadata filtering, and can easily adapt to different data loading workflows (see the sketch after this list).
    • Alternatively, you can create an external table encompassing all relevant data files. This has the advantage that no data is persisted (and thus duplicated) on the Firebolt Core cluster itself, but external tables offer fewer convenience features than COPY FROM.
    • Data files can also be read directly with the read_parquet(..) or read_csv(..) table-valued functions.
  • Apache Iceberg tables can be read through the read_iceberg(..) table-valued function.
    • We currently support a subset of Iceberg catalogs, including file-based catalogs, the Databricks Unity catalog, the AWS Glue catalog, and any other catalog that implements the Iceberg REST Catalog API.
    • The data files of the Iceberg table may be stored in any of the supported object stores.
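For illustration, the following sketch shows each of these options on a hypothetical Parquet dataset. Bucket names, paths, tables, and columns are placeholders, and the exact option and parameter names (e.g. for read_iceberg(..)) should be checked against the COPY FROM, CREATE EXTERNAL TABLE, and table-valued function reference pages.

    -- 1. Load Parquet files into a managed table with COPY FROM;
    --    if the target table does not exist yet, schema discovery can create it.
    COPY INTO trips
    FROM 's3://my-bucket/trips/'
    WITH PATTERN = '*.parquet' TYPE = PARQUET;

    -- 2. Expose the same files through an external table; no data is persisted
    --    on the Firebolt Core cluster itself.
    CREATE EXTERNAL TABLE trips_ext (
        trip_id BIGINT,
        fare    DOUBLE PRECISION
    )
    URL = 's3://my-bucket/trips/'
    OBJECT_PATTERN = '*.parquet'
    TYPE = (PARQUET);

    SELECT count(*) FROM trips_ext;

    -- 3. Read files directly with a table-valued function.
    SELECT * FROM read_parquet(url => 's3://my-bucket/trips/2024/01.parquet') LIMIT 10;

    -- 4. Read an Apache Iceberg table; shown here with a file-based catalog,
    --    the parameters for other catalog types are listed in the read_iceberg(..) reference.
    SELECT count(*) FROM read_iceberg(location => 's3://my-bucket/iceberg/trips/');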
Note that data stored on Google Cloud Storage can currently only be accessed through the S3 interoperability layer exposed by GCS. To access such data from Firebolt Core, navigate to the “Access keys for your user account” section of the Interoperability tab in your Cloud Storage settings and generate an access key & secret there. These then need to be specified as the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY parameters of the respective SQL command or function (e.g. CREATE LOCATION, CREATE EXTERNAL TABLE, or read_iceberg(..)).
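For example, the sketch below passes such a key & secret to CREATE EXTERNAL TABLE. The bucket URL, columns, and credential values are placeholders, and the exact clause syntax (including the expected URL scheme for GCS buckets) should be checked against the CREATE EXTERNAL TABLE reference.

    -- Hypothetical example: access Parquet files in a GCS bucket through the
    -- S3 interoperability layer, authenticating with a GCS HMAC access key & secret.
    CREATE EXTERNAL TABLE events_ext (
        event_id   BIGINT,
        event_time TIMESTAMP
    )
    CREDENTIALS = (
        AWS_ACCESS_KEY_ID = '<gcs-hmac-access-key>'
        AWS_SECRET_ACCESS_KEY = '<gcs-hmac-secret>'
    )
    URL = 'gs://my-gcs-bucket/events/'   -- placeholder bucket & path
    OBJECT_PATTERN = '*.parquet'
    TYPE = (PARQUET);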

Managing Metadata & Data

In addition to processing external data, Firebolt Core can also manage relational data itself. Most DDL and all DML commands are supported in Firebolt Core for this purpose. It is important to note, however, that Firebolt Core provides no separation of compute and storage. In other words, data managed by one Firebolt Core cluster cannot be shared with any other Firebolt Core cluster. Furthermore, tables are sharded across all nodes, which means that a Firebolt Core cluster containing such tables cannot be resized to a different number of nodes (see also Differences between Firebolt Core and managed Firebolt). Please refer to the Deployment and Operational Guide for further details on setting up persistent storage for the managed data on Firebolt Core nodes.
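As a minimal illustration, the statements below create and populate a managed table directly in Firebolt Core; the table definition and data are purely hypothetical.

    -- Create a managed table (its data is sharded across all nodes of the cluster).
    CREATE TABLE IF NOT EXISTS page_views (
        visit_id   BIGINT,
        url        TEXT,
        visited_at TIMESTAMP
    ) PRIMARY INDEX visit_id;

    -- DML commands are fully supported on managed tables.
    INSERT INTO page_views VALUES
        (1, 'https://example.com/home', '2024-01-01 12:00:00'),
        (2, 'https://example.com/docs', '2024-01-01 12:05:00');

    DELETE FROM page_views WHERE visited_at < TIMESTAMP '2024-01-01 12:01:00';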

Exporting Data

Data can be exported from Firebolt Core through the following means.
  • COPY TO writes raw data files to Amazon S3 or Google Cloud Storage (see the sketch below).
  • Alternatively, you can process query results within your client application (see Connect). If the only goal is to persist query results to raw data files (e.g. in an ETL pipeline), doing this in the client will generally be slower than using COPY TO due to the added cost of serializing and transferring data to the client.
Analogously to reading data, writing data to Google Cloud Storage currently goes through the S3 interoperability layer exposed by GCS and requires a suitable access key & secret (see above for details).
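As a hypothetical sketch, the statement below exports a query result with COPY TO; the bucket, path, and source table are placeholders, and the full set of output options is listed in the COPY TO reference.

    -- Export a query result as Parquet files to an S3 bucket.
    COPY (SELECT trip_id, fare FROM trips WHERE fare > 0)
    TO 's3://my-bucket/exports/trips/'
    TYPE = PARQUET;

    -- When writing to Google Cloud Storage, additionally pass the access key & secret
    -- generated for the S3 interoperability layer as AWS_ACCESS_KEY_ID and
    -- AWS_SECRET_ACCESS_KEY in a CREDENTIALS clause, as described above.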

Examples

The Firebolt Core GitHub repository contains examples of the different ways to ingest and export data in Firebolt Core.