Get started with Firebolt
Welcome to the beginning of your journey with Firebolt! This tutorial guides you through setting up your Firebolt account, creating your database, and importing a sample dataset. Following that, we’ll execute some basic analytics queries to demonstrate the power and speed of Firebolt analytics.
For those seeking a more interactive learning experience, we invite you to join one of our comprehensive, instructor-led Firebolt workshops. These hands-on sessions are crafted to deepen your understanding of Firebolt’s capabilities and how to leverage them for your data analytics needs. Secure your spot here and elevate your Firebolt skills.
- Create a Firebolt Account
- Create Your First Database and Engine
- Execute Your First Query
- Add Data To Your Database
- Next Steps
Create a Firebolt Account
- Select ‘Get Started’ after completing the registration form at go.firebolt.io
- Select ‘Verify’ on the confirmation email you receive. You should see a verified screen.
- Type in your email and password and click ‘Log In’
- Optionally, rename your account.
And that’s it! You’re ready to get started with Firebolt.
New accounts get 600 Firebolt credits ($200+) to get started exploring with Firebolt. Once you run out of credits, we recommend you connect Firebolt with your AWS Marketplace account and get back to making the most of Firebolt. See Register through AWS Marketplace below.
Create Your First Database and Engine
Embarking on your data journey with Firebolt begins with creating a database and selecting an engine tailored to your specific workload. An engine in Firebolt provides the compute resources dedicated to a database for executing tasks (see Understanding Engines to learn more). By default, every database starts with one general purpose engine capable of writing to the Firebolt file format (F3) for both data ingestion and analytics queries. For more information on using Firebolt engines, see Working with engines.
Steps to Create Your Database:
- Click the + next to Databases
- Click Create new database.
- Enter the name for your database in the Database Name field. For the purposes of this guide, we’ll use “Tutorial_Database” as our database name.
Steps to Create Your Engine:
- Click the + next to Databases again.
- Click Create new engine.
- Enter the name of your engine in the New engine name field. For the purposes of this guide, we’ll use “Tutorial_Engine” as our engine name.
The default configuration is a small node, which is more than enough for this tutorial. To learn more about proper sizing of nodes for your workload, see [ADD DOC REFERENCE]
Execute Your First Query
Before we dive into ingesting sample data, let’s familiarize ourselves with the SQL workspace of your database. This initial step not only demonstrates how to activate your engine but also acquaints you with the process of executing queries within Firebolt. For a deeper dive into the SQL workspace and its capabilities, our Query Data guide is an excellent resource.
Steps to Access Your Database, Activate the Engine, and Execute Your First Query:
- From the Database page, locate the database you’ve previously created (“Tutorial_Database”). Click the Open in SQL workspace icon (>_).
- Upon entering the SQL workspace, you’ll be greeted by the “Script 1” tab. This is where you’ll craft your queries. To get started, enter the following simple query, designed to fetch a list of databases associated with your account:
SHOW DATABASES;
- Select Run Script to execute the query. You’ll notice the Using dropdown menu, which displays the engine Firebolt employs to run your script. In this instance, it will indicate “Tutorial_Engine”.
At this time, Firebolt may prompt you to initiate your engine if it’s not already active. Select Start Engine to start your engine. Engine startup typically requires a few moments to complete, preparing your environment for data analysis.
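If you prefer to control engines from a script rather than the prompt, the engine lifecycle is also exposed as SQL. A minimal sketch, assuming the Tutorial_Engine created earlier (check the engine commands reference for the exact syntax supported by your account):

```sql
-- Start the engine before running queries (equivalent to clicking Start Engine).
START ENGINE Tutorial_Engine;

-- Stop it when you are done to avoid consuming credits.
STOP ENGINE Tutorial_Engine;
```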
By following these steps, you’ve not only executed your first query but also successfully set the stage for more advanced data exploration. This is just the beginning of what’s possible with Firebolt - let’s continue by adding some data.
Add Data To Your Database
Now that you can select databases, select engines, and run queries, let’s add some sample data. For this tutorial we will explore adding data in two ways:
- Option 1: Use COPY FROM
- Option 2: Create and use an external table
Option 1: Use COPY FROM
COPY FROM allows you to copy data directly into a Firebolt table. For more information, see COPY FROM.
Step 1: Use COPY FROM to ingest data
You can use the COPY FROM command to copy data from the S3 bucket directly into Firebolt. During this operation, Firebolt ingests the data from your source into Firebolt.
- Choose the plus symbol (+) next to Script 1 to create a new script tab in the SQL workspace.
- Copy and paste the query below into the new tab.
COPY INTO Tutorial FROM 's3://firebolt-publishing-public/samples/green_taxi/';
You can specify a role or AWS key credentials with permission to read from the S3 location using ‘CREDENTIALS =’, but we are using a publicly accessible dataset for this tutorial.
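The sample bucket is public, but for your own data you would typically need credentials. A sketch of the same ingest against a hypothetical private bucket and IAM role (both placeholders; check the COPY FROM reference for the exact option syntax):

```sql
-- Hypothetical private bucket and role ARN; replace with your own.
COPY INTO Tutorial
FROM 's3://my-private-bucket/green_taxi/'
WITH CREDENTIALS = (AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-firebolt-role');
```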
Step 2: Query the ingested data
To verify that you inserted the data into the table, run a simple SELECT query like the one below.
SELECT * FROM Tutorial;
And that’s it! You’ve successfully copied data from S3 into a Firebolt table.
Option 2: Create and use an external table
Step 1: Create an External Table
An external table is a special, virtual table that serves as a connector to your data source. After the external table is created, you ingest data by running an INSERT command from that external table into a fact table or dimension table. The INSERT command must run on a general purpose engine. After the data is available in fact and dimension tables, you can run analytics queries over those tables using any engine.
Although it’s possible, we don’t recommend running analytics queries on external tables. For more information on working with external tables, see Work with external tables. For more information on fact tables, see Working with tables.
Create Your External Table:
- Choose the plus symbol (+) next to the script tab to create a new script tab in the SQL workspace.
- Copy and paste the query below into the new tab.
CREATE EXTERNAL TABLE IF NOT EXISTS ex_levels (
-- Column definitions map to data fields
-- in the source data file and are specified
-- as sources in the INSERT INTO statement for ingestion.
LevelID INTEGER,
GameID INTEGER,
Level INTEGER,
Name TEXT,
LevelType TEXT,
NextLevel INTEGER NULL,
MinPointsToPass INTEGER,
MaxPoints INTEGER,
NumberOfLaps INTEGER,
MaxPlayers INTEGER,
MinPlayers INTEGER,
PointsPerLap REAL,
MusicTrack TEXT,
SceneDetails TEXT,
MaxPlayTimeSeconds INTEGER,
LevelIcon TEXT
)
URL = 's3://firebolt-publishing-public/help_center_assets/firebolt_sample_dataset/'
-- CREDENTIALS = ( AWS_KEY_ID = '******' AWS_SECRET_KEY = '******' )
OBJECT_PATTERN = 'help_center_assets/firebolt_sample_dataset/levels.csv'
TYPE = (CSV SKIP_HEADER_ROWS = 1);
‘URL =’ specifies the data source in S3. All files in that location that match the ‘OBJECT_PATTERN’ will be processed during ingestion. ‘CREDENTIALS =’ specifies a role or AWS key credentials with permission to read from the S3 location. These credentials are commented out for this tutorial because the bucket is publicly accessible.
- Click Run. When completed, the external table ex_levels appears on the object panel of the database.
- [Optional] Choose the vertical ellipses next to the script, choose Rename script, enter a name (for example, MyExTableScript), and then press ENTER to update the name.
Step 2: Create a Fact Table
In this step, you’ll create a Firebolt fact table called levels, which you’ll use in the next step as the target for an INSERT INTO command.
When creating a fact or dimension table, you will specify a primary index. Firebolt uses the primary index when it ingests data, sorting the data as it is saved to S3 so that it can be pruned efficiently when queried. A primary index is required when creating a fact table, and recommended for dimension tables. For more information, see Using primary indexes.
The fact table that we create in this step specifies the LevelID column for the primary index. For more information about choosing columns for a primary index, see How to choose primary index columns.
To create a fact table:
- Create a new script tab.
- Copy and paste the query below into the script tab.
CREATE FACT TABLE IF NOT EXISTS levels (
LevelID INTEGER,
GameID INTEGER,
Level INTEGER,
Name TEXT,
LevelType TEXT,
NextLevel INTEGER NULL,
MinPointsToPass INTEGER,
MaxPoints INTEGER,
NumberOfLaps INTEGER,
MaxPlayers INTEGER,
MinPlayers INTEGER,
PointsPerLap REAL,
MusicTrack TEXT,
SceneDetails TEXT,
MaxPlayTimeSeconds INTEGER,
LevelIcon TEXT,
SOURCE_FILE_NAME TEXT,
SOURCE_FILE_TIMESTAMP TIMESTAMP
)
PRIMARY INDEX LevelID;
- Click Run Script. When finished, the table levels appears on the object panel of the database.
Step 3: Use INSERT INTO to ingest data
You can now use the INSERT INTO command to copy the data from the external table into the fact table. During this operation, Firebolt ingests the data from your source into Firebolt.
To run an INSERT command that ingests data:
- Create a new script tab.
- Copy and paste the query below into the script tab.
INSERT INTO levels
SELECT
LevelID, GameID, Level, Name, LevelType, NextLevel,
MinPointsToPass, MaxPoints, NumberOfLaps, MaxPlayers, MinPlayers,
PointsPerLap, MusicTrack, SceneDetails, MaxPlayTimeSeconds, LevelIcon,
$SOURCE_FILE_NAME, $SOURCE_FILE_TIMESTAMP
FROM ex_levels;
- Click Run Script to execute the script.
The query results pane indicates a status of Running while the query is executing. The status changes to Success when the ingestion is complete.
Step 4: Query the ingested data
Now that the data has been ingested into the levels table, you can run analytics queries over the table that benefit from the speed and efficiency of Firebolt.
To verify that you inserted the data into the table, run a simple SELECT query like the one below.
SELECT * FROM levels;
The values in the query results pane should reflect the ingested sample data.
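With the data in a managed table, you can go beyond SELECT * and run aggregations directly. For example, this query (using only columns defined in the levels table above) summarizes laps per level type:

```sql
-- Average lap count and number of levels per level type.
SELECT
LevelType,
COUNT(*) AS level_count,
AVG(NumberOfLaps) AS avg_laps
FROM levels
GROUP BY LevelType;
```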
Next Steps
Now that you have successfully created your first engine and database in Firebolt, run your first query, and copied data into Firebolt, you can start exploring what else Firebolt has to offer! Below are a few examples to get you started.
Register through AWS Marketplace
This registration is a prerequisite for starting engines and running queries after your initial trial credits are used up.
To register:
- On the Firebolt page, navigate to the Configuration menu. Click Billing.
- Click Connect to AWS Marketplace. This will take you to the Firebolt page available on AWS Marketplace.
- On the AWS Marketplace page, click View Purchase Options in the top right-hand corner of the screen.
- Click Setup Your Account.
Your account should now be associated with AWS Marketplace.
For guidance on configuring AWS roles for seamless access to your S3 data, see our detailed walkthrough on Using AWS Roles for S3 Access.
Configure an aggregating index
An aggregating index enables you to take a subset of table columns and predefine dimensions and measures to aggregate. Many aggregations are supported, from SUM, MAX, and MIN to more complex aggregations such as COUNT and COUNT(DISTINCT). At query runtime, instead of calculating the aggregation on the entire table and scanning all rows, Firebolt uses the pre-calculated values in the aggregating index. For more information, see Aggregating indexes.
From the levels fact table that you created in the previous step, assume you want to run a query to look at AVG(NumberOfLaps), grouped by LevelType. You can create an aggregating index to speed up these queries by running the statement below.
CREATE AGGREGATING INDEX
levels_agg_idx
ON levels (
LevelType
, AVG(NumberOfLaps)
);
After you run the script, the levels_agg_idx index appears in the object pane. Queries over the levels table that use the fields and aggregations defined in the index will now read from the index instead of scanning the entire table.
Use COPY TO to export data to S3
The example below shows a COPY TO statement with minimal parameters that specifies an AWS_ROLE_ARN. Because TYPE is omitted, the file or files are written in CSV format, and because COMPRESSION is omitted, they are compressed using GZIP (*.csv.gz).
COPY (SELECT * FROM test_table)
TO 's3://my_bucket/my_fb_queries'
CREDENTIALS = (AWS_ROLE_ARN='arn:aws:iam::123456789012:role/my-firebolt-role');
Firebolt assigns the query ID 16B903C4206098FD to the query at runtime. The compressed output is 40 MB, which exceeds the default maximum file size of 16 MB, so Firebolt writes 4 files as shown below.
s3://my_bucket/my_fb_queries/
16B903C4206098FD_0.csv.gz
16B903C4206098FD_1.csv.gz
16B903C4206098FD_2.csv.gz
16B903C4206098FD_3.csv.gz
See COPY TO for more information.
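To override those defaults, you can name the format and compression explicitly. A sketch assuming Parquet output with no compression (verify the option names against the COPY TO reference for your Firebolt version):

```sql
COPY (SELECT * FROM test_table)
TO 's3://my_bucket/my_fb_queries'
CREDENTIALS = (AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-firebolt-role')
TYPE = PARQUET
COMPRESSION = NONE;
```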