Learn how to use the Apache Airflow provider package to connect Airflow to Firebolt.
To install a specific version, replace `1.0.0` with the desired version in `pip install airflow-provider-firebolt==1.0.0`.
Add `airflow-provider-firebolt` to the `requirements.txt` file, following the instructions in the MWAA documentation.
Parameter | Description | Example value |
---|---|---|
Connection id | The name of the connection for the UI. | My_Firebolt_Connection |
Description | Information about the connection. | Connection to Firebolt database MyDatabase using engine MyFireboltDatabase_general_purpose. |
Database | The name of the Firebolt database to connect to. | MyFireboltDatabase |
Engine | The name of the engine that runs queries. | MyFireboltEngine |
Client ID | The ID of your service account. | XyZ83JSuhsua82hs |
Client Secret | The secret for your service account authentication. | yy7h&993))29&%j |
Account | The name of your account. | developer |
Extra | The additional properties that you may need to set (optional). | {"property1": "value1", "property2": "value2"} |
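
As an alternative to entering these values in the UI, Airflow can read a connection from an `AIRFLOW_CONN_<CONN_ID>` environment variable in JSON form. The sketch below builds such a value from the example values in the table. The field mapping is an assumption, not something stated in this guide: it assumes the provider stores Client ID as `login`, Client Secret as `password`, Database as `schema`, and Engine and Account under the extra keys `engine_name` and `account_name`. Verify the mapping against the provider version you install.

```python
import json
import os

# Example values copied from the parameter table above.
# ASSUMPTION: the mapping of form fields to Airflow connection attributes
# (login, password, schema, extra keys) is not confirmed by this guide.
firebolt_connection = {
    "conn_type": "firebolt",
    "login": "XyZ83JSuhsua82hs",       # Client ID
    "password": "yy7h&993))29&%j",     # Client Secret (use a secrets backend in practice)
    "schema": "MyFireboltDatabase",    # Database
    "extra": {
        "engine_name": "MyFireboltEngine",  # Engine (assumed extra key)
        "account_name": "developer",        # Account (assumed extra key)
    },
}

# Airflow resolves the connection id "my_firebolt_connection" from this variable.
env_name = "AIRFLOW_CONN_MY_FIREBOLT_CONNECTION"
os.environ[env_name] = json.dumps(firebolt_connection)
```

Defining the connection this way keeps it out of the metadata database, which can be convenient for local testing.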
`AUTO_STOP` configured.

The example DAG is named `firebolt_provider_trip_data`. It uses an Airflow connection to Firebolt named `my_firebolt_connection`. For the contents of the SQL scripts that the DAG runs, see the following SQL script examples. You can run this example with your own database and engine by updating the connector values in Airflow, setting `FIREBOLT_CONN_ID` to match your connector, and creating the necessary custom variables in Airflow.
Use a custom Airflow variable named `firebolt_sql_path` to define the directory within your Airflow home directory where SQL files are stored. The DAG reads these files to execute tasks in Firebolt.

For example, set `firebolt_sql_path` to `~/airflow/sql_store`. The DAG reads the SQL scripts from the directory defined by `firebolt_sql_path`. This allows the DAG to dynamically execute the SQL files as tasks in Firebolt.
The following example demonstrates how the variable is accessed in the DAG script and assigned to `tmpl_search_path`.
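
A minimal sketch of reading one SQL script from that directory is shown here. The helper name `get_query` and its signature are hypothetical. The directory is passed in as a plain argument so the snippet runs without Airflow; in a DAG file it would typically come from `Variable.get("firebolt_sql_path")`, as noted in the comments.

```python
from pathlib import Path


def get_query(sql_dir: str, file_name: str) -> str:
    """Return the text of one SQL script stored under sql_dir."""
    # expanduser() resolves a leading "~", as in "~/airflow/sql_store".
    return (Path(sql_dir).expanduser() / file_name).read_text(encoding="utf-8")


# In a DAG file, the directory would typically be read from the Airflow variable:
#   from airflow.models import Variable
#   tmpl_search_path = Variable.get("firebolt_sql_path")
#   sql = get_query(tmpl_search_path, "<your_script>.sql")
```

Keeping the file lookup in a small function makes it possible to test the path handling separately from the DAG.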
One SQL script creates the `ex_trip_data` external table to connect to a public Amazon S3 data store.

Another SQL script creates the `my_taxi_trip_data` fact table to receive ingested data.

An `INSERT INTO` operation ingests data into the `my_taxi_trip_data` fact table using the `ex_trip_data` external table. This example uses the external table metadata column, `$source_file_timestamp`, to retrieve records exclusively from the latest file.
`FireboltOperator` is designed to execute SQL queries but does not return query results. To retrieve query results, use the `FireboltHook` class. The following example demonstrates how to use `FireboltHook` to execute a query and log the row count in the `my_taxi_trip_data` table.
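
A minimal sketch of the row-count task follows. The function name `log_row_count` is hypothetical. It takes the hook as an argument so it can run without a Firebolt connection, and it relies only on `get_first`, a method that Airflow database hooks inherit from `DbApiHook`. The import path and the `firebolt_conn_id` argument in the comments are assumptions to check against your installed provider version.

```python
import logging

logger = logging.getLogger(__name__)


def log_row_count(hook, table: str = "my_taxi_trip_data") -> int:
    """Run COUNT(*) through a DB-API style hook and log the result."""
    # get_first returns the first row of the result set as a sequence.
    row = hook.get_first(f"SELECT COUNT(*) FROM {table}")
    count = int(row[0])
    logger.info("Table %s contains %d rows", table, count)
    return count


# Assumed usage inside a DAG file (not executed here):
#   from firebolt_provider.hooks.firebolt import FireboltHook
#   hook = FireboltHook(firebolt_conn_id="my_firebolt_connection")
#   PythonOperator(task_id="row_count", python_callable=lambda: log_row_count(hook))
```

Passing the hook in rather than creating it inside the function keeps the query logic testable with a stand-in object.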
- `query_timeout`: Sets the maximum duration (in seconds) that a query can run.
- `fail_on_query_timeout`: If `True`, a timeout raises a `QueryTimeoutError`. If `False`, the task terminates quietly and proceeds without raising an error.

In the example DAG, the `FireboltOperator` task stops execution after one second and proceeds without error. The `PythonOperator` task fetches data from Firebolt with a timeout of 0.5 seconds and raises an error if the query times out.
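
The two timeout configurations described above can be sketched as keyword arguments. They are kept as plain dictionaries so the snippet runs without Airflow installed. The commented usage assumes that both `FireboltOperator` and `FireboltHook` accept these arguments in their constructors, which should be verified against your provider version.

```python
# Operator task: stop the query after one second and continue without error.
operator_timeout_kwargs = {"query_timeout": 1, "fail_on_query_timeout": False}

# Hook used inside a PythonOperator: stop after 0.5 seconds and raise on timeout.
hook_timeout_kwargs = {"query_timeout": 0.5, "fail_on_query_timeout": True}

# Assumed usage inside a DAG file (not executed here; task ids and SQL are placeholders):
#   FireboltOperator(task_id="...", sql="...",
#                    firebolt_conn_id="my_firebolt_connection",
#                    **operator_timeout_kwargs)
#   FireboltHook(firebolt_conn_id="my_firebolt_connection", **hook_timeout_kwargs)
```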