Asynchronous queries

An asynchronous query runs in the background and returns a successful response once it is accepted by the computing cluster, so that a client can proceed with other tasks without waiting for the statement to finish. The status of an asynchronous query can be checked at specified intervals, which provides flexibility, so that you can check the query’s status at meaningful times based on the expected duration of the operation. For example, a user can avoid unnecessary resource consumption by only checking the status periodically, rather than maintaining an open connection for the entire duration of the query, which might be unreliable or unnecessary for certain tasks.

Asynchronous queries are ideal for long-running SQL statements, such as INSERT, STOP ENGINE, and ALTER ENGINE, where keeping an HTTP connection open is both unreliable and unnecessary, and where the statement might return zero rows. In addition, tracking them can be challenging. Using an asynchronous query allows you to check the status of operations at intervals, based on the expected duration.

You should use asynchronous queries for any supported operation that may take more than a few minutes for which there are no results.

Supported asynchronous queries

  • INSERT – Inserts one or more values into a specified table.
  • Engine commands including ALTER ENGINE, STOP ENGINE, and START ENGINE. By default, Firebolt engines finish running queries before returning results, which can take significant time. Starting an engine can also take more than a few minutes.

How to submit an asynchronous query

You can only submit a synchronous query programmatically using the Firebolt API or the following listed drivers. Every SQL statement submitted using the Firebolt Develop Space user interface is a synchronous query.

The following are required prerequisites to submit a query programmatically:

  1. A Firebolt account – Ensure that you have access to an active Firebolt account. If you don’t have access, you can sign up for an account. For more information about how to register with Firebolt, see Get started with Firebolt.
  2. A Firebolt service account – You must have access to an active Firebolt service account, which facilitates programmatic access to Firebolt.
  3. A user associated with the Firebolt service account – You must associate a user with your service account, and the user must have the necessary permissions to run the query on the specified database using the specified engine.
  4. Sufficient permissions If you want to query user data through a specific engine, you must have sufficient permissions on the engine, as well as on any tables and databases you access.

To submit an asynchronous query via a raw HTTP request, you must use Firebolt protocol version 2.3 or later, while query status can be checked with any client. You can verify the protocol version by checking the X-Firebolt-Protocol-Version header in API response.

Use a Firebolt Driver

Use a Firebolt driver to connect to a Firebolt database, authenticate securely, and run SQL statements with minimal setup. The driver provides built-in methods for running SQL statements, handling responses, and managing connections. Only some Firebolt drivers support synchronous queries. See the documentation for each driver for specific details on how to submit asynchronous queries programmatically:

Submit a query

Submitting a query through a Firebolt drivers and SDKs have similar formats. The following code example shows how to submit an asynchronous query using the Python SDK. For other languages, consult the specific driver for details:

The following code example establishes a connection to a Firebolt database using a service account, submits an asynchronous INSERT statement that groups generated numbers, periodically checks its run status, and then retrieves the row count from the example table:

from time import sleep

from firebolt.db import connect
from firebolt.client.auth import ClientCredentials

id = "service_account_id"
secret = "service_account_secret"
engine_name = "your_engine_name"
database_name = "your_test_db"
account_name = "your_account_name"

query = """
    INSERT INTO example SELECT idMod7 as id
    FROM (
        SELECT id%7 as idMod7
        FROM GENERATE_SERIES(1, 10000000000) s(id)
    )
    GROUP BY idMod7;
    """

with connect(
    engine_name=engine_name,
    database=database_name,
    account_name=account_name,
    auth=ClientCredentials(id, secret),
) as connection:
    cursor = connection.cursor()

    cursor.execute_async(query) # Needs firebolt-sdk 1.9.0 or later
    # Token lets us check the status of the query later
    token = cursor.async_query_token
    print(f"Query Token: {token}")

    # Block until the query is done
    # You can also do other work here
    while connection.is_async_query_running(token):
        print("Checking query status...")
        sleep(5)

    status = "Success" if connection.is_async_query_successful(token) else "Failed"
    print(f"Query Status: {status}")

    cursor.execute("SELECT count(*) FROM example;")  # Should contain 7 rows
    for row in cursor.fetchall():
        print(row)

Check query statusy

The query status token is included in the initial response when the query is submitted. If needed, you can also retrieve the token from the engine_running_queries view.

To check the status of an asynchronous query, use the token with the CALL fb_GetAsyncStatus function as follows:

CALL fb_GetAsyncStatus('<async_token>');

The previous code example returns a single row with the following schema:

Column Name Data Type Description
account_name TEXT The name of the account where the asynchronous query was submitted.
user_name TEXT The name of the user who submitted the asynchronous query.
request_id TEXT Unique ID of the request which submitted the asynchronous query.
query_id TEXT Unique ID of the asynchronous query.
status TEXT Current status of the query: SUSPENDED, RUNNING, CANCELLED, FAILED, SUCCEEDED or IN_DOUBT.
submitted_time TIMESTAMPTZ The time the asynchronous query was submitted.
start_time TIMESTAMPTZ The time the async query was most recently started.
end_time TIMESTAMPTZ If the asynchronous query is completed, the time it finished.
error_message TEXT If the asynchronous query failed, the error message from the failure.
retries LONG The number of times the asynchronous query has retried.
scanned_bytes LONG The number of bytes scanned by the asynchronous query.
scanned_rows LONG The number of rows scanned by the asynchronous query.

Cancel a query

A running asynchronous query can be cancelled using the CANCEL statement as follows:

CANCEL QUERY '<query_id>';

In the previous code example, retrieve the query ID from the engine_running_queries view or from the original query submission response.

Error handling

Error Type Cause Solution
Protocol version mismatch Using an outdated Firebolt protocol version. Make sure your driver supports async queries.
Query failure The query encounters an execution error. Check the error message in fb_GetAsyncStatus and validate the query syntax.
Token not found The provided async query token is invalid or expired. Verify that the correct token is being used and that the query has not expired.
Engine does not exist or you don’t have permission to access it The specified Firebolt engine is not running or you don’t have permission to access it. Start the engine before submitting the query and double check permissions.