RECOMMEND DDL
CALL recommend_ddl
can help you optimize schema configurations to enhance query performance by early data pruning. The statement finds primary indexes (PIs) and partition recommendations for the specified table tailored to the given workload.
Syntax
Parameters
Parameter | Description |
---|---|
<table_name> | The name of the table for which primary indexes and partition keys should be recommended. |
<select_statement> | SELECT statement that returns the workload that the DDL recommendation is based on. The <select_statement> must return exactly one column of type TEXT . |
Example
The example below demonstrates retrieving schema recommendation using the CALL recommend_ddl
statement for a table named lineitem
tailored to the workload in the query history of the past week.
The <select_statement>
returns exactly one column of type TEXT
containing the SQL statements that the CALL recommend_ddl
command should analyze.
Returns:
recommended_partition_key | recommended_primary_index | average_pruning_improvement | analyzed_queries |
---|---|---|---|
DATE_TRUNC(‘month’, l_orderdate) | l_shipmode, l_returnflag, l_shipinstruct | 0.42 | 393 |
The CALL recommend_ddl
results indicate that the amount of bytes scanned can be decreased by up to 42% by configuring PRIMARY INDEX l_shipmode, l_returnflag, l_shipinstruct
and PARTITION BY DATE_TRUNC('month', l_orderdate)
.
The statement analyzed 393 queries that scanned the lineitem
table and applied filters to any of the lineitem
columns.
Quick Setup
The following steps will guide you to achieve great query performance within the first few minutes after joining Firebolt. First, create a table without any primary index and parition key configurations.
Next, load a workload that you want to run on this table from S3 into Firebolt utilizing COPY INTO.
Now you can use the CALL recommend_ddl
command to find primary index and parition key configurations.
Finally, recreate the table with the recommended primary index and parition key configurations and ingest the data into this table.
Under The Hood
Primary index and partition key configurations are chosen to maximize pruning potential and thus reduce query runtime time. Columns with selective filters and low cardinality are suggested as primary index and partition key columns. Recommendations can be run on empty tables as well as on tables with production data. The more queries executed on a populated table, the better the recommendations become. If additionally the workload of a table changes over time, it can be beneficial to run the CALL recommend_ddl
command periodically to check for better table configurations.