Link Search Menu Expand Document

Working with partitions

Partitions are smaller physical parts of large fact tables. They provide the first layer of sorting when you use a fact table to ingest data. Data is sorted in storage by partition first, and then pruned and sorted by the primary index definition next. When new data is ingested into a fact table, Firebolt saves rows automatically in the appropriate partition.

When to use partitions

Partitions are particularly useful to simplify table maintenance by allowing you to drop partitions and delete rows in bulk. For example, consider a transaction table with an average of approximately 150,000 transactions a day, which you partition by month. At the end of each month, you can run ALTER TABLE…DROP PARTITION to delete the last month’s data, and then INSERT to update the fact table with the most recent month’s data.

Dropping a partition deletes all the records stored in the partition.

Considerations for partitioning and query performance

Although some applications may see a performance boost from partitions, Firebolt does not rely on partitioning for performance. Firebolt’s indexing features are enough for many applications.

We recommend that you benchmark your application with and without partitions.

Partition only large tables

We recommend that you consider partitioning only for fact tables greater than 100 million rows.

Large, equally distributed partitions work best

Too many small partitions to read increases I/O and decreases performance. In addition, skewed (asymmetrically distributed) partitions can lead to poor performance. Choose columns for partition keys that create large partitions of relatively equal size.

Defining partition keys

You define partitions using the PARTITION BY clause in a CREATE FACT TABLE statement.

Rows with the same value in the column key and whose function expression resolves to the same value are included in the partition.

Partition key arguments must not evaluate to NULL and can be any of the following.

  • Column names, as shown below.
    PARTITION BY date_column;
    
    PARTITION BY product_type;
    
  • The result of an EXTRACT function applied to a column of any of the date and time data types, as shown below.
    PARTITION BY EXTRACT(MONTH FROM date_column);
    
  • A composite key, with a mix of columns and EXTRACT functions, as shown below.
    PARTITION BY EXTRACT(MONTH FROM date_column), product_type;
    

Dropping partitions

Use the ALTER TABLE…DROP PARTITION statement to delete a partition and the data stored in that partition.

When you drop a partition created with a composite partition key, you must specify the full partition key. Dropping based on a subset of a composite key is not supported. See the example Partition and drop by composite key below.

ALTER TABLE <table_name> DROP PARTITION 12,34;

Dropping a partition deletes the partition and the data stored in that partition.

Examples

The examples in this section are based on the following common CREATE TABLE example. Each example is based on the addition of the PARTITION BY statement shown.

CREATE FACT TABLE fct_tbl_transactions
(
    transaction_id      BIGINT,
    transaction_date    DATE,
    store_id            INT,
    product_id          INT,
    units_sold          INT
)
PRIMARY INDEX store_id, product_id
<examples of PARTITION BY clauses below>

Partition and drop by date

The example below creates a partition for each group of records with the same date value in transaction_date.

PARTITION BY transaction_date

The example below drops the partition for records with the date 2020-01-01. The date is provided as a string literal and must be cast to the DATE data type in the command. The command uses the :: operator for CAST.

ALTER TABLE fct_tbl_transactions DROP PARTITION `2020-01-01`::DATE;

Partition and drop by date extraction

The example below uses EXTRACT to create a partition for each group of records with the same year value in transaction_date.

PARTITION BY EXTRACT(YEAR FROM transaction_date), EXTRACT(MONTH FROM transaction_date);

The example below drops the partition for records where transaction_date contains 3, which corresponds to the month of March. The month is specified as an integer in the command.

ALTER TABLE fct_tbl_transactions DROP PARTITION 2022,04;

Partition and drop by integer

The example below creates a partition for each group of records with the same value for product_id.

PARTITION BY product_id

The example below drops the partition where product_id contains 8188.

ALTER TABLE fct_tbl_transactions DROP PARTITION 8188;

Partition and drop by composite key

The example below creates a partition for each group of records where store_id contains the same value and transaction_date contains the same year.

PARTITION BY store_id,EXTRACT(YEAR FROM transaction_date);

The example below drops the partition where store_id contains 982 and transaction_date contains 2020 .

ALTER TABLE transactions DROP PARTITION 982,2020;