Describes the Firebolt implementation of the STRUCT
data type
STRUCT
data type is currently in public preview. We are gathering feedback and further refining this feature.STRUCT
data type.
STRUCT(...)
syntax:
NOT NULL
.
For example, defining the kv
column to be NOT NULL
is OK. Defining the s_key
or value
attribute to be NOT NULL
is not supported.
kv
cannot store NULL
values for the whole struct. But a struct value where both s_key
and value
attributes are NULL
is valid.
Although specifying the NOT NULL
constraint to a struct attribute yields an error, declaring a struct attribute as NULL
explicitly is accepted. It is redundant but may be used to improve readability.
COPY FROM
statement. For example:
SUPPORT_STRUCTS=TRUE
: Nested structures in the source data will be inferred as STRUCT
data types.SUPPORT_STRUCTS=FALSE
: Nested structures in the source data will be inferred as individual shredded columns.SUPPORT_STRUCTS
parameter only affects type inference during schema discovery. With predefined table schemas, the system maps incoming data to your schema regardless of this setting. For more details, see the COPY FROM
documentation.COPY FROM
statement supports the STRUCT
data type for Parquet files. Support of other file formats will be added in the near future.
Other utilities like READ_PARQUET
and external tables do not support the STRUCT
data type at the moment. We will extend the coverage in the following releases.
value
, without adding the struct name or any extra formatting to the column name.
In addition, the PostgreSQL syntax for accessing a field of a composite type is also supported:
t.a
can be resolved to either a column a
of table t
or an attribute a
of a struct t
, the Firebolt syntax t.a
prefers the table column over the struct attribute. On the other hand, the PostgreSQL syntax (t).a
prefers the struct attribute over the table column.
When there is no such ambiguity, both syntaxes resolve to the same object and they can be used interchangeably.
STRUCT
value cannot be converted to or from any other non-struct data type.
A value of a struct type S1
can be converted into another struct type S2
only if both of the following hold:
S1
and S2
have exactly the same structure. Attributes can have different names in S1
and S2
, but they must form a bijection.p
in S1
and its image attribute q
in S2
, there exists a type conversion from p
’s data type to q
’s data type.GROUP BY
and ORDER BY
clauses of a SELECT
statement:
ORDER BY kv
means ordering by "s_key"
attribute first, and then "value"
.
EXPLAIN
output with structsProjection
and Sort
operate on decomposed fields. These details are especially relevant when interpreting the EXPLAIN
output for queries on STRUCT
columns.
Let us take a look at the EXPLAIN
output of the last example query above:
StoredTable
node [2] shows “5/5 column(s)”. Recall that struct values are decomposed into individual attribute columns. So table t
is actaully decomposed into the following schema:
kv.s_key
and kv.value
, there is a third column kv.$not_null$
that denotes whether the whole struct value is NULL
. This column exists only if the struct is defined as nullable.
The Sort
node [1] directly orders tuples based on the decomposed struct attributes, including the $not_null$
marker column.
The top Projection
node [0]
constructs struct values for the kv
column, as required by the query. However, instead of packing these values into actual struct objects, Firebolt serializes the result directly using the list of attribute columns.
ARRAY
and STRUCT
composite types. They compose freely, so you can model data as arrays of structs, structs of arrays, or more complicated shapes.
STRUCT
type itself. This means nested STRUCT
types are allowed. For example:
STRUCT
type composes with the ARRAY
type as well. It is possible to define an array of struct:
STRUCT
or an array of STRUCT
. As a result, for an array of struct column, only the top-level array can be NOT NULL
. All nested element types are nullable by default.
At the moment, array functions and lambda functions generally do not support array of struct. The support will be extended in the near future.
Firebolt supports using array subscription ([]
syntax) to access an array element on an array of STRUCT
values.
arr[1].a
does not work at the moment. Use the PostgreSQL syntax instead: (arr[1]).a
. The issue will be fixed in the near future.UNNEST
.
ARRAY_AGG
can be useful when working on array of struct, together with UNNEST
.
s_key > 10
from an array of struct column and preserves the array structure.
Using UNNEST
and ARRAY_AGG
together provides a fundamental way of working with arrays of struct even if there is no array or lambda function available for the operation in need.
ARRAY_TRANSFORM
function.
ARRAY_TRANSFORM
can be used to extract an attribute from an array of structs while keeping the array structure. Once having an array of primitive types, all other array functions can be used to process the array further. In this example, ARRAY_FILTER
is used to select keys that are bigger than 10.