> Reference material for the NGRAM function

# NGRAM

This function takes an integer `n` and a text sequence, then splits the sequence into
overlapping contiguous subsequences of length `n`.

## Syntax

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
NGRAM( <n>, <text> )
```

## Parameters

| Parameter | Description                                      | Datatype  |
| :-------- | :----------------------------------------------- | :-------- |
| `<n>`     | An integer specifying the length of each n-gram. | `INTEGER` |
| `<text>`  | The text sequence to split into n-grams.         | `TEXT`    |

## Return Types

`ARRAY(TEXT)`

* If any of the inputs is nullable, the result type is `ARRAY(TEXT) NULL`.

## Behavior

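The function splits the input text into overlapping contiguous subsequences of length `n`, advancing one character at a time. Length is measured in characters rather than bytes, so each multi-byte character, such as a Japanese character or an emoji, counts as one unit.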

* If `n` is larger than the length of the input text, an array containing the input text as its single element is returned.
* If `n` is smaller than 1, an error is thrown.
* If any input is `NULL`, the result is `NULL` regardless of the other input value.
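
Because the window advances one character at a time, a text of length `L` with `1 <= n <= L` yields `L - n + 1` n-grams. The query below is a minimal sketch for checking that relationship. It assumes the `LENGTH` and `ARRAY_LENGTH` functions are available in your engine, and the column aliases are illustrative only.

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
-- Compare the text length with the number of bigrams produced.
-- Assumption: LENGTH counts characters and ARRAY_LENGTH counts array elements.
SELECT
    LENGTH('hello world') AS text_length,
    ARRAY_LENGTH(NGRAM(2, 'hello world')) AS ngram_count;
```

Going by the bigram example on this page, which lists ten elements for the eleven-character string 'hello world', `ngram_count` should be one less than `text_length`.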

## Errors

An error is thrown if `n` is smaller than 1.

## Respect/Ignore Nulls

Propagates nulls: If any input is `NULL`, the result is `NULL`.
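
The following queries sketch the null propagation described above. The explicit `CAST` calls are an assumption, added so that the `NULL` literals carry the parameter types from the table above. Per the rule stated here, both queries should return `NULL`.

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
-- NULL text input: the result is NULL even though n is valid.
SELECT NGRAM(2, CAST(NULL AS TEXT)) AS result;

-- NULL n: the result is NULL even though the text is non-NULL.
SELECT NGRAM(CAST(NULL AS INTEGER), 'hello') AS result;
```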

## Examples

The following example generates 2-grams (bigrams) from the string 'hello world':

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
SELECT NGRAM(2, 'hello world') AS result;
```

| result (ARRAY(TEXT))                  |
| :------------------------------------ |
| `{he,el,ll,lo,"o "," w",wo,or,rl,ld}` |

The following example generates 3-grams (trigrams) from the string 'hello world':

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
SELECT NGRAM(3, 'hello world') AS result;
```

| result (ARRAY(TEXT))                          |
| :-------------------------------------------- |
| `{hel,ell,llo,"lo ","o w"," wo",wor,orl,rld}` |

The following example generates 1-grams (unigrams) from the string 'hello':

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
SELECT NGRAM(1, 'hello') AS result;
```

| result (ARRAY(TEXT)) |
| :------------------- |
| `{h,e,l,l,o}`        |

The following example requests 10-grams from the string 'hi'. Because the string is shorter than the n-gram size, the result contains the entire string as a single element:

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
SELECT NGRAM(10, 'hi') AS result;
```

| result (ARRAY(TEXT)) |
| :------------------- |
| `{hi}`               |

The following example uses an n-gram size of 0, which is invalid and throws an error:

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
SELECT NGRAM(0, 'hi') AS result;
```

ERROR: Line 1, Column 8: Invalid n-gram size: 0. Must be greater than 0. Choose an n-gram size larger than 0 or NULL.

The following example uses a negative n-gram size, which is invalid and throws an error:

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
SELECT NGRAM(-1, 'hi') AS result;
```

ERROR: Line 1, Column 8: Invalid n-gram size: -1. Must be greater than 0. Choose an n-gram size larger than 0 or NULL.

The following example generates 2-grams (bigrams) from the Japanese string 'こんにちは':

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
SELECT NGRAM(2, 'こんにちは') AS result;
```

| result (ARRAY(TEXT)) |
| :------------------- |
| `{こん,んに,にち,ちは}`      |

The following example generates 2-grams (bigrams) from the string of emojis '😊👍🎉':

```sql theme={"theme":{"light":"github-light","dark":"github-dark"}}
SELECT NGRAM(2, '😊👍🎉') AS result;
```

| result (ARRAY(TEXT)) |
| :------------------- |
| `{😊👍,👍🎉}`        |
