> Reference material for the NGRAM function

# NGRAM

This function takes an integer `n` and a text sequence, then splits the sequence into
overlapping contiguous subsequences of length `n`.

## Syntax

```sql theme={"theme":{"light":"css-variables","dark":"css-variables"}}
NGRAM( <n>, <text> )
```

## Parameters

| Parameter | Description                                      | Datatype  |
| :-------- | :----------------------------------------------- | :-------- |
| `<n>`     | An integer specifying the length of each n-gram. | `INTEGER` |
| `<text>`  | The text sequence to split into n-grams.         | `TEXT`    |

## Return Types

`ARRAY(TEXT)`

* If any of the inputs is nullable, the result type is `ARRAY(TEXT) NULL`.

## Behavior

The function splits the input text into overlapping contiguous subsequences of length `n`.

* If `n` is larger than the length of the input text, an array containing the input text as its single element is returned.
* If `n` is smaller than 1, an error is thrown.
* If any input is `NULL`, the result is `NULL` regardless of the other input value.
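
The rules above can be sketched as a small reference model. This is a hypothetical Python illustration, not Firebolt's implementation: the function name `ngram`, the use of `None` for `NULL`, and the `ValueError` are assumptions made for the sketch. The source does not describe the result for an empty string, so the sketch's output for that case should not be relied on.

```python
from typing import List, Optional


def ngram(n: Optional[int], text: Optional[str]) -> Optional[List[str]]:
    """Hypothetical model of the documented NGRAM behavior (None stands in for NULL)."""
    # A NULL in either input propagates to the result, regardless of the other input.
    if n is None or text is None:
        return None
    # Sizes smaller than 1 are rejected with an error.
    if n < 1:
        raise ValueError(f"Invalid n-gram size: {n}. Must be greater than 0.")
    # When n is at least the text length, the whole text is the single element.
    if n >= len(text):
        return [text]
    # Otherwise slide a window of n characters forward one position at a time.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(ngram(1, "hello"))   # ['h', 'e', 'l', 'l', 'o']
print(ngram(10, "hi"))     # ['hi']
print(ngram(None, "hi"))   # None
```

For a text of length `len` and `1 <= n <= len`, the sliding window yields `len - n + 1` elements, which is why `'hello world'` (11 characters) produces ten 2-grams.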

## Errors

An error is thrown if `n` is smaller than 1.

## Respect/Ignore Nulls

Propagates nulls: If any input is `NULL`, the result is `NULL`.

## Examples

The following example generates 2-grams (bigrams) from the string 'hello world':

<div className="query-window">
  ```
  SELECT NGRAM(2, 'hello world') AS result;
  ```

  | result <span>array(text)</span>                               |
  | :------------------------------------------------------------ |
  | \['he', 'el', 'll', 'lo', 'o ', ' w', 'wo', 'or', 'rl', 'ld'] |

  <p><span>Rows: 1</span><span>Execution time: 5.56ms</span></p>
</div>

The following example generates 3-grams (trigrams) from the string 'hello world':

<div className="query-window">
  ```
  SELECT NGRAM(3, 'hello world') AS result;
  ```

  | result <span>array(text)</span>                                  |
  | :--------------------------------------------------------------- |
  | \['hel', 'ell', 'llo', 'lo ', 'o w', ' wo', 'wor', 'orl', 'rld'] |

  <p><span>Rows: 1</span><span>Execution time: 5.68ms</span></p>
</div>

The following example generates 1-grams (unigrams) from the string 'hello':

<div className="query-window">
  ```
  SELECT NGRAM(1, 'hello') AS result;
  ```

  | result <span>array(text)</span> |
  | :------------------------------ |
  | \['h', 'e', 'l', 'l', 'o']      |

  <p><span>Rows: 1</span><span>Execution time: 5.62ms</span></p>
</div>

The following example generates 10-grams from the string 'hi'. Because the string is shorter than the n-gram size, the result contains the entire string as a single element:

<div className="query-window">
  ```
  SELECT NGRAM(10, 'hi') AS result;
  ```

  | result <span>array(text)</span> |
  | :------------------------------ |
  | \['hi']                         |

  <p><span>Rows: 1</span><span>Execution time: 5.08ms</span></p>
</div>

The following example uses an n-gram size of 0, which is invalid and throws an error:

```sql theme={"theme":{"light":"css-variables","dark":"css-variables"}}
SELECT NGRAM(0, 'hi') AS result;
```

ERROR: Line 1, Column 8: Invalid n-gram size: 0. Must be greater than 0. Choose an n-gram size larger than 0 or NULL.

The following example uses a negative n-gram size, which is invalid and throws an error:

```sql theme={"theme":{"light":"css-variables","dark":"css-variables"}}
SELECT NGRAM(-1, 'hi') AS result;
```

ERROR: Line 1, Column 8: Invalid n-gram size: -1. Must be greater than 0. Choose an n-gram size larger than 0 or NULL.

The following example generates 2-grams (bigrams) from the Japanese string 'こんにちは':

<div className="query-window">
  ```
  SELECT NGRAM(2, 'こんにちは') AS result;
  ```

  | result <span>array(text)</span> |
  | :------------------------------ |
  | \['こん', 'んに', 'にち', 'ちは']       |

  <p><span>Rows: 1</span><span>Execution time: 5.10ms</span></p>
</div>

The following example generates 2-grams (bigrams) from the string of emojis '😊👍🎉':

<div className="query-window">
  ```
  SELECT NGRAM(2, '😊👍🎉') AS result;
  ```

  | result <span>array(text)</span> |
  | :------------------------------ |
  | \['😊👍', '👍🎉']               |

  <p><span>Rows: 1</span><span>Execution time: 5.69ms</span></p>
</div>
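
The Japanese and emoji outputs above are consistent with n-grams being formed over characters rather than bytes. The following Python sketch contrasts the two approaches for illustration only. It assumes UTF-8 encoding for the byte view and makes no claim about how Firebolt stores or processes text internally.

```python
text = "こんにちは"
n = 2

# Windows over characters (Unicode code points), matching the documented output.
char_grams = [text[i:i + n] for i in range(len(text) - n + 1)]

# Windows over UTF-8 bytes, shown only for contrast: each of these characters
# occupies 3 bytes, so 2-byte windows would cut characters apart.
raw = text.encode("utf-8")
byte_grams = [raw[i:i + n] for i in range(len(raw) - n + 1)]

print(char_grams)       # ['こん', 'んに', 'にち', 'ちは']
print(len(byte_grams))  # 14
```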
