This page explains how to install the Firebolt Operator in a Kubernetes cluster.

Prerequisites

  • Kubernetes 1.28 or later. The CRDs use CEL transition rules for field immutability.
  • Helm installed on the machine or automation that deploys the Firebolt Operator.
  • Access to the OCI Helm chart registry that hosts the Firebolt Operator charts.
  • Engine nodes that provide a locked-memory (memlock) limit of at least 8 GiB. See Engine node requirements.

Engine node requirements

When an engine starts, it locks memory for io_uring, so it needs a memlock limit of at least 8 GiB. Each engine pod takes that limit from containerd, the container runtime on its node. If the limit is too low, the engine crashes at startup with this error:
io_uring_register_buffers failed: Cannot allocate memory
The Firebolt Operator cannot raise this limit for you, so every engine node must provide it. Before you change anything, check the current limit on one of your engine nodes:
systemctl show containerd --property=LimitMEMLOCK
This prints either infinity or a value in bytes. A value of infinity, or of at least 8 GiB (8589934592 bytes), is enough, and that node needs no further change. Some node images already provide a high enough limit, while others default to around 8 MiB, which is far too low.
If the limit is too low, raise memlock on the container runtime (containerd) with a systemd drop-in:
# /etc/systemd/system/containerd.service.d/memlock.conf
[Service]
LimitMEMLOCK=infinity
infinity is simplest. A bounded value such as LimitMEMLOCK=8G also works, as long as it is at least 8 GiB. Reload systemd, restart containerd, and confirm the limit:
systemctl daemon-reload
systemctl restart containerd
systemctl show containerd --property=LimitMEMLOCK
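The pass/fail rule described above (infinity, or at least 8589934592 bytes) can be scripted for use in node bootstrap or CI checks. The following is a minimal sketch, not part of the Firebolt Operator. The function name memlock_sufficient and the variable MIN_MEMLOCK_BYTES are hypothetical, and the sketch assumes systemctl prints the value as LimitMEMLOCK=infinity or LimitMEMLOCK=<bytes>.

```shell
#!/bin/sh
# Minimum memlock the engine needs: 8 GiB expressed in bytes.
MIN_MEMLOCK_BYTES=8589934592

# memlock_sufficient VALUE  (hypothetical helper)
# VALUE is the output of `systemctl show containerd --property=LimitMEMLOCK`,
# for example "LimitMEMLOCK=infinity" or "LimitMEMLOCK=8388608".
# Returns 0 when the limit is high enough, 1 otherwise.
memlock_sufficient() {
  value=${1#LimitMEMLOCK=}          # strip the property name
  case "$value" in
    infinity) return 0 ;;           # unlimited is always enough
    ''|*[!0-9]*) return 1 ;;        # empty or not a plain byte count
  esac
  [ "$value" -ge "$MIN_MEMLOCK_BYTES" ]
}

# Usage on a node:
#   memlock_sufficient "$(systemctl show containerd --property=LimitMEMLOCK)" \
#     && echo "memlock OK" || echo "memlock too low"
```

The helper takes the systemctl output as an argument rather than calling systemctl itself, so the same rule can be applied to output collected from many nodes.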
If you need to raise the limit, bake the drop-in into your node configuration so that upgraded, rotated, and autoscaled nodes get it automatically. How you deliver it depends on your cloud provider.

Amazon EKS

If you are using the Amazon EKS AMI, the memlock limit is already infinity, so you don’t need to change anything on your engine nodes. If you are using a different OS, verify the limit on one of your engine nodes first. One way to set it on AWS is an EC2 user-data script:
#!/bin/bash
mkdir -p /etc/systemd/system/containerd.service.d
printf '[Service]\nLimitMEMLOCK=infinity\n' \
  > /etc/systemd/system/containerd.service.d/memlock.conf
systemctl daemon-reload
systemctl restart containerd

Google GKE

GKE node system configuration does not expose memlock, so apply the drop-in with a privileged DaemonSet that writes it to each node and restarts containerd. The DaemonSet reapplies it to new nodes as they join, which covers upgrades and autoscaling:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: engine-memlock
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: engine-memlock
  template:
    metadata:
      labels:
        app: engine-memlock
    spec:
      hostPID: true
      # Restrict this to your engine node pool with a matching nodeSelector.
      containers:
        - name: set-memlock
          image: debian:stable-slim
          securityContext:
            privileged: true
          command: ["nsenter", "--target", "1", "--mount", "--uts", "--ipc", "--net", "--pid", "--", "sh", "-c"]
          args:
            - |
              mkdir -p /etc/systemd/system/containerd.service.d
              printf '[Service]\nLimitMEMLOCK=infinity\n' \
                > /etc/systemd/system/containerd.service.d/memlock.conf
              if ! systemctl show containerd --property=LimitMEMLOCK | grep -q infinity; then
                systemctl daemon-reload && systemctl restart containerd
              fi
              sleep infinity
The guard around the restart matters: restarting containerd also restarts this pod, and the check lets the replacement pod skip a second restart once the limit is already in place.
GKE Autopilot blocks privileged DaemonSets, so you cannot use this approach there directly. Either request a privileged-workload allowlist (--autopilot-privileged-admission) or run your engine node pool on GKE Standard.

Microsoft Azure (AKS)

AKS does not expose LimitMEMLOCK through its supported node configuration, so apply the same privileged DaemonSet shown for Google GKE, or bake the drop-in into a custom node image. Either way, the limit is applied automatically to nodes added by node image upgrades and cluster autoscaling, so you do not have to touch nodes by hand.

Install the CRDs

The Firebolt Operator comes with three CustomResourceDefinitions:
  • FireboltInstance
  • FireboltEngine
  • FireboltEngineClass
We provide two ways to install the CRDs: a separate CRD chart, or the crds/ directory of the Firebolt Operator chart. To learn more about the pros and cons of each option, see the Helm chart best practices for CRDs.

Option 1: Separate CRD chart

Install the CRD chart before installing the Firebolt Operator chart. This is the recommended option because it gives you full control over the CRD lifecycle.
helm upgrade --install firebolt-crds oci://ghcr.io/firebolt-db/helm-charts/firebolt-operator-crds

Option 2: Firebolt Operator chart crds/ directory

The Firebolt Operator chart bundles CRDs in its crds/ directory. Helm gives this directory special handling: on helm install, CRDs in crds/ are installed before the chart templates are rendered. If a CRD already exists, Helm skips it with a warning. You can opt out of this bundled CRD installation with --skip-crds.
Helm never upgrades or deletes CRDs from a chart’s crds/ directory, which is the trade-off the Helm CRD best practices caution against for ongoing use. Prefer Option 1 for any deployment that will need to upgrade the operator over time.

Install the Firebolt Operator

Install the Firebolt Operator controller:
helm upgrade --skip-crds --install firebolt-operator oci://ghcr.io/firebolt-db/helm-charts/firebolt-operator
Omit --skip-crds if you chose the bundled crds/ directory option above.
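Because the only difference between the two CRD options at this step is the --skip-crds flag, deployment automation can derive the command from a single setting. The following is a minimal sketch under that assumption. The function name operator_install_cmd and the mode names separate and bundled are hypothetical, and the function prints the command for review instead of running it.

```shell
#!/bin/sh
CHART=oci://ghcr.io/firebolt-db/helm-charts/firebolt-operator

# operator_install_cmd MODE  (hypothetical helper)
# MODE is "separate" when the CRDs came from the firebolt-operator-crds chart,
# or "bundled" when they come from the operator chart's crds/ directory.
# Prints the matching helm command; it does not execute it.
operator_install_cmd() {
  case "$1" in
    separate) echo "helm upgrade --skip-crds --install firebolt-operator $CHART" ;;
    bundled)  echo "helm upgrade --install firebolt-operator $CHART" ;;
    *) echo "usage: operator_install_cmd separate|bundled" >&2; return 2 ;;
  esac
}

# Show the command for the recommended (separate CRD chart) option.
operator_install_cmd separate
```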

Multi-tenant install: scope the Firebolt Operator to specific namespaces

By default the Firebolt Operator watches every namespace and the chart renders a ClusterRole plus ClusterRoleBinding so it can read and write the resources it manages anywhere in the cluster. If you only want the Firebolt Operator to act in a fixed set of namespaces, list them under watchNamespaces:
helm upgrade --skip-crds --install firebolt-operator \
  oci://ghcr.io/firebolt-db/helm-charts/firebolt-operator \
  --set 'watchNamespaces={tenant-a,tenant-b}'
The chart then renders a Role plus RoleBinding in each listed namespace (carrying the same rule set the ClusterRole would have) and starts the manager with --namespaces=tenant-a,tenant-b so its cache only spans those namespaces. The Firebolt Operator’s blast radius is bounded to that list; CRs created in other namespaces are silently ignored.
To onboard a new tenant namespace, add it to watchNamespaces and re-run helm upgrade. The new Role and RoleBinding land in the new namespace and the manager restart picks up the extended flag.
If you also need the apiserver pod-proxy permission (because you set FireboltInstance.spec.metricScrapeMode=ApiserverProxy on any instance), enable the matching opt-in:
--set rbac.apiserverProxyGrant=true
This renders a dedicated ClusterRole (or per-namespace Role when watchNamespaces is set) that grants only pods/proxy: get. The default metricScrapeMode=PodIP does not need this permission.
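Onboarding a tenant means passing the full, extended namespace list to helm upgrade again. The following is a minimal sketch of building that --set value from a list of namespaces. The function name watch_namespaces_value is hypothetical, tenant-c is a made-up namespace, and the helm command is only printed so it can be reviewed before it is run.

```shell
#!/bin/sh
# watch_namespaces_value NS...  (hypothetical helper)
# Joins the given namespaces into Helm's list syntax: watchNamespaces={ns1,ns2}
watch_namespaces_value() {
  list=""
  for ns in "$@"; do
    list="${list:+$list,}$ns"       # add a comma only between entries
  done
  printf 'watchNamespaces={%s}\n' "$list"
}

# Existing tenants plus the new one (tenant-c is an example name).
set_arg=$(watch_namespaces_value tenant-a tenant-b tenant-c)

# Print the upgrade command for review; remove "echo" to run it.
echo helm upgrade --skip-crds --install firebolt-operator \
  oci://ghcr.io/firebolt-db/helm-charts/firebolt-operator \
  --set "$set_arg"
```

Keeping the tenant list in one place and regenerating the value avoids dropping an existing namespace when a new one is added.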

Firebolt Operator flags

The Firebolt Operator supports these runtime flags. The binary default is what the manager uses when you run it directly. The Helm chart default is what the firebolt-operator chart passes with its default values.yaml.
| Flag | Binary default | Helm chart default | Description |
|---|---|---|---|
| --version | false | Not set | Print the version and exit. |
| --namespaces | "" | Derived from watchNamespaces (omitted when empty) | Comma-separated list of namespaces to watch. Empty watches every namespace (cluster-wide install, requires the chart’s ClusterRole). A non-empty list confines the manager cache to those namespaces and pairs with per-namespace Role and RoleBinding from the chart. |
| --metrics-bind-address | 0 | :8443 | Address for the metrics endpoint. Use 0 to disable metrics. |
| --metrics-secure | true | true | Serve metrics over HTTPS with Kubernetes authentication and authorization. |
| --metrics-cert-path | "" | Not set | Directory that contains the metrics server certificate. |
| --metrics-cert-name | tls.crt | Not set | Metrics server certificate file name. |
| --metrics-cert-key | tls.key | Not set | Metrics server key file name. |
| --health-probe-bind-address | :8081 | :8081 | Address for health probes. |
| --leader-elect | false | true | Enable leader election for HA deployments. |
| --enable-webhooks | true | false | Enable the admission webhook server. |
| --webhook-cert-path | "" | Not set | Directory that contains the webhook certificate. The chart sets /tmp/k8s-webhook-server/serving-certs when webhook.enabled=true. |
| --webhook-cert-name | tls.crt | Not set | Webhook certificate file name. |
| --webhook-cert-key | tls.key | Not set | Webhook key file name. |
| --enable-http2 | false | Not set | Enable HTTP/2 for the metrics and webhook servers. |
| --engine-max-cpu | "" | Not set | Maximum allowed engine-container CPU request and limit (resolved from FireboltEngine.spec.template.spec.containers[engine].resources or the referenced FireboltEngineClass’s container resources). Empty disables the bound. |
| --engine-max-memory | "" | Not set | Maximum allowed engine-container memory request and limit (same resolution as --engine-max-cpu). Empty disables the bound. |
| --engine-max-ephemeral-storage | "" | Not set | Maximum allowed engine-container ephemeral-storage request and limit (same resolution as --engine-max-cpu). Empty disables the bound. |
| --zap-devel | false | Not set | Enable controller-runtime development logging defaults. |
| --zap-encoder | json | json | Log encoding. Valid values are json and console. |
| --zap-log-level | info | info | Minimum log level. Valid values include debug, info, error, and panic. |
| --zap-stacktrace-level | error | error | Level at and above which stack traces are captured. |
| --zap-time-encoding | rfc3339 | Not set | Timestamp encoding for zap logs. |

Next step

After installation, follow the quickstart to create your first FireboltEngine.