Install with Helm¶
This guide provides step-by-step instructions to install F5 AI Gateway on a Kubernetes cluster using Helm.
Warning
For security, never expose AI Gateway directly on the internet. Instead, use the NGINX Ingress Controller or another similar reverse proxy in front of the AI Gateway. See the Expose with NGINX Ingress Controller guide.
About Helm¶
A Helm chart is a pre-configured package of Kubernetes resources that you deploy with a single command. Charts let you define, install, and upgrade Kubernetes applications.
A chart is composed of a set of files that describe a related group of Kubernetes resources, such as deployments, services, and ingress. Charts can also declare and manage dependencies on other charts, which makes it possible to deploy complex, multi-tier applications.
Before you begin¶
To install AI Gateway using a Helm chart, you need:
Requirements | Notes |
---|---|
Kubernetes 1.25.0 or later (linux/amd64 or linux/arm64) | Ensure your client can access the Kubernetes API server. |
kubectl | |
Helm 3.10.0 or later | |
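As a quick pre-flight check, the requirements above can be verified from a shell. This is a minimal sketch rather than part of the official procedure: the version_ge helper is a hypothetical name, it assumes a sort that supports -V (GNU coreutils and recent macOS do), and each check is skipped when the tool is not on the PATH.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; relies on `sort -V` for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Check the Helm client version against the 3.10.0 minimum.
if command -v helm >/dev/null 2>&1; then
  helm_version="$(helm version --template '{{.Version}}')"  # for example v3.14.0
  if version_ge "${helm_version#v}" "3.10.0"; then
    echo "Helm ${helm_version} meets the 3.10.0 minimum"
  else
    echo "Helm ${helm_version} is older than 3.10.0" >&2
  fi
else
  echo "helm not found on PATH" >&2
fi

# Check that kubectl can reach the Kubernetes API server.
if command -v kubectl >/dev/null 2>&1; then
  kubectl cluster-info --request-timeout=5s \
    || echo "kubectl could not reach the Kubernetes API server" >&2
else
  echo "kubectl not found on PATH" >&2
fi
```

The Kubernetes server version itself is reported by kubectl version; compare it against 1.25.0 in the same way if you want to automate that check.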
Install the chart¶
Note: Throughout the installation instructions, we will use the ai-gateway namespace for creating AI Gateway-related resources and installing the Helm chart.
Create ai-gateway namespace¶
kubectl create ns ai-gateway
Create a private registry and license secrets¶
You must create a secret to pull the AI Gateway images from the F5 private registry. Additionally, as AI Gateway requires a license to run, you must create a secret with your license.
1. Get your license JWT.
2. Create a Kubernetes docker-registry secret on the cluster in the ai-gateway namespace, using the contents of the JWT token as the username and none as the password (the password is not used). The name of the Docker server is private-registry.f5.com.
   kubectl -n ai-gateway create secret docker-registry f5-registry-secret --docker-server=private-registry.f5.com --docker-username=<JWT Token> --docker-password=none
   The --docker-username=<JWT Token> value must be the contents of the token itself, not a reference to the token (such as a file path). When you copy the contents of the JWT token, ensure there are no additional characters such as extra whitespace. Extra characters can invalidate the token and cause 401 errors when authenticating to the registry.
3. Inspect and verify the details of the created secret by running:
   kubectl get secret f5-registry-secret -n ai-gateway --output=yaml
4. As in step 2, create a secret with the license using the same JWT token:
   kubectl -n ai-gateway create secret generic f5-license --from-literal=token=<JWT Token>
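Because stray whitespace in the token leads to 401 errors, it can help to read the JWT from a file and strip whitespace before creating the secrets. The following is a sketch under assumptions: the license.jwt file name, the JWT_FILE variable, and the clean_jwt helper are hypothetical, and the kubectl commands are the same ones shown in the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical location of the downloaded license JWT.
JWT_FILE="${JWT_FILE:-license.jwt}"

# clean_jwt FILE: print the file contents with all whitespace removed.
# A JWT contains only base64url characters and dots, so this is lossless.
clean_jwt() {
  tr -d '[:space:]' < "$1"
}

# Create both secrets only when the token file and kubectl are available.
if [ -f "$JWT_FILE" ] && command -v kubectl >/dev/null 2>&1; then
  JWT="$(clean_jwt "$JWT_FILE")"

  # Registry pull secret: the JWT is the username, the password is unused.
  kubectl -n ai-gateway create secret docker-registry f5-registry-secret \
    --docker-server=private-registry.f5.com \
    --docker-username="$JWT" \
    --docker-password=none

  # License secret, stored under the key "token".
  kubectl -n ai-gateway create secret generic f5-license \
    --from-literal=token="$JWT"
fi
```

Quoting "$JWT" keeps the shell from splitting or altering the token when it is passed to kubectl.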
Install the chart from the OCI registry¶
Log in to the Helm chart registry with the JWT token used in the previous step.
helm registry login private-registry.f5.com/aigw -u <JWT Token> -p none
To install the latest stable release of AI Gateway from the OCI registry in the ai-gateway namespace, run the following command, specifying the registry and license secrets:
helm install aigw oci://private-registry.f5.com/aigw/charts/aigw -n ai-gateway --set "imagePullSecrets[0].name=f5-registry-secret"
The aigw argument specifies the release name, and you can change it to any name you prefer. This name is added as a prefix to the deployment names.
To wait for the deployment to be ready, you can either add the --wait flag to the helm install command, or run the following command:
kubectl wait --timeout=5m -n ai-gateway deployment/aigw --for=condition=Available
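The same installation can also be driven from a values file instead of --set flags, with --wait built into the command. This is a sketch under assumptions: the aigw-values.yaml file name and the install_aigw helper are hypothetical, and the license.secretName and license.secretKey values simply mirror the f5-license secret and token key created earlier. Check the chart's own defaults before relying on them.

```shell
#!/usr/bin/env bash
# Write a values file (hypothetical name) with the pull secret and license settings.
cat > aigw-values.yaml <<'EOF'
imagePullSecrets:
  - name: f5-registry-secret
license:
  secretName: f5-license   # assumed to match the secret created earlier
  secretKey: token         # assumed to match the key used in --from-literal
EOF

# Illustrative helper: install the release and block until it is ready,
# giving up after five minutes.
install_aigw() {
  helm install aigw oci://private-registry.f5.com/aigw/charts/aigw \
    -n ai-gateway \
    -f aigw-values.yaml \
    --wait --timeout 5m
}

# Call the helper once the namespace and secrets exist:
# install_aigw
```

Keeping the settings in a file makes them easier to review and to reuse for later upgrades.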
Configuration¶
The following table lists the configurable parameters of the AI Gateway chart and their default values.
Key | Type | Default | Description |
---|---|---|---|
aigw.affinity | object | | Affinity rules for the aigw pods |
aigw.annotations | object | | Annotations for the aigw pods |
aigw.containerSecurityContext | object | | Security context for the aigw pods |
aigw.enabled | bool | | Enable the core (aigw) application |
aigw.env | list | | Configure additional environment variables for the aigw deployment |
aigw.exporter.azblobAccountURL | string | | URL of the Azure Blob Storage account where transactions will be published. For example: https://myaccount.blob.core.windows.net |
aigw.exporter.azblobContainer | string | | Name of the Azure Blob Storage container to publish transactions into |
aigw.exporter.azblobMaxRetries | int | | Maximum number of retries when uploading a transaction before it is dropped |
aigw.exporter.azblobUploadPath | string | | The path inside the Azure Blob Storage container to upload transactions to. For example: data/transactions |
aigw.exporter.azblobUploadTimeout | string | | Timeout for uploading a single transaction to Azure Blob Storage |
aigw.exporter.enabled | bool | | Enable audit exporter |
aigw.exporter.maxPendingTransactions | int | | Sets the maximum number of pending transactions. If the limit is reached, new transactions are not exported (the data is lost) until the pending transaction count drops below the limit |
aigw.exporter.s3Bucket | string | | Name of S3 bucket to export to |
aigw.exporter.s3UploadTimeout | string | | Timeout for uploading a single transaction to S3 |
aigw.exporter.s3UsePathStyle | bool | | Enables path-style addressing. AWS S3 doesn't need it, but some S3-compatible stores might require it. Read more: https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#path-style-access |
aigw.exporter.type | string | | Type of exporter (stdout, s3, azblob) |
aigw.exporter.workers | int | | Number of workers for the exporter |
aigw.healthServer.port | int | | Configure the port of the health server |
aigw.image.pullPolicy | string | | |
aigw.image.repository | string | | Repository for the aigw image |
aigw.image.tag | string | | Version tag for the aigw image |
aigw.imagePullSecrets | list | | Array of imagePullSecrets for pulling aigw image from private registries |
aigw.nodeSelector | object | | Node selector for scheduling the aigw pods |
aigw.replicas | int | | Number of replicas for the aigw deployment |
aigw.resources | object | | Resource requests and limits for the aigw container |
aigw.securityContext | object | | Security context for the aigw deployment |
aigw.service.annotations | object | | Annotations for the service |
aigw.service.clusterIP | string | | When |
aigw.service.enabled | bool | | Enable the service |
aigw.service.externalTrafficPolicy | string | | When |
aigw.service.loadBalancerIP | string | | When using |
aigw.service.port | int | | Port for the service |
aigw.service.type | string | | Type of the service |
aigw.tls.enabled | bool | | Enable serving HTTPS for the aigw deployment |
aigw.tls.secretName | string | | Name of the secret that contains the TLS data |
aigw.tolerations | list | | Tolerations for the aigw pods |
aigw.volumeMounts | list | | Additional volume mounts for the aigw deployment |
aigw.volumes | list | | Additional volumes for the aigw deployment |
config.contents | string | | The contents of an aigw.yaml configuration file |
config.create | bool | | Enable creation of the AI Gateway |
config.name | string | | Name of ConfigMap to use |
imagePullSecrets | list | | Array of imagePullSecrets configured on the ServiceAccount for pulling images from private registries |
license.secretKey | string | | Key of the secret which contains the license data |
license.secretName | string | | Name of the secret that contains the license data |
metrics.endpoint | string | | OpenTelemetry gRPC endpoint to export metrics to |
processorLabs.dataSecurity.affinity | object | | Affinity rules for the processor-labs-data-security pods |
processorLabs.dataSecurity.annotations | object | | Annotations for the processor-labs-data-security pods |
processorLabs.dataSecurity.containerSecurityContext | object | | Security context for the processor-labs-data-security pods |
processorLabs.dataSecurity.enabled | bool | | Enable the Data Security lab processor application |
processorLabs.dataSecurity.env | list | | Configure additional environment variables for the processor-labs-data-security deployment |
processorLabs.dataSecurity.image.pullPolicy | string | | |
processorLabs.dataSecurity.image.repository | string | | Repository for the processor-labs-data-security image |
processorLabs.dataSecurity.image.tag | string | | Version tag for the processor-labs-data-security image |
processorLabs.dataSecurity.imagePullSecrets | list | | Array of imagePullSecrets for pulling processor-labs-data-security image from private registries |
processorLabs.dataSecurity.nodeSelector | object | | Node selector for scheduling the processor-labs-data-security pods |
processorLabs.dataSecurity.replicas | int | | Number of replicas for the processor-labs-data-security deployment |
processorLabs.dataSecurity.resources | object | | Resource requests and limits for the processor-labs-data-security pods |
processorLabs.dataSecurity.securityContext | object | | Security context for the processor-labs-data-security deployment |
processorLabs.dataSecurity.service.annotations | object | | Annotations for the service |
processorLabs.dataSecurity.service.clusterIP | string | | When |
processorLabs.dataSecurity.service.enabled | bool | | Enable the service |
processorLabs.dataSecurity.service.externalTrafficPolicy | string | | When |
processorLabs.dataSecurity.service.loadBalancerIP | string | | When using |
processorLabs.dataSecurity.service.port | int | | Port for the service |
processorLabs.dataSecurity.service.type | string | | Type of the service |
processorLabs.dataSecurity.tls.enabled | bool | | Enable serving HTTPS for the processor-labs-data-security deployment |
processorLabs.dataSecurity.tls.secretName | string | | Name of the secret that contains the TLS data |
processorLabs.dataSecurity.tolerations | list | | Tolerations for the processor-labs-data-security pods |
processorLabs.dataSecurity.volumeMounts | list | | Additional volume mounts for the processor-labs-data-security deployment |
processorLabs.dataSecurity.volumes | list | | Additional volumes for the processor-labs-data-security deployment |
processorLabs.promptGuard.affinity | object | | Affinity rules for the processor-labs-prompt-guard pods |
processorLabs.promptGuard.annotations | object | | Annotations for the processor-labs-prompt-guard pods |
processorLabs.promptGuard.containerSecurityContext | object | | Security context for the processor-labs-prompt-guard pods |
processorLabs.promptGuard.enabled | bool | | Enable the Prompt Guard lab processor application |
processorLabs.promptGuard.env | list | | Configure additional environment variables for the processor-labs-prompt-guard deployment |
processorLabs.promptGuard.gpu.enabled | bool | | Enable GPU usage for supported processors in the processor-labs-prompt-guard deployment. Should be used along with setting a request for |
processorLabs.promptGuard.image.pullPolicy | string | | |
processorLabs.promptGuard.image.repository | string | | Repository for the processor-labs-prompt-guard image |
processorLabs.promptGuard.image.tag | string | | Version tag for the processor-labs-prompt-guard image |
processorLabs.promptGuard.imagePullSecrets | list | | Array of imagePullSecrets for pulling processor-labs-prompt-guard image from private registries |
processorLabs.promptGuard.nodeSelector | object | | Node selector for scheduling the processor-labs-prompt-guard pods |
processorLabs.promptGuard.replicas | int | | Number of replicas for the processor-labs-prompt-guard deployment |
processorLabs.promptGuard.resources | object | | Resource requests and limits for the processor-labs-prompt-guard pods |
processorLabs.promptGuard.securityContext | object | | Security context for the processor-labs-prompt-guard deployment |
processorLabs.promptGuard.service.annotations | object | | Annotations for the service |
processorLabs.promptGuard.service.clusterIP | string | | When |
processorLabs.promptGuard.service.enabled | bool | | Enable the service |
processorLabs.promptGuard.service.externalTrafficPolicy | string | | When |
processorLabs.promptGuard.service.loadBalancerIP | string | | When using |
processorLabs.promptGuard.service.port | int | | Port for the service |
processorLabs.promptGuard.service.type | string | | Type of the service |
processorLabs.promptGuard.tls.enabled | bool | | Enable serving HTTPS for the processor-labs-prompt-guard deployment |
processorLabs.promptGuard.tls.secretName | string | | Name of the secret that contains the TLS data |
processorLabs.promptGuard.tolerations | list | | Tolerations for the processor-labs-prompt-guard pods |
processorLabs.promptGuard.volumeMounts | list | | Additional volume mounts for the processor-labs-prompt-guard deployment |
processorLabs.promptGuard.volumes | list | | Additional volumes for the processor-labs-prompt-guard deployment |
processors.f5.affinity | object | | Affinity rules for the aigw-processors-f5 pods |
processors.f5.annotations | object | | Annotations for the aigw-processors-f5 pods |
processors.f5.containerSecurityContext | object | | Security context for the aigw-processors-f5 pods |
processors.f5.enabled | bool | | Enable the F5 processors (aigw-processors-f5) application |
processors.f5.env | list | | Configure additional environment variables for the aigw-processors-f5 deployment |
processors.f5.gpu.enabled | bool | | Enable GPU usage for supported processors in the aigw-processors-f5 deployment. Should be used along with setting a request for |
processors.f5.image.pullPolicy | string | | |
processors.f5.image.repository | string | | Repository for the aigw-processors-f5 image |
processors.f5.image.tag | string | | Version tag for the aigw-processors-f5 image |
processors.f5.imagePullSecrets | list | | Array of imagePullSecrets for pulling aigw-processors-f5 image from private registries |
processors.f5.nodeSelector | object | | Node selector for scheduling the aigw-processors-f5 pods |
processors.f5.replicas | int | | Number of replicas for the aigw-processors-f5 deployment |
processors.f5.resources | object | | Resource requests and limits for the aigw-processors-f5 container |
processors.f5.securityContext | object | | Security context for the aigw-processors-f5 deployment |
processors.f5.service.annotations | object | | Annotations for the service |
processors.f5.service.clusterIP | string | | When |
processors.f5.service.enabled | bool | | Enable the service |
processors.f5.service.externalTrafficPolicy | string | | When |
processors.f5.service.loadBalancerIP | string | | When using |
processors.f5.service.port | int | | Port for the service |
processors.f5.service.type | string | | Type of the service |
processors.f5.tls.enabled | bool | | Enable serving HTTPS for the aigw-processors-f5 deployment |
processors.f5.tls.secretName | string | | Name of the secret that contains the TLS data |
processors.f5.tolerations | list | | Tolerations for the aigw-processors-f5 pods |
processors.f5.volumeMounts | list | | Additional volume mounts for the aigw-processors-f5 deployment |
processors.f5.volumes | list | | Additional volumes for the aigw-processors-f5 deployment |
serviceAccount.annotations | object | | Annotations for the AI Gateway service account |
serviceAccount.create | bool | | Enable creation of the AI Gateway service account |
serviceAccount.name | string | | Service account name to be used |
tracing.endpoint | string | | OpenTelemetry gRPC endpoint to export traces to |
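To see the defaults that ship with the chart and to override a few of the parameters listed above, something like the following may be useful. It is a sketch under assumptions: the aigw-overrides.yaml file name and both helper functions are hypothetical, and the replica count and exporter settings are example values chosen only to illustrate how the keys nest, not recommendations.

```shell
#!/usr/bin/env bash
# Illustrative helper: print the chart's default values from the OCI registry
# (requires a prior `helm registry login`).
show_aigw_defaults() {
  helm show values oci://private-registry.f5.com/aigw/charts/aigw
}

# Example overrides (hypothetical values) using keys from the table above.
cat > aigw-overrides.yaml <<'EOF'
aigw:
  replicas: 2
  exporter:
    enabled: true
    type: stdout
EOF

# Illustrative helper: install or upgrade the release with the overrides,
# keeping the pull secret from the original install command.
apply_aigw_overrides() {
  helm upgrade --install aigw oci://private-registry.f5.com/aigw/charts/aigw \
    -n ai-gateway \
    -f aigw-overrides.yaml \
    --set "imagePullSecrets[0].name=f5-registry-secret"
}

# Call the helpers when you are ready:
# show_aigw_defaults
# apply_aigw_overrides
```

Dotted keys in the table map to nested YAML: aigw.exporter.type becomes type under exporter under aigw.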
GPU support configuration¶
Some processors benefit from improved performance when deployed with access to a GPU. More information is available in the processor GPU support section.
Upgrade the chart¶
Upgrade the chart from the OCI registry¶
To upgrade the release aigw, run:
helm upgrade aigw oci://private-registry.f5.com/aigw/charts/aigw -n ai-gateway
Note
If you have used a different release name, replace aigw with the name you used.
This will upgrade to the latest stable release.
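Around an upgrade it can be useful to look at the release history and to keep a rollback path. The helpers below are a sketch, not part of the official procedure: the function names are hypothetical, and the revision number passed to rollback_aigw is whatever helm history reports for the release you want to return to.

```shell
#!/usr/bin/env bash
# Illustrative helper: list the revisions of the aigw release.
aigw_history() {
  helm history aigw -n ai-gateway
}

# Illustrative helper: upgrade and wait until the new pods are ready.
upgrade_aigw() {
  helm upgrade aigw oci://private-registry.f5.com/aigw/charts/aigw \
    -n ai-gateway \
    --wait --timeout 5m
}

# Illustrative helper: roll back to a given revision, for example `rollback_aigw 1`.
rollback_aigw() {
  helm rollback aigw "$1" -n ai-gateway --wait
}

# Typical sequence:
# aigw_history
# upgrade_aigw
# rollback_aigw <revision>   # only if the upgrade misbehaves
```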
Uninstall the chart¶
To uninstall and delete the release aigw, run:
helm uninstall aigw -n ai-gateway
kubectl delete ns ai-gateway
Warning
These commands will delete all resources associated with the release, including the namespace. Ensure you have backed up any data you want to keep before running.