Overview¶
F5 AI Gateway routes generative AI traffic to an appropriate Large Language Model (LLM) backend and protects that traffic against common threats. This includes:
Inspecting and filtering client requests and LLM responses
Preventing malicious inputs from reaching an LLM backend
Ensuring that LLM responses are safe to send to clients
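For example, a client might send an OpenAI-style chat completion request to the gateway instead of calling the model provider directly. The sketch below is illustrative only: the hostname, path, and request schema are placeholders, not a documented AI Gateway API.

```python
import requests  # third-party HTTP client

# Hypothetical gateway endpoint; the real hostname, path, and request
# schema depend on how you expose and configure AI Gateway.
GATEWAY_URL = "https://ai-gateway.example.com/v1/chat/completions"

response = requests.post(
    GATEWAY_URL,
    headers={"Authorization": "Bearer <client-token>"},
    json={
        "model": "my-llm",
        "messages": [{"role": "user", "content": "Summarize our refund policy."}],
    },
    timeout=30,
)

# If a processor rejects the request (for example, suspected prompt
# injection), the gateway returns an error instead of an LLM response.
response.raise_for_status()
print(response.json())
```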
Components¶
AI Gateway is deployed on Kubernetes.
AI Gateway consists of two primary components:
Core: A specialized proxy for generative AI traffic that uses one or more processors to enable traffic protection.
Processors: Provide AI-related protection by inspecting client requests and LLM backend responses.
Additionally, we recommend using NGINX Ingress Controller in front of AI Gateway to expose it to your clients.
Core¶
The AI Gateway core handles HTTP(S) requests destined for an LLM backend. It performs the following tasks:
Performs authentication and authorization checks, such as validating JWTs and inspecting request headers.
Parses and performs basic validation on client requests.
Applies processors to incoming requests, which may modify or reject the request.
Selects an appropriate LLM backend and routes each request to it, transforming the request to match the backend's schema and the response to match the client's schema.
Applies processors to the response from the LLM backend, which may modify or reject the response.
Optionally, stores an auditable record of every request/response and the specific activity of each processor. These records can be exported to AWS S3 or S3-compatible storage.
Generates and exports observability data via OpenTelemetry.
Provides a configuration interface (using a config file).
Note
In this documentation, we often use “the core” and “AI Gateway” interchangeably.
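The following sketch summarizes the request lifecycle described above. The type and function names are invented for illustration and do not mirror AI Gateway's internal API or configuration format; authentication, schema transformation, audit logging, and OpenTelemetry export are omitted to keep it short.

```python
from dataclasses import dataclass, field
from typing import Callable

# Illustrative types only; these names are placeholders for this sketch.
@dataclass
class Message:
    content: str
    tags: dict = field(default_factory=dict)
    rejected: bool = False
    reason: str = ""

# A "processor" here is any callable that inspects a message and may
# modify, annotate, or reject it.
Processor = Callable[[Message], Message]

def handle_request(prompt: Message,
                   input_processors: list[Processor],
                   output_processors: list[Processor],
                   call_backend: Callable[[Message], Message]) -> Message:
    """Simplified view of the core's request/response pipeline."""
    # Apply processors to the incoming client request.
    for process in input_processors:
        prompt = process(prompt)
        if prompt.rejected:
            return Message(content=f"request blocked: {prompt.reason}")

    # Route to an LLM backend; selection could use processor-added tags.
    response = call_backend(prompt)

    # Apply processors to the LLM backend's response.
    for process in output_processors:
        response = process(response)
        if response.rejected:
            return Message(content=f"response blocked: {response.reason}")

    return response
```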
Processors¶
A processor runs separately from the core and can perform one or more of the following actions on a request or response:
Modify: A processor may rewrite a request or response, for example by redacting credit card numbers.
Reject: A processor may reject a request or response, causing the core to halt processing of the given request/response.
Annotate: A processor may add tags or metadata to a request/response, providing additional information to the administrator. The core can also select the LLM backend based on these tags.
Each processor provides specific protection, transformation, or classification capabilities to AI Gateway. For example, a processor can detect prompt injection attacks and block an incoming request. AI Gateway includes several processors.
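To make the three actions concrete, here is a toy processor in the same illustrative style: it rejects input matching a crude prompt-injection pattern, redacts anything that looks like a credit card number (modify), and records what it did as tags (annotate). The patterns and result type are placeholders, not the logic or interface of any shipped AI Gateway processor, which runs as a separate service and exchanges results with the core over the network.

```python
import re
from dataclasses import dataclass, field

# Toy result type for this sketch only.
@dataclass
class ProcessorResult:
    text: str
    tags: dict = field(default_factory=dict)
    rejected: bool = False
    reason: str = ""

CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")
INJECTION_PATTERN = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

def example_processor(text: str) -> ProcessorResult:
    """Illustrative processor that may modify, annotate, or reject a prompt."""
    # Reject: crude prompt-injection heuristic.
    if INJECTION_PATTERN.search(text):
        return ProcessorResult(text=text, rejected=True,
                               reason="possible prompt injection")

    # Modify: redact anything that looks like a card number.
    redacted = CARD_PATTERN.sub("[REDACTED]", text)

    # Annotate: record what happened so the core or an administrator can
    # act on it, for example when selecting an LLM backend.
    tags = {"pii.redacted": redacted != text}
    return ProcessorResult(text=redacted, tags=tags)
```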
What’s next¶
Follow the Quick start guide to get started with AI Gateway.
Learn how to Configure the AI Gateway.
Review the Terminology for terms used in this documentation.