Configure profiles¶
The profiles section is where the primary pipeline for a request and response is defined. It defines a set of rules that determine the following:
The processors which will be applied to the request
The AI model to which the request should be routed
The processors which will be applied to the output of the AI model
You can use headers and JWT claims to select the appropriate profile based on the request. See the Use JWTs and request headers for profile selection section of the policies page for more information.
Profile configuration¶
Each profile has the following top-level settings:
Setting | Description | Required | Examples |
---|---|---|---|
name | The name of the profile. Used in the | Yes | |
inputStages | Defines the processors that apply to the input before sending it to the AI model. | No | |
services | Defines the services that process the request. This section is where requests are routed to AI models. | Yes | |
responseStages | Defines the processors that apply to the output before sending it to the client. | No | |
models | The list of supported models. If the list is configured, every client request must include a model from this list; otherwise, AI Gateway will return a “404 NotFound” response. If the list is empty, client requests must not indicate a specific model; otherwise, AI Gateway will return a “400 Bad Request” response. | No | |
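As a rough illustration of how the top-level settings fit together, the sketch below shows a single profile that uses all five keys. The profile name example is hypothetical; the processor, service, and model names are borrowed from the examples later on this page and are assumed to be defined elsewhere in the configuration.

```yaml
profiles:
  - name: example                 # hypothetical profile name
    models:                       # optional list of supported models
      - name: best
    inputStages:                  # optional: processors applied to the request
      - name: protect
        steps:
          - name: prompt-injection
    services:                     # required: where the request is routed
      - name: service-best
    responseStages:               # optional: processors applied to the response
      - name: repetition-detect
        steps:
          - name: repetition-detect
```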
Stage configuration¶
A stage is a group of steps, where each step involves a processor call. Stages under the inputStages key define groups of checks to be performed on the request. Stages under the responseStages key define groups of checks to be performed on the response.
Each of the inputStages and responseStages sections includes the following settings:
Setting | Description | Required | Examples |
---|---|---|---|
name | The name of the stage. This name is for reference only and is not used elsewhere in the configuration. | Yes | |
concurrency | By default, the processors run one after another. Setting this key to parallel runs them in parallel. | No | |
selector | An operand and selection criteria. Tag selectors and Model selectors are supported here. Tags are added by processors. | No | |
steps | The list of processors to apply to the input or output. | Yes | |
Concurrency and ordering¶
Stages are the unit of concurrency in a profile. Stages are always executed in order. However, the steps defined inside a stage may be run in parallel or sequentially based on the concurrency key described in the table above.
Note
If the processors run using the parallel concurrency option, they cannot be configured with modify: true in their parameters. They are allowed to add tags (annotate: true) and reject (reject: true).
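The ordering rules above can be sketched as follows. This is a minimal example that assumes the concurrency key is set at the stage level with the value parallel, as the stage settings table and the note suggest; the stage and processor names are reused from other examples on this page.

```yaml
inputStages:
  - name: protect
    concurrency: parallel        # steps in this stage may run in parallel
    steps:
      # Neither processor may be configured with modify: true in this stage
      - name: prompt-injection
      - name: language-id
  - name: enforce-prompt-en      # starts only after the "protect" stage finishes
    steps:
      - name: system-prompt
```

Because stages always run in order, a processor that needs to modify the request can be placed in a later stage that keeps the default sequential behavior.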
Conditionals¶
Stages may be entered or skipped conditionally based on tags added to the request or response by processors:
inputStages:
  - name: analyze
    steps:
      - name: language-id
  - name: enforce-prompt-en
    selector:
      tags:
        - language:en
    steps:
      - name: system-prompt
In the above example:
The language-id processor is invoked as part of the “analyze” stage. It tags the request with language:en.
The enforce-prompt-en stage is invoked only if the request has the tag language:en. You might want to do this if the system-prompt processor adds a system prompt that might be language-specific.
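A stage selector can also carry an operand. The sketch below is an assumption based on the stage settings table and on the operand: not selector used for a service in the full example later on this page; it assumes the operand behaves the same way on a stage. The stage name enforce-prompt-other is hypothetical.

```yaml
inputStages:
  - name: analyze
    steps:
      - name: language-id
  - name: enforce-prompt-other   # hypothetical stage name
    selector:
      operand: not               # assumed: enter the stage when the tag is absent
      tags:
        - language:en
    steps:
      - name: system-prompt
```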
Stages can also be selected based on the model specified in the request body when using model-based routing.
Overriding processor parameters in a step¶
All processors, including their default parameters, must be defined in the processors section of the configuration.
Processors referenced in a step will run according to those default parameters:
steps:
  - name: system-prompt
However, it is possible to override keys in params in the step context as well:
steps:
  - name: system-prompt
    params:
      - rules:
          - |
            Reply in English.
The steps configuration in the example overrides the rules param for this specific use of the processor and leaves the rest of the keys under params at the defaults previously defined in the processors section.
Service configuration¶
Each services section includes the following settings:
Setting | Description | Required |
---|---|---|
name | The name of the service to use. | Yes |
selector | An operand and selection criteria. Tag selectors and Model selectors are supported here. Tags are added by processors. | No |
Example profiles section YAML:
profiles:
  - name: llama3
    inputStages:
      - name: protect
        steps:
          - name: prompt-injection
    services:
      - name: ollama/llama3
    responseStages:
      - name: repetition-detect
        steps:
          - name: repetition-detect
  - name: default
    inputStages:
      - name: analyze
        steps:
          - name: language-id
      - name: enforce-prompt-en
        selector:
          tags:
            - language:en
        steps:
          - name: system-prompt
    services:
      - name: ollama/phi
        selector:
          operand: not
          tags:
            - language:es
            - language:fr
      - name: ollama/llama3
        selector:
          tags:
            - language:de
      - name: openai/public
        selector:
          tags:
            - language:en
Selecting a service or stage based on model¶
AI Gateway can select a service based on the model from a client request body, which is extracted from the corresponding field depending on the schema (for example, OpenAI).
In the example below, if the model from the request body is best, AI Gateway will select the service service-best and the prompt-injection processor will be executed on the client input. For any other model, it will select service-rest and no processor will be executed.
profiles:
  - name: default
    inputStages:
      - name: protect
        steps:
          - name: prompt-injection
        selector:
          type: input.model
          values:
            - best
    services:
      - name: service-best
        selector:
          type: input.model
          values:
            - best
      - name: service-rest
Configuring the supported models is optional and can be done as shown in the example below:
profiles:
  - name: default
    models:
      - name: best
      - name: mini
      - name: old
    services:
      - name: service-best
        selector:
          type: input.model
          values:
            - best
      - name: service-mini
        selector:
          type: input.model
          values:
            - mini
      - name: service-old
        selector:
          type: input.model
          values:
            - old
As a result, AI Gateway rejects any request whose model doesn’t match one of the configured models and returns a “404 NotFound” response.
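The two examples in this section can plausibly be combined: a models list to restrict what clients may request, plus a service without a selector as a fallback. The sketch below is an inference from those examples rather than a confirmed configuration, and the comments at the end describe the behavior expected under that assumption.

```yaml
profiles:
  - name: default
    models:
      - name: best
      - name: mini
    services:
      - name: service-best
        selector:
          type: input.model
          values:
            - best
      - name: service-rest       # no selector: assumed to take any other listed model

# Expected routing under these assumptions:
#   request model "best"  -> service-best
#   request model "mini"  -> service-rest
#   request model "other" -> rejected with "404 NotFound" (not in the models list)
```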