Configure profiles¶

The profiles section is where the primary pipeline for a request and its response is defined. Each profile defines a set of rules that determine the following:

The processors that are applied to the request.
The AI model to which the request should be routed.
The processors that are applied to the output of the AI model.

You can use headers and JWT claims to select the appropriate profile based on the request. See the Use JWTs and request headers for profile selection section of the policies page for more information.

Profile configuration¶

Each profile has the following top-level settings:

Setting	Description	Required	Examples
`name`	The name of the profile. Used in the `policies` section of the configuration to attach the profile to a policy.	Yes	`name: my-profile`
`inputStages`	Defines the processors that apply to the input before sending it to the AI model.	No
`services`	Defines the services that process the request. This section is where requests are routed to AI models.	Yes
`responseStages`	Defines the processors that apply to the output before sending it to the client.	No
`models`	The list of supported models. If the list is configured, every client request must include a model from this list; otherwise, AI Gateway returns a “404 NotFound” response. If the list is empty, client requests must not specify a model; otherwise, AI Gateway returns a “400 Bad Request” response.	No

Stage configuration¶

A stage is a group of steps, where each step involves a processor call. Stages under the inputStages key define groups of checks to be performed on the request. Stages under the responseStages key define groups of checks to be performed on the response.

Each of the inputStages and responseStages sections includes the following settings:

Setting	Description	Required	Examples
`name`	The name of the stage. This name is for reference only and is not used elsewhere in the configuration.	Yes	`name: analyze`
`concurrency`	By default, the processors run one after another. Setting `concurrency` to `parallel` runs the processors concurrently.	No	`concurrency: parallel`
`selector`	An operand and selection criteria. Tag selectors and model selectors are supported here. Tags are added by processors.	No
`steps`	The list of processors to apply to the input or output.	Yes

Note

A selectable resource that has no selector defined always evaluates to true.

Selector matching stops when the first match is found. If you see unexpected behavior, check the ordering of your selectors and make sure that the most specific selectors are listed first.

See selector logic and ordering for full details.
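As a minimal sketch of the ordering rule, the fragment below lists a tag-restricted service before a service with no selector. The service names are reused from other examples on this page and are only illustrative; both are assumed to be defined elsewhere in the configuration.

```yaml
services:
  # Most specific selector first: only requests tagged language:en match here.
  - name: openai/public
    selector:
      tags:
        - language:en
  # No selector is defined, so this entry always evaluates to true
  # and acts as the catch-all for every other request.
  - name: ollama/llama3
```

If the two entries were swapped, the entry without a selector would match every request first, and the language:en entry would never be reached.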

Concurrency and ordering¶

Stages are the unit of concurrency in a profile. Stages are always executed in order. However, the steps defined inside a stage may run in parallel or sequentially, depending on the concurrency key described in the Stage configuration table above.

Note

Processors that run in a stage using the parallel concurrency option cannot be configured with modify: true in their parameters. They are still allowed to add tags (annotate: true) and reject (reject: true).
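The following fragment is a sketch of how a parallel stage might sit alongside a sequential one. The processor names are reused from other examples on this page, and the fragment assumes that neither language-id nor prompt-injection is configured with modify: true.

```yaml
inputStages:
  - name: analyze
    # Run both steps at the same time instead of one after another.
    concurrency: parallel
    steps:
      - name: language-id
      - name: prompt-injection
  # Stages always run in order, so this stage starts only after "analyze" completes.
  # No concurrency key is set, so its steps run one after another (the default).
  - name: enforce-prompt-en
    selector:
      tags:
        - language:en
    steps:
      - name: system-prompt
```

Keeping the processor that changes the request in a separate, sequential stage is one way to stay within the restriction on modify: true.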

Conditionals¶

Stages may be entered or skipped conditionally based on tags added to the request or response by processors:

inputStages:
  - name: analyze
    steps:
      - name: language-id
  - name: enforce-prompt-en
    selector:
      tags:
        - language:en
    steps:
      - name: system-prompt

In the above example:

The language-id processor is invoked as part of the “analyze” stage. It tags the request with the detected language, for example language:en.
The enforce-prompt-en stage is invoked only if the request has the tag language:en. This is useful when the system-prompt processor adds a system prompt that is language-specific.

Stages can also be selected based on the model specified in the request body when you use model-based routing.

Overriding processor parameters in a step¶

All processors, including their default parameters, must be defined in the processors section of the configuration. Processors referenced in a step will run according to those default parameters:

steps:
  - name: system-prompt

However, it is possible to override keys in params in the step context as well:

steps:
  - name: system-prompt
    params:
      rules:
        - |
          Reply in English.

The step in this example overrides the rules parameter for this specific use of the processor. All other keys under params keep the defaults defined in the processors section.

Service configuration¶

Each services section includes the following settings:

Setting	Description	Required
`name`	The name of the service to use.	Yes
`selector`	An operand and selection criteria. Tag selectors and model selectors are supported here. Tags are added by processors.	No
`fallbackServiceName`	The name of a fallback service to use if the primary service (defined by the `name` field) fails.	No
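As an illustration of how these settings might be combined, the sketch below attaches both a selector and a fallback to a single service entry. The pairing is hypothetical: the service names are reused from other examples on this page, and both services are assumed to be defined elsewhere in the configuration.

```yaml
services:
  - name: openai/public
    # Route only requests tagged language:en to this service.
    selector:
      tags:
        - language:en
    # Hypothetical pairing: used only if openai/public fails.
    fallbackServiceName: ollama/llama3
```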

Example configuration¶

Example profiles section YAML:

profiles:
  - name: llama3

    inputStages:
      - name: protect
        steps:
          - name: prompt-injection

    services:
      - name: ollama/llama3

    responseStages:
      - name: repetition-detect
        steps:
          - name: repetition-detect

  - name: default

    inputStages:
      - name: analyze
        steps:
          - name: language-id
      - name: enforce-prompt-en
        selector:
          tags:
            - language:en
        steps:
          - name: system-prompt

    services:
      - name: ollama/phi
        selector:
          operand: not
          tags:
            - language:es
            - language:fr

      - name: ollama/llama3
        selector:
          tags:
            - language:de

      - name: openai/public
        selector:
          tags:
            - language:en

Selecting a service or stage based on model¶

AI Gateway can select a service or stage based on the model named in the client request body. The model is extracted from the corresponding field of the request, depending on the schema (for example, OpenAI).

In the example below, if the model from the request body is best, AI Gateway will select the service service-best and the prompt-injection processor will be executed on the client input. For any other model, it will select service-rest and no processor will be executed.

profiles:
  - name: default
    inputStages:
      - name: protect
        steps:
          - name: prompt-injection
        selector:
          type: input.model
          values:
            - best
    services:
      - name: service-best
        selector:
          type: input.model
          values:
            - best
      - name: service-rest

Configuring the supported models is optional and can be done as shown in the example below:

profiles:
  - name: default
    models:
      - name: best
      - name: mini
      - name: old
    services:
      - name: service-best
        selector:
          type: input.model
          values:
            - best
      - name: service-mini
        selector:
          type: input.model
          values:
            - mini
      - name: service-old
        selector:
          type: input.model
          values:
            - old

As a result, AI Gateway rejects any request whose model does not match one of the configured models and returns a “404 NotFound” response.

Fallback service configuration¶

The fallbackServiceName field specifies a fallback service to use if the primary service fails (for example, due to network errors or billing problems).

In the following example, AI Gateway sends requests to openai by default. If openai fails, it will send the requests to ollama.

profiles:
  - name: default
    services:
      - name: openai
        fallbackServiceName: ollama
