Configure profiles

The profiles section is where the primary pipeline for a request and response is defined. It defines a set of rules that determine the following:

  • The processors which will be applied to the request

  • The AI model to which the request should be routed

  • The processors which will be applied to the output of the AI model.

You can use headers and JWT claims to select the appropriate profile based on the request. See the Use JWTs and request headers for profile selection section of the policies page for more information.

Profile configuration

Each profile has the following top-level settings:

  • name (required): The name of the profile. Used in the policies section of the configuration to attach the profile to a policy. Example: name: my-profile

  • inputStages (optional): Defines the processors that apply to the input before it is sent to the AI model.

  • services (required): Defines the services that process the request. This section is where requests are routed to AI models.

  • responseStages (optional): Defines the processors that apply to the output before it is sent to the client.

  • models (optional): The list of supported models. If the list is configured, every client request must include a model from this list; otherwise, AI Gateway returns a “404 Not Found” response. If the list is empty, client requests must not indicate a specific model; otherwise, AI Gateway returns a “400 Bad Request” response.
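
Taken together, a minimal profile might look like the sketch below. The processor and service names are placeholders borrowed from the examples later on this page and must match entries defined elsewhere in your configuration:

profiles:
  - name: my-profile
    inputStages:
      - name: protect
        steps:
          - name: prompt-injection   # a processor defined in the processors section
    services:
      - name: ollama/llama3          # a service defined elsewhere in the configuration
    responseStages:
      - name: check-output
        steps:
          - name: repetition-detect
    models:
      - name: llama3                 # optional: restricts which models clients may request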

Stage configuration

A stage is a group of steps, where each step involves a processor call. Stages under the inputStages key define groups of checks to be performed on the request. Stages under the responseStages key define groups of checks to be performed on the response.

Each of the inputStages and responseStages sections includes the following settings:

  • name (required): The name of the stage. This name is for reference only and is not used elsewhere in the configuration. Example: name: analyze

  • concurrency (optional): By default, the processors in a stage run one after another. Setting concurrency to parallel runs them concurrently. Example: concurrency: parallel

  • selector (optional): An operand and a selection criteria. Tag selectors and Model selectors are supported here. Tags are added by processors.

  • steps (required): The list of processors to apply to the input or output.

Concurrency and ordering

Stages are the unit of concurrency in a profile. Stages are always executed in order. However, the steps defined inside a stage may run in parallel or sequentially based on the concurrency setting described above.

Note

If the processors in a stage run with the parallel concurrency option, they cannot be configured with modify: true in their parameters. They can still add tags (annotate: true) and reject (reject: true).
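
For example, a stage like the following sketch runs its two steps at the same time. The stage name inspect is arbitrary, and the processor names are reused from the examples on this page; whether running them in parallel is appropriate depends on how those processors are configured:

inputStages:
  - name: inspect
    concurrency: parallel        # run the steps below concurrently
    steps:
      - name: language-id        # may annotate (add tags) or reject, but not modify
      - name: prompt-injection   # may annotate (add tags) or reject, but not modify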

Conditionals

Stages may be entered or skipped conditionally based on tags added to the request or response by processors:

inputStages:
  - name: analyze
    steps:
      - name: language-id
  - name: enforce-prompt-en
    selector:
      tags:
        - language:en
    steps:
      - name: system-prompt

In the above example:

  1. The language-id processor is invoked as part of the “analyze” stage. It detects the language of the request and, in this example, tags it with language:en.

  2. The enforce-prompt-en stage is invoked only if the request has the language:en tag. You might want to do something like this if the system-prompt processor adds a system prompt that is language specific.

Stages can also be selected based on the model specified in the request body when using model-based routing. See Selecting a service or stage based on model below for an example.

Overriding processor parameters in a step

All processors, including their default parameters, must be defined in the processors section of the configuration. Processors referenced in a step will run according to those default parameters:

steps:
  - name: system-prompt

However, it is possible to override keys in params in the step context as well:

steps:
  - name: system-prompt
    params:
      - rules:
          - |
            Reply in English.

The steps configuration in this example overrides the rules param for this specific use of the processor and leaves the rest of the keys under params at their defaults, as previously defined in the processors section.
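
This is useful when the same processor is reused with different parameters in different places. In the sketch below, the system-prompt processor is invoked in two conditional stages, each overriding rules with a language-specific prompt while keeping every other default. The enforce-prompt-fr stage and the French rule text are illustrative additions, not part of the earlier examples:

inputStages:
  - name: enforce-prompt-en
    selector:
      tags:
        - language:en
    steps:
      - name: system-prompt
        params:
          - rules:                # only rules is overridden; other params keep their defaults
              - |
                Reply in English.
  - name: enforce-prompt-fr       # illustrative second stage for French traffic
    selector:
      tags:
        - language:fr
    steps:
      - name: system-prompt
        params:
          - rules:
              - |
                Reply in French.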

Service configuration

Each services section includes the following settings:

  • name (required): The name of the service to use.

  • selector (optional): An operand and a selection criteria. Tag selectors and Model selectors are supported here. Tags are added by processors.

Example

Example profiles section YAML:

profiles:
  - name: llama3

    inputStages:
      - name: protect
        steps:
          - name: prompt-injection

    services:
      - name: ollama/llama3

    responseStages:
      - name: repetition-detect
        steps:
          - name: repetition-detect

  - name: default

    inputStages:
      - name: analyze
        steps:
          - name: language-id
      - name: enforce-prompt-en
        selector:
          tags:
            - language:en
        steps:
          - name: system-prompt

    services:
      - name: ollama/phi
        selector:
          operand: not
          tags:
            - language:es
            - language:fr

      - name: ollama/llama3
        selector:
          tags:
            - language:de

      - name: openai/public
        selector:
          tags:
            - language:en

Selecting a service or stage based on model

AI Gateway can select a service or stage based on the model in the client request body. The model is extracted from the corresponding field of the request, depending on the schema (for example, OpenAI).

In the example below, if the model from the request body is best, AI Gateway will select the service service-best and the prompt-injection processor will be executed on the client input. For any other model, it will select service-rest and no processor will be executed.

profiles:
  - name: default
    inputStages:
      - name: protect
        steps:
          - name: prompt-injection
        selector:
          type: input.model
          values:
            - best
    services:
      - name: service-best
        selector:
          type: input.model
          values:
            - best
      - name: service-rest

Configuring the supported models is optional and can be done as shown in the example below:

profiles:
  - name: default
    models:
      - name: best
      - name: mini
      - name: old
    services:
      - name: service-best
        selector:
          type: input.model
          values:
            - best
      - name: service-mini
        selector:
          type: input.model
          values:
            - mini
      - name: service-old
        selector:
          type: input.model
          values:
            - old

As a result, AI Gateway rejects (returns a “404 Not Found” response) any request whose model doesn’t match one of the configured ones.