# Configure services
Services define the upstream LLMs to which the AI Gateway can send traffic. Services are declared under the top-level `services` key in the configuration file. Once declared, a service must be referenced by name under the `services` key in a profile in order to start receiving traffic. See Configure Profiles for more details.
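As an illustrative sketch only (the exact profile schema is described in Configure Profiles; treat the `profiles` block below as an assumption, not a definitive reference), wiring a declared service into a profile might look like this:

```yaml
# A service is declared under the top-level services key...
services:
  - name: openai/public
    type: gpt-4o
    executor: openai

# ...and then referenced by name from a profile so it receives traffic.
# (Any profile fields beyond the services list are omitted here.)
profiles:
  - name: default
    services:
      - openai/public
```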
## Declaring a service
Each service takes the following four settings in the configuration file:

| Setting | Description | Required | Examples |
|---|---|---|---|
| `name` | The name of the service. Used to reference the service in the configuration file. User-defined and must be unique among the services. | Yes | `openai/public` |
| `type` | The model to use. See OpenAI executor, Ollama executor, and Anthropic executor for guidance on finding valid values. For OpenAI services running on Azure, use the value `azure`. | Yes | `gpt-4o`, `azure` |
| `executor` | The executor prepares the request for a specific model's schema and performs the actual request to the model. | Yes | `openai`, `ollama`, `anthropic` |
| `config` | The configuration for the executor, allowing additional key-value pairs to be passed that further configure the executor. | No | See Configuring the executor. |
Here is a full example of the most basic service definition:

```yaml
services:
  - name: openai/public
    type: gpt-4o
    executor: openai
```
## Configuring the executor
The `config` setting is optional and can be used to pass additional key-value pairs to the executor.
| Config key | Description | Required | Examples |
|---|---|---|---|
| `endpoint` | The endpoint URL of the service. | No | `https://api.openai.com/v1/chat/completions` |
| `secrets` | Defines the source and names of the secrets needed by the service. See Configuring secrets. | No | See Configuring secrets. |
| | The path to a CA certificate for the service. | No | |
| | The minimum acceptable TLS version. Defaults to v1.3. | No | |
| | The path to the service client's certificate. Use when setting up mTLS between AI Gateway and the service. | No | |
| | The path to the service client's private key. Use when setting up mTLS between AI Gateway and the service. | No | |
A few `config` keys are specific to particular executors; these are covered in the executor sections below.
## Configuring secrets
The `secrets` key in the `config` object is a list of objects, each with the following keys:
| Config key | Description | Examples |
|---|---|---|
| `source` | The SecretSource used to retrieve the secrets. See the table below for a detailed description of valid SecretSource types. | `EnvVar`, `File`, `DotEnv` |
| `targets` | A list of key-value pairs representing secret values to be loaded from the SecretSource. The key is the target name in AI Gateway and the value is the name of the secret in the SecretSource. | `apiKey: OPENAI_API_KEY` |
| `meta` | Used to provide metadata to the SecretSource. Different sources have different needs; e.g. the DotEnv SecretSource needs the path to the `.env` file. | `path: /etc/aigw/secrets/.env` |
The available SecretSource types are:

| SecretSource | Description | Valid `meta` keys |
|---|---|---|
| `DotEnv` | Load secrets from a `.env` file. | `path` |
| `EnvVar` | Load secrets from environment variables. | Does not use `meta`. |
| `File` | Load secrets from a file. | `path` |
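As a sketch, a `DotEnv` source that maps the gateway's `apiKey` target to a variable named `MY_API_KEY` (a hypothetical variable name chosen for illustration) would be declared like this:

```yaml
secrets:
  - source: DotEnv
    targets:
      apiKey: MY_API_KEY   # MY_API_KEY is looked up in the .env file below
    meta:
      path: /etc/aigw/secrets/.env
```

with the referenced `.env` file containing a line such as `MY_API_KEY=<your key>`.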
## Example
```yaml
services:
  - name: openai/public
    type: gpt-4o
    executor: openai
    config:
      endpoint: "https://api.openai.com/v1/chat/completions"
      secrets:
        - source: EnvVar
          targets:
            apiKey: OPENAI_API_KEY
  - name: openai/azure
    type: azure
    executor: openai
    config:
      endpoint: "https://myvm.openai.azure.com/openai/deployments/chat/completions"
      apiVersion: 2024-02-15-preview
      secrets:
        - source: File
          targets:
            apiKey: AZURE_API_KEY
          meta:
            path: /etc/aigw/secrets/
  - name: anthropic/sonnet
    type: claude-3-sonnet-20240229
    executor: anthropic
    config:
      anthropicVersion: 2023-06-01
      secrets:
        - source: DotEnv
          targets:
            apiKey: ANTHROPIC_API
          meta:
            path: /etc/aigw/secrets/.env
  - name: ollama/llama3
    type: llama3
    executor: ollama
  - name: ollama/phi
    type: phi3
    executor: ollama
```
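Assuming the on-disk layout implied by the example above (the exact file-lookup behavior of each SecretSource is an assumption here, not documented in this section), the `File` and `DotEnv` sources would read secrets roughly as follows:

```
/etc/aigw/secrets/
├── AZURE_API_KEY    # File source: a file named after the secret, containing the key
└── .env             # DotEnv source: contains a line like ANTHROPIC_API=<your key>
```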
The example shows the declaration of services. Declared services must be referenced by name under the `services` key in a profile in order to start receiving traffic. See Configure Profiles for more details.
## OpenAI executor
> **Note:** You can use the OpenAI executor to set up models served by the OpenAI platform, as well as models served by platforms with OpenAI Chat Completions-compliant APIs.

Supported API: OpenAI Chat Completions

Supported models (`type`): If using the OpenAI Platform, see OpenAI Models. If using an OpenAI-compatible model API, check with your platform or LLM maintainer for valid values.
To configure an OpenAI service, you need to provide the API key and the model name. The following example shows how to configure the OpenAI GPT-4o service in the configuration file:

```yaml
services:
  - name: openai/public
    type: gpt-4o
    executor: openai
    config:
      endpoint: "https://api.openai.com/v1/chat/completions"
      secrets:
        - source: EnvVar
          targets:
            apiKey: OPENAI_API_KEY
```
### OpenAI executor-specific configuration
Under the `config` key, the following keys may be supplied:
| Config key | Description | Required | Examples |
|---|---|---|---|
| `apiVersion` | The Azure OpenAI API version. Set only if the service `type` is `azure`. | No | `2024-02-15-preview` |
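For example, the Azure service from the full example above sets `apiVersion` alongside the Azure deployment endpoint:

```yaml
services:
  - name: openai/azure
    type: azure
    executor: openai
    config:
      endpoint: "https://myvm.openai.azure.com/openai/deployments/chat/completions"
      apiVersion: 2024-02-15-preview
      secrets:
        - source: File
          targets:
            apiKey: AZURE_API_KEY
          meta:
            path: /etc/aigw/secrets/
```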
## Ollama executor
Supported API: Ollama Chat API

Supported models (`type`): Ollama Models
To configure an Ollama service, you need a running Ollama instance, either local or remote. The following example shows how to configure the Ollama service using the Phi3 model in the configuration file:

```yaml
services:
  - name: ollama/phi
    type: phi3
    executor: ollama
    config:
      endpoint: "http://OLLAMA_HOST:11434/api/chat"
```
If running Ollama locally, see this document to understand how to configure your Ollama endpoint.
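For a local Ollama instance, the endpoint would typically point at the local host (assuming Ollama is listening on its default port, 11434):

```yaml
services:
  - name: ollama/phi
    type: phi3
    executor: ollama
    config:
      endpoint: "http://127.0.0.1:11434/api/chat"
```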
## Anthropic executor
Supported API: Anthropic Messages API

Supported models (`type`): Anthropic Models
To configure an Anthropic service, you need to provide the API key, the model name, and the Anthropic version. The following example shows how to configure the Anthropic service using the Claude 3.5 Sonnet model in the configuration file:

```yaml
services:
  - name: anthropic/sonnet
    type: claude-3-5-sonnet-20240620
    executor: anthropic
    config:
      endpoint: "https://api.anthropic.com/v1/messages"
      anthropicVersion: 2023-06-01
      secrets:
        - source: EnvVar
          targets:
            apiKey: ANTHROPIC_API_KEY
```
### Anthropic executor-specific configuration
Under the `config` key, the following keys may be supplied:
| Config key | Description | Required | Examples |
|---|---|---|---|
| `anthropicVersion` | The version of the Anthropic API to use. | No | `2023-06-01` |