DNS resolving and load balancing of processors and services
This document explains how AI Gateway resolves the endpoints of processors and services and describes whether AI Gateway does any load balancing among the resolved IP addresses.
Endpoint configuration
When configuring a processor or a service, you must specify the endpoint. The endpoint can be either an IP address or a DNS name.
Example of a service with a DNS name endpoint:
services:
  - name: openai/public
    type: gpt-4o
    executor: openai
    config:
      endpoint: https://api.openai.com/v1/chat/completions
  . . .
Example of a processor with an IP address endpoint:
processors:
  - name: prompt-injection
    type: external
    config:
      endpoint: https://10.0.0.1:8000
  . . .
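The request-handling steps described later depend on whether the endpoint's host is an IP address or a DNS name. The following is a minimal sketch of that distinction using only the Python standard library. The function name `endpoint_host_kind` is hypothetical, and this is not a description of how AI Gateway itself parses endpoints.

```python
import ipaddress
from urllib.parse import urlsplit


def endpoint_host_kind(endpoint: str) -> str:
    """Return "ip" if the endpoint's host is an IP literal, otherwise "dns".

    Hypothetical helper for illustration only.
    """
    host = urlsplit(endpoint).hostname
    if host is None:
        raise ValueError(f"endpoint has no host: {endpoint!r}")
    try:
        # ip_address() accepts both IPv4 and IPv6 literals.
        ipaddress.ip_address(host)
    except ValueError:
        return "dns"
    return "ip"


print(endpoint_host_kind("https://api.openai.com/v1/chat/completions"))  # dns
print(endpoint_host_kind("https://10.0.0.1:8000"))  # ip
```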
How AI Gateway sends requests to services or processors
When AI Gateway handles a client request and needs to forward the request to a processor or a service, the following occurs:
- If the processor/service endpoint is an IP address:
  - If an available keep-alive connection to that address already exists, AI Gateway reuses it to send the request.
  - If no connection exists:
    - AI Gateway attempts to establish a new connection.
    - If the connection attempt fails, the request fails.
    - If the connection attempt succeeds, AI Gateway sends the request through this connection and adds the connection to the keep-alive connection pool.
- If the endpoint is a DNS name (host):
  - If an available keep-alive connection to that host already exists, AI Gateway reuses it to send the request.
  - If no connection exists:
    - AI Gateway resolves the host using the system resolver.
    - It attempts to establish a connection to the first IP address returned by the resolver. If this attempt fails, it tries the next IP address, and so on, until it successfully establishes a connection or tries all addresses. If all attempts fail or the route’s timeoutSeconds expires, the request fails.
    - Once AI Gateway establishes a connection, it sends the request through this connection and adds the connection to the keep-alive pool.
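The address fallback for DNS name endpoints can be sketched as follows. This is an illustrative Python approximation, not AI Gateway's implementation: the function names `resolve` and `connect_first_reachable` are hypothetical, and treating the route's timeoutSeconds as a single deadline across all attempts is an assumption. The demo uses a stub connect function so that it runs without network access.

```python
import socket
import time
from typing import Callable, Iterable, List, Optional, Tuple

Address = Tuple[str, int]


def resolve(host: str, port: int) -> List[Address]:
    """Resolve a host through the system resolver, keeping the returned order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # sockaddr is (ip, port) for IPv4 and (ip, port, flowinfo, scope_id) for IPv6.
    return [info[4][:2] for info in infos]


def connect_first_reachable(
    addresses: Iterable[Address],
    timeout_seconds: float,
    connect: Callable[[Address, float], object] = socket.create_connection,
) -> object:
    """Try each address in order until one connects or the deadline passes."""
    deadline = time.monotonic() + timeout_seconds  # assumed overall deadline
    last_error: Optional[Exception] = None
    for address in addresses:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            return connect(address, remaining)
        except OSError as exc:
            last_error = exc  # remember the failure and try the next address
    raise ConnectionError("no address reachable before the deadline") from last_error


# Demo with a stub: only the third (hypothetical) address accepts connections.
attempted = []


def fake_connect(address: Address, timeout: float) -> str:
    attempted.append(address)
    if address[0] != "10.0.0.3":
        raise OSError("unreachable")
    return "connection to %s:%d" % address


candidates = [("10.0.0.1", 8000), ("10.0.0.2", 8000), ("10.0.0.3", 8000)]
print(connect_first_reachable(candidates, 5.0, fake_connect))
# connection to 10.0.0.3:8000
```

With a real host, `connect_first_reachable(resolve(host, port), timeout_seconds)` would attempt the resolved addresses in the order the resolver returned them.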
Keep-alive connections
AI Gateway supports HTTP keep-alive connections if processors and services also support them. F5 processors support keep-alive connections.
Warning
Known Issue: AI Gateway has a bug that breaks keep-alive connections.
System resolver
AI Gateway uses the default Kubernetes DNS server to resolve hostnames. See the Kubernetes documentation for more information.
Note
AI Gateway does not cache DNS records. Every time it establishes a new connection, it performs a DNS resolution.
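To illustrate when resolution happens, the sketch below models a keep-alive pool that resolves the host only when it has to open a new connection and skips resolution when it reuses an idle one. The class `KeepAlivePool`, its methods, and the stub resolver and connector are all hypothetical. The model is also simplified: it connects to the first returned address only.

```python
from typing import Callable, Dict, List


class KeepAlivePool:
    """Toy pool: resolves on every new connection, never caches DNS results."""

    def __init__(
        self,
        resolve: Callable[[str], List[str]],
        connect: Callable[[str], object],
    ) -> None:
        self._resolve = resolve
        self._connect = connect
        self._idle: Dict[str, List[object]] = {}
        self.resolutions = 0  # counts DNS lookups, for demonstration

    def acquire(self, host: str) -> object:
        idle = self._idle.get(host)
        if idle:
            # An idle keep-alive connection exists: reuse it, no DNS lookup.
            return idle.pop()
        # No idle connection: resolve again, then open a new connection.
        self.resolutions += 1
        addresses = self._resolve(host)
        return self._connect(addresses[0])

    def release(self, host: str, connection: object) -> None:
        # Return the connection to the pool for later reuse.
        self._idle.setdefault(host, []).append(connection)


pool = KeepAlivePool(
    resolve=lambda host: ["10.0.0.1", "10.0.0.2"],  # stub resolver
    connect=lambda ip: f"connection to {ip}",        # stub connector
)
first = pool.acquire("processor.example")   # new connection, 1st lookup
second = pool.acquire("processor.example")  # first is busy, 2nd lookup
pool.release("processor.example", first)
third = pool.acquire("processor.example")   # reuses first, no lookup
print(pool.resolutions)  # 2
```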
Load balancing
AI Gateway does not load balance among the resolved IP addresses.
To support load balancing:
- Processors: For processors running in the same Kubernetes cluster, AI Gateway connects to the cluster (virtual) IP of the processor’s Kubernetes service, and Kubernetes load balances connections among the processor pods.
- Services:
  - Consider placing a proxy such as NGINX between AI Gateway and the service to load balance connections or HTTP requests.
  - For primitive DNS-based load balancing, use a DNS server that shuffles the returned records. Because AI Gateway tries the first returned address first, the address it connects to can then differ each time it establishes a new connection.
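The effect of record order on DNS-based load balancing can be shown with a small simulation. This is only a sketch: the addresses and resolver functions are hypothetical, and the second resolver rotates the records round-robin rather than shuffling them randomly so that the result is deterministic.

```python
from collections import Counter
from itertools import count
from typing import Callable, List

ADDRESSES = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backend IPs


def static_resolver(_host: str) -> List[str]:
    """Always returns the records in the same order."""
    return list(ADDRESSES)


def make_rotating_resolver() -> Callable[[str], List[str]]:
    """Returns a resolver that rotates the record order on every lookup."""
    calls = count()

    def resolver(_host: str) -> List[str]:
        offset = next(calls) % len(ADDRESSES)
        return ADDRESSES[offset:] + ADDRESSES[:offset]

    return resolver


def first_choices(resolver: Callable[[str], List[str]], new_connections: int) -> Counter:
    """Count which address is tried first across several new connections."""
    return Counter(resolver("service.example")[0] for _ in range(new_connections))


print(first_choices(static_resolver, 6))
# Counter({'10.0.0.1': 6})
print(first_choices(make_rotating_resolver(), 6))
# Counter({'10.0.0.1': 2, '10.0.0.2': 2, '10.0.0.3': 2})
```

With a fixed record order, every new connection goes to the same address as long as it is reachable. Varying the order spreads new connections across addresses, although requests sent over reused keep-alive connections are not redistributed.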