F5 AI Gateway API Reference
Overview
F5 AI Gateway provides a Chat Completion API inspired by OpenAI’s Chat API.
A few notable OpenAI features which are not yet supported include the following:
streaming
tool calling
multi-modal input/output (only text is supported)
Configuration
To serve the V1 Chat Completions API, set the schema field on a route to the value v1/chat_completions.
routes:
  - path: /api/v1/chat
    schema: v1/chat_completions
    policy: example
To serve the V1 Models API, set the schema field on a route to the value v1/models.
routes:
  - path: /api/v1/models
    schema: v1/models
    policy: example
See the Configure Routes topic for additional details.
To obtain a list of models from the v1/models API, the profiles section must be configured with models.
A simple example is provided below.
See the Configure profiles and Selecting a service based on model topics for complete details.
profiles:
  - name: api-models
    models:
      - name: best
    inputStages:
      - name: protect
        steps:
          - name: prompt-injection
    services:
      - name: ollama/qwen
        selector:
          type: input.model
          values:
            - best
POST v1/chat_completions
Generate a new Chat Completion.
curl -X POST 127.0.0.1:4141/api/v1/chat -H "Content-Type: application/json" -d '
{
  "messages": [
    {
      "role": "user",
      "content": "In the game of Uno is it a legal move to place a Draw 4 card on top of another Draw 4 card making the next player draw 8 cards?"
    }
  ]
}'
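The same request can be prepared from Python. The following is a minimal sketch using only the standard library. It assumes the gateway address and route path shown in the curl example; the helper names build_chat_request and send_chat_request are hypothetical and not part of the product.

```python
import json
import urllib.request

# Assumption: address and route path copied from the curl example.
GATEWAY_URL = "http://127.0.0.1:4141/api/v1/chat"


def build_chat_request(prompt: str, url: str = GATEWAY_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a single user message."""
    payload = {"messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply.

    Requires a running gateway; urlopen raises urllib.error.HTTPError
    for 4xx and 5xx replies.
    """
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send_chat_request(build_chat_request("Hello")) against a running gateway would return the decoded response body as a dictionary.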
Chat Completion request parameters
Field | Type | Description | Required |
---|---|---|---|
model | string | ID of the model to use. | true* |
messages | []object | A list of message objects comprising the conversation so far. | true |
message.role | string | The role of the message's author. | true |
message.content | string | The contents of the message. | true |
max_completion_tokens | number | An upper bound for the number of tokens that can be generated for a completion. | false |
temperature | number | The sampling temperature. | false |
top_p | number | An alternative to sampling with temperature, called nucleus sampling. | false |
stop | []string | Up to 4 sequences instructing the model to stop generating further tokens. | false |
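The parameters above can be assembled and checked on the client before a request is sent. The sketch below is illustrative only: the helper name build_chat_payload is hypothetical, and the checks mirror what the table states (at least one message, each with a role and content, and at most 4 stop sequences). Because model is marked true* and the curl example omits it, the sketch treats model as optional; that is an assumption.

```python
from typing import Optional


def build_chat_payload(
    messages: list[dict],
    model: Optional[str] = None,
    max_completion_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    stop: Optional[list[str]] = None,
) -> dict:
    """Assemble a request body, leaving out optional fields that are unset."""
    if not messages:
        raise ValueError("request must include at least 1 message")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
    if stop is not None and len(stop) > 4:
        raise ValueError("stop accepts up to 4 sequences")

    payload: dict = {"messages": messages}
    optional_fields = {
        "model": model,
        "max_completion_tokens": max_completion_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "stop": stop,
    }
    # Only include the optional fields the caller actually set.
    payload.update({k: v for k, v in optional_fields.items() if v is not None})
    return payload
```

Omitting unset fields keeps the request body minimal and lets the gateway or upstream model apply its own defaults.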
200 OK
A Chat Completion object.
Chat Completion object parameters
Field | Type | Description |
---|---|---|
id | string | A unique identifier for the chat completion. |
object | string | The type of object, almost always “chat_completion”. |
model | string | The model used to generate this chat completion. |
choices | []object | A list of chat completion choice objects. |
choice.index | number | The index of the choice in the list of choices. |
choice.finish_reason | string | The reason the model stopped generating tokens. |
choice.message | object | The chat completion message object generated by the model. |
choice.message.role | string | The role of the author of the choice message. |
choice.message.content | string | The contents of the choice message. |
usage | object | Basic usage statistics for the completion request. |
usage.prompt_tokens | number | Number of tokens in the prompt. |
usage.completion_tokens | number | Number of tokens in the generated completion. |
usage.total_tokens | number | Total number of tokens used in the request (prompt + completion). |
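To illustrate how a client might read these fields, the sketch below extracts the reply text and checks the usage totals. The sample body is hypothetical: its field names follow the table above, but every value (including the "assistant" role and "stop" finish reason, which are borrowed from OpenAI's conventions) is an invented placeholder, and the helper names are not part of the product.

```python
import json

# Hypothetical response body: field names follow the table, values are invented.
SAMPLE_COMPLETION = json.loads("""
{
  "id": "example-id",
  "object": "chat_completion",
  "model": "best",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {"role": "assistant", "content": "Example reply."}
    }
  ],
  "usage": {"prompt_tokens": 30, "completion_tokens": 5, "total_tokens": 35}
}
""")


def first_reply(completion: dict) -> str:
    """Return the message content of the lowest-index choice."""
    choices = completion.get("choices", [])
    if not choices:
        raise ValueError("completion contains no choices")
    return min(choices, key=lambda choice: choice["index"])["message"]["content"]


def usage_is_consistent(completion: dict) -> bool:
    """Check that total_tokens equals prompt_tokens + completion_tokens."""
    usage = completion["usage"]
    return usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```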
400 Bad request
A Chat Completion error indicating that the request was malformed or invalid.
Malformed request:
{
  "error": {
    "type": "decoding_error",
    "message": "request body must be valid JSON"
  }
}
Invalid request:
{
  "error": {
    "type": "validation_error",
    "message": "request must include at least 1 message"
  }
}
Bad Request parameters
Field | Type | Description | Required |
---|---|---|---|
type | string | The category of error which was raised. | false |
message | string | Details about the error. | false |
code | string | Application-specific error code for this error. | false |
param | object | Metadata that may be useful for debugging. | false |
422 Unprocessable content
A Chat Completion error indicating that a request, while valid, has been refused.
{
  "statusCode": 422,
  "type": "message_not_allowed",
  "message": "rejection_reason: Possible Prompt Injection detected"
}
Unprocessable Content parameters
Field | Type | Description |
---|---|---|
statusCode | number | The HTTP status code associated with this error. |
type | string | The category of refusal. |
message | string | Details about why the message was refused. |
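In the examples shown, the 400 body nests its fields under an "error" key while the 422 body is flat. A client that wants a single error message can normalize both shapes. The sketch below is one possible approach based only on those examples; the function name describe_error is hypothetical.

```python
def describe_error(status: int, body: dict) -> str:
    """Summarize an error reply as '<status> <type>: <message>'.

    Handles both shapes shown in the examples: fields nested under
    "error" (400) and fields at the top level (422).
    """
    nested = body.get("error")
    details = nested if isinstance(nested, dict) else body
    error_type = details.get("type", "unknown")
    message = details.get("message", "")
    return f"{status} {error_type}: {message}"


if __name__ == "__main__":
    bad_request = {
        "error": {
            "type": "decoding_error",
            "message": "request body must be valid JSON",
        }
    }
    print(describe_error(400, bad_request))
```

Since type and message are both optional in the Bad Request table, the sketch falls back to placeholder values rather than raising when either is missing.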
GET v1/models
Retrieve the model configuration for a profile.
curl -X GET -H "Accept: application/json" http://127.0.0.1:4141/api/v1/models
200 OK
{
  "object": "models",
  "data": [
    {
      "id": "best",
      "object": "model"
    }
  ]
}
Models object parameters
Field | Type | Description |
---|---|---|
object | string | The object type, always “models”. |
data | []object | List of model objects. |
data.id | string | The model identifier, which can be referenced in requests. |
data.object | string | The object type, always “model”. |
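As a small illustration of consuming this response, the sketch below collects the model identifiers that can then be passed as the model field of a chat completion request. The function name model_ids is hypothetical; the sample body is the 200 OK example from this section.

```python
def model_ids(models_response: dict) -> list[str]:
    """Return the id of every entry in "data" whose object type is "model"."""
    return [
        entry["id"]
        for entry in models_response.get("data", [])
        if entry.get("object") == "model"
    ]


if __name__ == "__main__":
    sample = {"object": "models", "data": [{"id": "best", "object": "model"}]}
    print(model_ids(sample))
```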