F5 AI Gateway API Reference

Overview

F5 AI Gateway provides a Chat Completion API inspired by OpenAI’s Chat API.

A few notable OpenAI features that are not yet supported include the following:

  • streaming

  • tool calling

  • multi-modal input/output (text only is supported)

Configuration

To serve the V1 Chat Completions API, set the schema field on a route to the value v1/chat_completions.

routes:
  - path: /api/v1/chat
    schema: v1/chat_completions
    policy: example

To serve the V1 Models API, set the schema field on a route to the value v1/models.

routes:
  - path: /api/v1/models
    schema: v1/models
    policy: example

See the Configure Routes topic for additional details.

To obtain a list of models from the v1/models API, the profiles section must be configured with models.

A simple example is provided below.

See the Configure profiles and Selecting a service based on model topics for complete details.

profiles:
  - name: api-models
    models:
      - name: best
    inputStages:
      - name: protect
        steps:
          - name: prompt-injection
    services:
      - name: ollama/qwen
        selector:
          type: input.model
          values:
            - best
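
With a profile such as the one above, a client selects a service by naming one of the configured models in the request body. The sketch below is illustrative only: it reuses the /api/v1/chat route and listener address from the earlier examples, which may differ in your deployment.

curl -X POST 127.0.0.1:4141/api/v1/chat -H "Content-Type: application/json" -d '
{
  "model": "best",
  "messages": [
      {
        "role": "user",
        "content": "Summarize the rules of Uno in two sentences."
      }
  ]
}'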

POST v1/chat_completions

Generate a new Chat Completion.

curl -X POST 127.0.0.1:4141/api/v1/chat -H "Content-Type: application/json" -d '
{
  "messages": [
      {
        "role": "user",
        "content":"In the game of Uno is it a legal move to place a Draw 4 card on top of another Draw 4 card making the next player draw 8 cards?"
      }
  ]
}'

Chat Completion request parameters

model (string, required*)
    ID of the model to use.
    *It is considered an error to set this field when the selected profile has not been configured with models.

messages ([]object, required)
    A list of message objects comprising the conversation so far.

message.role (string, required)
    The role of the message's author. The applicable set of roles is determined by the selected model.

message.content (string, required)
    The contents of the message.

max_completion_tokens (number, optional)
    An upper bound for the number of tokens that can be generated for a completion.

temperature (number, optional)
    The sampling temperature. A higher value makes the output more random; a lower value makes it more deterministic. The applicable range of values is determined by the model.

top_p (number, optional)
    An alternative to sampling with temperature, called nucleus sampling. It is generally recommended to alter this or temperature, but not both.

stop ([]string, optional)
    Up to 4 sequences instructing the model to stop generating further tokens.
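
The optional parameters can be combined in a single request. The following sketch is illustrative only: the model name matches the profile example above, and the numeric values and stop sequence are arbitrary choices, not recommended settings.

curl -X POST 127.0.0.1:4141/api/v1/chat -H "Content-Type: application/json" -d '
{
  "model": "best",
  "max_completion_tokens": 200,
  "temperature": 0.2,
  "stop": ["END"],
  "messages": [
      {
        "role": "user",
        "content": "Briefly, is stacking Draw 4 cards a legal move in Uno?"
      }
  ]
}'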

200 OK

A Chat Completion object.
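
The response body has the following general shape. This is an illustrative sketch only; the identifier, model name, finish reason, message content, and token counts are made-up values.

{
  "id": "example-123",
  "object": "chat_completion",
  "model": "best",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "No. Under the official Uno rules a Draw 4 card may not be stacked on another Draw 4 card."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 24,
    "total_tokens": 66
  }
}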

Chat Completion object parameters

id (string)
    A unique identifier for the chat completion.

object (string)
    The object type, almost always “chat_completion”.

model (string)
    The model used to generate this chat completion.

choices ([]object)
    A list of chat completion choice objects.

choice.index (number)
    The index of the choice in the list of choices.

choice.finish_reason (string)
    The reason the model stopped generating tokens.

choice.message (object)
    The chat completion message object generated by the model.

choice.message.role (string)
    The role of the author of the choice message.

choice.message.content (string)
    The contents of the choice message.

usage (object)
    Basic usage statistics for the completion request.

usage.prompt_tokens (number)
    Number of tokens in the prompt.

usage.completion_tokens (number)
    Number of tokens in the generated completion.

usage.total_tokens (number)
    Total number of tokens used in the request (prompt + completion).

400 Bad Request

A Chat Completion error indicating that the request was malformed or invalid.

malformed

{
  "error": {
    "type": "decoding_error",
    "message": "request body must be valid JSON"
  }
}

invalid

{
  "error": {
    "type": "validation_error",
    "message": "request must include at least 1 message"
  }
}
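
For example, a request whose messages array is empty would be expected to produce the validation error shown above:

curl -X POST 127.0.0.1:4141/api/v1/chat -H "Content-Type: application/json" -d '
{
  "messages": []
}'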

Bad Request parameters

type (string, optional)
    The category of error which was raised.

message (string, optional)
    Details about the error.

code (string, optional)
    Application-specific error code for this error.

param (object, optional)
    Metadata that may be useful for debugging.

422 Unprocessable Content

A Chat Completion error indicating that a request, while valid, has been refused.

{
  "statusCode": 422,
  "type": "message_not_allowed",
  "message": "rejection_reason: Possible Prompt Injection detected"
}

Unprocessable Content parameters

statusCode (number)
    The HTTP status code associated with this error.

type (string)
    The category of refusal.

message (string)
    Details about why the message was refused.

GET v1/models

List the models configured for the selected profile.

curl -X GET -H "Accept: application/json" http://127.0.0.1:4141/api/v1/models

200 OK

{
  "object": "models",
  "data": [
    {
      "id": "best",
      "object": "model"
    }
  ]
}

Models object parameters

object (string)
    The object type, always “models”.

data ([]object)
    List of model objects.

data.id (string)
    The model identifier, which can be referenced in requests.

data.object (string)
    The object type, which should always be “model”.