Terminology

This section provides definitions for terms used throughout the F5 AI Gateway documentation. The definitions provided are specific to the context of F5 AI Gateway and may not align with general usage.

Understanding these terms will help you navigate the documentation and understand how the components of F5 AI Gateway work together.

AI (Artificial Intelligence)

A general term encompassing the field of computer science research focused on machine learning, large language models (LLMs), and related technologies. Often used when precise terminology is not required.

Backend

Refers to a specific implementation of an LLM that is exposed over a network API. Examples of backends include GPT-4, exposed via the OpenAI API, and Claude 2, exposed via the Anthropic API. A test harness that returns fake outputs can also be considered a backend; in essence, a backend is anything that an AI gateway proxies requests to.
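A test-harness backend like the one mentioned above can be sketched in a few lines. This is a hypothetical example: the request and response shapes loosely mimic an OpenAI-style chat completion payload, and `fake_backend` is an illustrative name, not part of F5 AI Gateway.

```python
# Minimal fake backend: a test harness that mimics an OpenAI-style
# chat completion shape without calling any real model.
def fake_backend(request: dict) -> dict:
    """Return a canned completion for any chat request."""
    last_message = request["messages"][-1]["content"]
    return {
        "choices": [
            {"message": {"role": "assistant",
                         "content": f"Echo: {last_message}"}}
        ]
    }

response = fake_backend({"messages": [{"role": "user", "content": "Hello"}]})
print(response["choices"][0]["message"]["content"])  # → Echo: Hello
```

Because it answers the same API shape as a real model service, a gateway can proxy to it exactly as it would to a production backend.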

Chain

A chain is a sequence of processing commands, or steps, in a gateway. A given input (or request) passes through each step in order before being forwarded to a backend system. The chain then continues by taking the output from the backend and processing it through a further series of steps until the result is returned to the original requester.
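The request/backend/response flow described above can be sketched as a simple pipeline. The function and parameter names here are illustrative, not the gateway's actual API:

```python
from typing import Callable

# A step transforms text and passes it along the chain.
Step = Callable[[str], str]

def run_chain(request_steps: list[Step],
              backend: Callable[[str], str],
              response_steps: list[Step],
              prompt: str) -> str:
    """Run the prompt through each request step, the backend,
    then each response step, in order."""
    for step in request_steps:
        prompt = step(prompt)
    output = backend(prompt)
    for step in response_steps:
        output = step(output)
    return output

result = run_chain(
    request_steps=[str.strip],
    backend=lambda p: p.upper(),        # stand-in for a real backend
    response_steps=[lambda r: r + "!"],
    prompt="  hello ",
)
# result == "HELLO!"
```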

Client

In the context of F5 AI Gateway, a client is a requester. This may be a human being, a software system, or some combination of the two. Typically, a client initiates a request by submitting a prompt and metadata to an LLM, then waits for the LLM's response. Clients are generally unaware of any intermediate proxying layers such as F5 AI Gateway.

Context

For LLMs, “context” refers to the surrounding information or background related to a given input, output, or interaction. Context is the combination of an arbitrary number of correlated elements, such as prompts (including the system prompt), responses, and any metadata pulled into an interaction. In other words, context is the association between a given prompt and its history of previous prompts and responses, but not necessarily the totality of all interactions.

Gateway

Gateways broker the inbound and outbound data streams between two discrete systems. A gateway may refuse requests, preventing them from reaching the other system. It may also modify requests to and responses from outbound systems, changing the contents of the data shared between systems. The metaphor of a gateway serves well because it is a portal that divides two separate spaces.

LLM (Large Language Model)

Large language models are engines for natural language generation, language processing, classification, and other general-purpose tasks. They are generally interacted with using natural human language. Notable examples of LLMs include ChatGPT, LLaMA, and Mistral.

Policy

A policy is a configuration object that defines the conditional logic that selects which processing profile is applied to a transaction, based on runtime conditions or state.
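As a rough illustration, a policy's conditional logic might look like the following sketch. The transaction attributes and profile names are purely hypothetical, not F5 AI Gateway's actual policy schema:

```python
# Hypothetical policy: select a processing profile based on runtime
# attributes of the incoming transaction.
def select_profile(transaction: dict) -> str:
    if transaction.get("user_group") == "internal":
        return "relaxed-profile"
    if transaction.get("path", "").startswith("/api/"):
        return "strict-api-profile"
    return "default-profile"

print(select_profile({"user_group": "internal"}))  # → relaxed-profile
print(select_profile({"path": "/api/v1/chat"}))    # → strict-api-profile
print(select_profile({}))                          # → default-profile
```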

Processor

Processors are components that a gateway interacts with to change the flow of data between an inbound request and an outbound response. Processors are the steps (or commands) in a chain. Each processor evaluates requests and responses and returns a status to the gateway indicating whether the prompt should proceed. In addition to gating flow, processors may also modify the request or response data so that the next processor in the chain sees a different state; for example, an implementation could change the word “cat” to “dog” in every request. The categories of processors are listed below.
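The “cat” to “dog” example above can be sketched as follows. The `(proceed, text)` tuple is an illustrative contract for this sketch, not the actual processor interface:

```python
# Sketch of the processor behavior described above: evaluate the text,
# return a proceed/reject status plus the (possibly modified) text.
def cat_to_dog_processor(text: str) -> tuple[bool, str]:
    """Allow the request to proceed, rewriting 'cat' to 'dog'."""
    return True, text.replace("cat", "dog")

ok, modified = cat_to_dog_processor("my cat is asleep")
# ok is True, modified == "my dog is asleep"
```

A rejecting processor would instead return `False`, signaling the gateway to stop the chain rather than forward the prompt.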

System Processor

The most common and generic type of processor. System processors handle processing steps that are not directly concerned with scrubbing, filtering, redacting, or scanning prompts and their responses. Examples of system processors:

- Logging processor
- Backend router processor
- Token accounting processor
- Caching processor

Detector Processor

A detector processor specializes in detecting some property of the text provided in a prompt or response. For example, a detector may determine whether a given prompt contains protected intellectual property or PII (personally identifiable information).
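A minimal detector along these lines might flag text containing an email address, a common form of PII. The pattern and function name are illustrative only; real PII detection is far broader than one regular expression:

```python
import re

# Illustrative detector: flag text containing an email-address pattern.
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")

def contains_email(text: str) -> bool:
    """Report whether the text matches the email pattern."""
    return bool(EMAIL_PATTERN.search(text))

print(contains_email("contact me at a.b@example.com"))  # → True
print(contains_email("no addresses here"))              # → False
```

Note that a detector only reports what it found; acting on the finding (blocking or rewriting) is left to the gateway or to an editor processor.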

Editor Processor

An editor processor specializes in modifying prompts or responses. For example, a redaction processor could find personally identifiable numeric sequences, such as Social Security numbers, and transform them into an anonymized representation like XXX-XX-XXXX.
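The redaction example above can be sketched with a regular expression. The function name and pattern are illustrative, not part of F5 AI Gateway:

```python
import re

# Illustrative editor: replace SSN-like sequences (NNN-NN-NNNN) with
# an anonymized placeholder, as described above.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact_ssn(text: str) -> str:
    """Return the text with SSN-like sequences anonymized."""
    return SSN_PATTERN.sub("XXX-XX-XXXX", text)

print(redact_ssn("SSN: 123-45-6789"))  # → SSN: XXX-XX-XXXX
```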

Profile

A profile is a configuration object that defines the chain of prompt-input processors, the chain of response processors, and additional operating metadata such as limits. Profiles are selected by a policy definition.
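A profile's shape could be modeled roughly like the following hypothetical dataclass. The field names are assumptions for illustration, not the real configuration schema:

```python
from dataclasses import dataclass, field

# Hypothetical profile shape: two processor chains plus limits metadata.
@dataclass
class Profile:
    name: str
    request_chain: list = field(default_factory=list)   # prompt-input processors
    response_chain: list = field(default_factory=list)  # response processors
    limits: dict = field(default_factory=dict)          # operating limits

profile = Profile("default", limits={"max_tokens": 1024})
# profile.name == "default", profile.limits["max_tokens"] == 1024
```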

Prompt

Within the context of LLMs, a prompt is a textual input or query to an LLM that triggers a response or output. Prompts are designed to guide the model in providing relevant information, answering questions, or creating specific content based on the user's needs.

For example, you can ask a LLM the following question (prompt):

Are house cats and leopards part of the same species?

And receive a response like the following:

No, house cats (Felis catus) and leopards (Panthera pardus) belong to different species. House cats are domesticated felines that have been selectively bred for thousands of years, while leopards are wild big cats found in various habitats across Africa, Asia, and parts of the Middle East.

Response

A response is the output that corresponds to a given prompt.

Session

In contrast to context, a session refers to the entirety of an interaction with a backend. By definition, it includes all prompts, responses, metadata, and the temporal ordering of those elements. For example, a session may include 10 transactions stored in the order in which they were sent to the backend.

System prompt

A predefined prompt used by an AI system to initiate a conversation or guide the user toward providing more information. System prompts are usually designed by developers to help users interact with an LLM efficiently and effectively, allowing for a better user experience. Additionally, system prompts can be used to prevent abuse, limit the categories of responses, or influence the style and structure of the generated language.

Transaction

A transaction is a single request and response pair that is sent to a backend system. Transactions are the basic unit of work for a gateway.
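The session and transaction definitions can be illustrated together with a hypothetical `Transaction` type; a session is then simply the ordered list of transactions. These names are illustrative, not F5 AI Gateway types:

```python
from dataclasses import dataclass

# Illustrative types: a transaction is one prompt/response pair;
# a session is the ordered list of transactions with a backend.
@dataclass
class Transaction:
    prompt: str
    response: str

session: list[Transaction] = [
    Transaction("Hi", "Hello!"),
    Transaction("Bye", "Goodbye!"),
]
# The list preserves the order in which transactions were sent.
```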