Topic Detect

Warning

You are using an EXPERIMENTAL processor! Experimental processors:

May have bugs or stability issues
May experience breaking API changes
May not produce the expected results

By using this experimental processor you acknowledge:

It should NOT be used in a production context
It is NOT covered under F5 support agreements
Some experiments are not successful - the functionality could be retired.

Before you begin

You must have a working deployment of F5 AI Gateway. Follow the steps in the Install with Helm topic to get started.

The Topic Detect processor is not part of the default installation; to enable it, add the following section to your config file:

processorLabs:
  topicDetect:
    enable: true

Overview

The Topic Detect processor runs as a standalone processor container in AI Gateway. It uses a zero-shot classification model to determine the topic of a conversation.

Processor details	Supported
Deterministic	No
GPU acceleration support	Yes
Base Memory Requirement	3.5 GB
Input stage	Yes
Response stage	Yes
Recommended position in stage	Beginning
Supported language(s)	English.

Required processor order

The Topic Detect processor supports English-language prompts only; topics in prompts written in any other language will not be detected. The F5 Language Identification processor must be configured to run in a stage before the Topic Detect processor, with reject: true and allowed_languages: ["en"]. This ensures that only prompts identified as English proceed through the processor pipeline to the Topic Detect processor before being sent to a configured Service.

Configuration

processors:
  - name: topic-detect
    type: external
    config:
      endpoint: https://aigw-processors-f5.ai-gateway.svc.cluster.local
      namespace: f5-processor-labs
      version: 1
    params:
      experimental: true
      reject: true
      threshold: 0.95
      allowed_topics: ["cats", "food"]
      rejected_topics: ["dogs", "fish"]

Parameters

Parameters	Description	Type	Required	Defaults	Examples
Common parameters
`experimental`	Flag to acknowledge that you are using an experimental processor. The processor will not run unless this is set to `true`.	boolean	Yes	`false`	`true`
`threshold`	Minimum confidence score required to consider a topic as detected. Higher values will make the processor more strict.	float `0.0` to `1.0`	No	`0.5`	`0.5`
`allowed_topics`	A list of allowed topic strings that the processor will use to assign the most relevant topic through zero-shot classification.	list[str]	No	`[]`	`['sports']`
`rejected_topics`	A list of disallowed topic strings that the processor will use to assign the most relevant topic through zero-shot classification.	list[str]	No	`[]`	`['music']`

You must provide at least one item either in the allowed_topics list or the rejected_topics list. The processor will not work without at least one allowed or rejected topic.

It is not possible to set the same value in both allowed_topics and rejected_topics.
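The two rules above can be expressed as a small pre-flight check to run against a configuration before deploying it. This is a minimal sketch, not the processor's own validation: the function name `validate_topic_params` and the error messages are hypothetical.

```python
def validate_topic_params(allowed_topics, rejected_topics):
    """Raise ValueError if the two topic lists break the documented rules.

    Hypothetical helper for illustration; the processor performs its own checks.
    """
    allowed = list(allowed_topics or [])
    rejected = list(rejected_topics or [])

    # Rule 1: at least one topic must be present across the two lists.
    if not allowed and not rejected:
        raise ValueError("provide at least one allowed or rejected topic")

    # Rule 2: the same topic cannot appear in both lists.
    in_both = sorted(set(allowed) & set(rejected))
    if in_both:
        raise ValueError(f"topics listed as both allowed and rejected: {in_both}")


# The lists from the Configuration example pass without raising.
validate_topic_params(["cats", "food"], ["dogs", "fish"])
```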

When reject is set to true, this processor rejects the request if either of the following is true:

A topic in the rejected_topics list is detected.
No topic in the allowed_topics list is detected.

The processor also adds the detected topics to the detected_topics tag.
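The rejection rule can be sketched as a pure function over per-topic confidence scores. This illustrates the documented behavior and is not the processor's implementation: the name `should_reject` and the `scores` dictionary are hypothetical. It also assumes that the allowed-topics condition applies only when allowed_topics is non-empty, and that a score equal to the threshold counts as detected.

```python
def should_reject(scores, threshold=0.5, allowed_topics=(), rejected_topics=(),
                  reject=True):
    """Return (rejected, detected) for a dict of topic -> confidence score."""
    # A topic counts as detected when its score meets the threshold (assumed >=).
    detected = [topic for topic, score in scores.items() if score >= threshold]

    # Condition 1: a detected topic is in the rejected list.
    hit_rejected = any(topic in rejected_topics for topic in detected)

    # Condition 2: no detected topic is in the allowed list.
    # Assumption: skipped when allowed_topics is empty.
    missed_allowed = bool(allowed_topics) and not any(
        topic in allowed_topics for topic in detected
    )

    return reject and (hit_rejected or missed_allowed), detected


rejected, detected = should_reject(
    {"cats": 0.97, "dogs": 0.10},
    threshold=0.95,
    allowed_topics=["cats", "food"],
    rejected_topics=["dogs", "fish"],
)
print(rejected, detected)
```

With the values from the Configuration example, a prompt that scores highly only for "cats" passes, while one that scores above the threshold for "dogs" is rejected.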

Note

If you configure allowed_topics, we recommend also adding "greeting" to the list, to cover cases where people open the conversation with "hello" or another greeting.

Tags

The detected topics are added to the processor response tags. If the processor is unable to determine the topic, or the confidence value is below the threshold value, it adds unknown to the tags.

Tag key	Description	Example values
`topics`	Any topics detected by the processor.	`{'topics': ['sports', 'animals']}`
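As a rough illustration of how the tag value relates to the threshold, the sketch below builds the `topics` tag from a dictionary of confidence scores. The function name `build_topic_tags` and the score dictionary are hypothetical, and representing the fallback as the single-item list `['unknown']` is an assumption about the tag's shape.

```python
def build_topic_tags(scores, threshold=0.5):
    """Return the tags dict for a mapping of topic -> confidence score."""
    # Keep only topics whose score meets the threshold (assumed >=).
    detected = [topic for topic, score in scores.items() if score >= threshold]
    # Fall back to "unknown" when nothing clears the threshold (assumed shape).
    return {"topics": detected or ["unknown"]}


print(build_topic_tags({"sports": 0.8, "animals": 0.7, "music": 0.2}))
print(build_topic_tags({"sports": 0.3}))
```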

Chunking input and batch processing

The Topic Detect processor splits inputs and responses into overlapping chunks and performs inference on these chunks in batches. Chunks overlap so that context is preserved across boundaries: if the text that signals a topic falls at the edge of one chunk, the region it shares with the next chunk will still capture it.

Note

Always perform empirical tests on hardware with real or representative data. Profiling is the best way to see how changing chunk and/or batch sizes impacts performance.

Chunking input

Chunk size controls how much of the input is processed at a time. It’s based on the model’s maximum input limit and how much context the task needs. Larger chunks use more memory and may slow things down, while smaller chunks can be faster but might miss important context.

The Topic Detect processor splits its input into chunks of a configurable number of tokens, between 32 and 512 (default: 512). To change the number of tokens, set TOPIC_DETECT_PROCESSOR_CHUNK_SIZE in the processors.f5.env section of the AI Gateway Helm chart.

The Topic Detect processor chunks its input with a sliding window: longer text is divided into overlapping chunks so that the model can capture context that spans chunk boundaries. During inference, each chunk is fed separately through the classification model (one forward pass per chunk), so memory usage can increase as more chunks are generated and processed.

Too much overlap means the same tokens are processed repeatedly, which might not improve prediction quality and can introduce redundant predictions. Less overlap reduces that redundancy, but with little or no overlap the model might miss contextual cues that lie near chunk boundaries, potentially reducing prediction consistency across segments.
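To make the sliding window concrete, the sketch below splits a list of tokens into overlapping chunks. It is a generic illustration of the technique, not the processor's code: the function name `chunk_tokens` is hypothetical, and the tokens here are plain integers rather than real tokenizer output.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=None):
    """Split tokens into chunks of at most chunk_size that share `overlap` tokens."""
    if overlap is None:
        overlap = chunk_size // 2  # default described in this section: half the chunk size
    if not 0 <= overlap <= chunk_size - 1:
        raise ValueError("overlap must be between 0 and chunk_size - 1")
    if not tokens:
        return []

    stride = chunk_size - overlap  # how far the window advances each step
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window reached the end of the input
        start += stride
    return chunks


for chunk in chunk_tokens(list(range(10)), chunk_size=4, overlap=2):
    print(chunk)
```

With an overlap of 2, each chunk repeats the last two tokens of the previous one, so ten tokens yield four chunks. With an overlap of 0, the same input yields three chunks and no token is processed twice.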

The default chunk overlap, in tokens, is half the chunk size. To disable overlapping, set the environment variable TOPIC_DETECT_PROCESSOR_CHUNK_OVERLAP: 0. The overlap must not be larger than chunk_size - 1.
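The limits on the two settings can be checked before deploying a change. The sketch below resolves both values from a dictionary of environment variables and applies the documented bounds. The function name `resolve_chunk_settings` is hypothetical, and rounding the default overlap down for odd chunk sizes is an assumption.

```python
def resolve_chunk_settings(env):
    """Return (chunk_size, overlap) from a dict of environment variables."""
    chunk_size = int(env.get("TOPIC_DETECT_PROCESSOR_CHUNK_SIZE", 512))
    if not 32 <= chunk_size <= 512:
        raise ValueError("chunk size must be between 32 and 512 tokens")

    raw_overlap = env.get("TOPIC_DETECT_PROCESSOR_CHUNK_OVERLAP")
    # Default overlap is half the chunk size (integer division is an assumption).
    overlap = chunk_size // 2 if raw_overlap is None else int(raw_overlap)
    if not 0 <= overlap <= chunk_size - 1:
        raise ValueError("overlap must be between 0 and chunk_size - 1")

    return chunk_size, overlap


print(resolve_chunk_settings({}))
print(resolve_chunk_settings({"TOPIC_DETECT_PROCESSOR_CHUNK_SIZE": "128",
                              "TOPIC_DETECT_PROCESSOR_CHUNK_OVERLAP": "0"}))
```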
