# Language identification

## Before you begin

Follow the steps in the Install with Helm topic to run F5 AI Gateway.

## Overview

The F5 language identification processor runs in the AI Gateway processors container and predicts the language(s) of a given prompt or response, along with a confidence score, using a pre-trained classification model. The output is a two-letter language code that follows the ISO 639-1 standard. The processor can also detect programming code in text using known code patterns and indicators.

## Processor details

|                               | Supported |
|-------------------------------|-----------|
| Deterministic                 | No        |
| GPU acceleration support      | Yes       |
| Base memory requirement       | 1.12 GB   |
| Input stage                   | Yes       |
| Response stage                | Yes       |
| Recommended position in stage | Beginning |
| Supported language(s)         | See supported languages |

## Configuration

```yaml
processors:
  - name: language-id
    type: external
    config:
      endpoint: https://aigw-processors-f5.ai-gateway.svc.cluster.local
      namespace: f5
      version: 1
    params:
      code_detect: false
      threshold: 0.5
      reject: false
      allowed_languages: []
```

## Parameters

### Common parameters

| Parameter | Description | Type | Required | Default | Examples |
|-----------|-------------|------|----------|---------|----------|
| `code_detect` | Detect programming code in text using known code patterns and indicators. Adds the `code` tag when code is detected. | bool | No | `false` | `true`, `false` |
| `threshold` | Confidence threshold for language detection; any prediction below this value is returned as `unknown` with a confidence of 0.0. Set to 0.0 to disable the threshold. Note: `unknown` predictions are still possible with the threshold disabled. | float (0.0 to 1.0) | No | `0.5` | `0.42` |
| `allowed_languages` | Languages for which the processor allows the request to proceed. All detected languages must be in this list. When not set, all languages are allowed. | list[str] | No | `[]` | `["en", "fr"]` |

> **Note:** The `reject` parameter must be set to `true` to use the `allowed_languages` parameter.
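For example, to allow only English and French traffic and reject everything else, you might combine `reject` with `allowed_languages` as follows (a minimal sketch based on the configuration shown above; adjust the endpoint for your installation):

```yaml
processors:
  - name: language-id
    type: external
    config:
      endpoint: https://aigw-processors-f5.ai-gateway.svc.cluster.local
      namespace: f5
      version: 1
    params:
      reject: true                     # required for allowed_languages to take effect
      allowed_languages: ["en", "fr"]  # all detected languages must appear in this list
```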

## Tags

The detected languages are added to the processor response tags in ISO 639-1 format. If the processor detects programming code, it adds `code` to the tags. If the processor is unable to determine the language, or the confidence value is below the `threshold` value, it adds `unknown` to the tags.

| Tag key | Description | Example values |
|---------|-------------|----------------|
| `language` | Any languages detected by the processor | `["en", "code"]` |

## Supported languages

The language identification processor supports the languages listed in the table below. English (`en`) encompasses both British and American English; Chinese (`zh`) is Simplified Chinese.

| Language  | Code | Language   | Code |
|-----------|------|------------|------|
| Arabic    | ar   | Japanese   | ja   |
| Bulgarian | bg   | Polish     | pl   |
| Chinese   | zh   | Portuguese | pt   |
| Dutch     | nl   | Russian    | ru   |
| English   | en   | Spanish    | es   |
| French    | fr   | Swahili    | sw   |
| German    | de   | Thai       | th   |
| Greek     | el   | Turkish    | tr   |
| Hindi     | hi   | Urdu       | ur   |
| Italian   | it   | Vietnamese | vi   |

## Accuracy

The number of tokens in the input directly affects prediction accuracy. The model relies on contextual clues from neighboring words and sentence structure to capture the semantic relationships that aid classification.

> **Note:** Tokens are the smallest units of text a machine learning model processes. While they often match entire words, models may split words into multiple tokens (e.g., subwords or characters) to better handle rare or complex terms.

| Token count | Approximate accuracy |
|-------------|----------------------|
| 1-5         | 76%                  |
| 6-10        | 96%                  |
| 11+         | 99%                  |

## Code detection

Code detection is deterministic in the language identification processor. It uses a set of regular expression patterns and keywords to detect code in text, but these patterns do not exhaustively cover every programming language, so the processor may not detect all code all of the time.
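Code detection is off by default. To enable it, set `code_detect` in the processor's `params` (a minimal sketch reusing the configuration from above):

```yaml
processors:
  - name: language-id
    type: external
    config:
      endpoint: https://aigw-processors-f5.ai-gateway.svc.cluster.local
      namespace: f5
      version: 1
    params:
      code_detect: true  # add the "code" tag when code patterns are detected
```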

## Chunking input and batch processing

The language identification processor splits inputs and responses into chunks and performs inference on those chunks in batches.

> **Note:** Always perform empirical tests on your hardware with real or representative data. Profiling is the best way to see how changing chunk and/or batch sizes affects performance.

### Chunking input

Chunk size determines how much data from a single input is fed to the model at once. It is bounded by the underlying model's maximum sequence length and shaped by the task's need for context. It directly affects memory usage per inference call and can increase latency if chunks are too large.

The maximum sequence length for the language identification processor is 512 tokens, so the chunk size must not exceed 512. The lowest possible value is 1, but at that size the underlying model cannot reliably classify the input. The default chunk size is 128 tokens, and it should not be set lower than 32. You can override the default by setting the `LANGUAGE_ID_PROCESSOR_CHUNK_SIZE` environment variable (for example, `LANGUAGE_ID_PROCESSOR_CHUNK_SIZE: 256`) in `processors.f5.env`, as in the sketch below.
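A minimal Helm values sketch, assuming the chart exposes `processors.f5.env` as a map of environment variable names to values (see the Install with Helm topic for the exact values layout):

```yaml
# values.yaml (sketch)
processors:
  f5:
    env:
      LANGUAGE_ID_PROCESSOR_CHUNK_SIZE: "256"  # tokens per chunk; 32-512, default 128
```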

### Batch processing

Batch size determines how many separate inputs (or chunks) are processed simultaneously. Larger batch sizes can improve performance by taking advantage of parallel processing, but can also saturate the GPU. The default batch size is 16. There is no upper limit, but the value must be greater than or equal to 1. You can override the default by setting the `LANGUAGE_ID_PROCESSOR_BATCH_SIZE` environment variable in `processors.f5.env`, as in the sketch below.
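The batch size is overridden the same way, under the same assumption about the `processors.f5.env` layout:

```yaml
# values.yaml (sketch)
processors:
  f5:
    env:
      LANGUAGE_ID_PROCESSOR_BATCH_SIZE: "32"  # inputs/chunks per inference batch; >= 1, default 16
```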