# Language identification

## Before you begin

Follow the steps in the Install with Helm topic to run F5 AI Gateway.

## Overview
The F5 language identification processor runs in the AI Gateway processors container and uses a pre-trained classification model to predict, with a confidence score, the language(s) of a given prompt or response. The output is a two-letter language code that follows the ISO 639-1 standard. The processor can also detect programming code in text using known code patterns and indicators.
| Processor details | Supported |
|---|---|
| Modify | No |
| Reject | Yes |
| Base Memory Requirement | 1.12 GB |
| Input stage | Yes |
| Response stage | Yes |
| Chunk position | Beginning |
| Supported language(s) | See the Supported languages table below |
## Configuration

```yaml
processors:
  - name: language-id
    type: external
    config:
      endpoint: https://aigw-processors-f5.ai-gateway.svc.cluster.local
      namespace: f5
      version: 1
    params:
      code_detect: false
      threshold: 0.5
      reject: false
      allowed_languages: []
```
| Parameter | Description | Type | Required | Default | Example |
|---|---|---|---|---|---|
| `code_detect` | Detect programming code in text using known code patterns and indicators. | bool | No | `false` | `true` |
| `threshold` | Confidence threshold for language detection; predictions with confidence below this value are not reported as a detected language. | float | No | `0.5` | `0.7` |
| `allowed_languages` | List of languages that are allowed for the processor to proceed with the request. All detected languages must be in this list. When not set, all languages are allowed. | list[str] | No | `[]` | `["en", "es"]` |
**Note:** The `reject` parameter must be set to `true` to use the `allowed_languages` parameter.
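For example, to reject any prompt or response that is not detected as English or Spanish, `reject` must be enabled alongside `allowed_languages`. The language list below is illustrative:

```yaml
processors:
  - name: language-id
    type: external
    config:
      endpoint: https://aigw-processors-f5.ai-gateway.svc.cluster.local
      namespace: f5
      version: 1
    params:
      reject: true
      allowed_languages: ["en", "es"]
```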
## Supported languages

The language identification processor supports the languages listed in the table below. English (`en`) encompasses both British and American English; Chinese (`zh`) is Simplified Chinese.
| Language | Code | Language | Code |
|---|---|---|---|
| Arabic | ar | Japanese | ja |
| Bulgarian | bg | Polish | pl |
| Chinese | zh | Portuguese | pt |
| Dutch | nl | Russian | ru |
| English | en | Spanish | es |
| French | fr | Swahili | sw |
| German | de | Thai | th |
| Greek | el | Turkish | tr |
| Hindi | hi | Urdu | ur |
| Italian | it | Vietnamese | vi |
## Accuracy

The number of tokens in the input has a direct impact on prediction accuracy. The model depends on contextual clues from neighboring words and sentence structure to capture the semantic relationships that aid classification.
**Note:** Tokens are the smallest units of text a machine learning model processes. While they often match entire words, models may split words into multiple tokens (e.g., subwords or characters) for better handling of rare or complex terms.
| Token count | Approximate accuracy |
|---|---|
| 1-5 | 76% |
| 6-10 | 96% |
| 11+ | 99% |
## Code detection

Code detection in the language identification processor is deterministic: it uses a set of regular expression patterns and keywords to detect code in text. This set does not exhaustively cover every programming language, so the processor might not detect all code all of the time.
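The processor's actual pattern set is internal, but the general technique can be sketched as follows. The regular expressions and keywords below are illustrative stand-ins, not the processor's real rules:

```python
import re

# Illustrative indicators only; the processor's real pattern set is internal
# and more extensive than this sketch.
CODE_PATTERNS = [
    re.compile(r"\bdef\s+\w+\s*\("),         # Python function definition
    re.compile(r"\b(?:import|from)\s+\w+"),  # Python import statement
    re.compile(r"[{};]\s*$", re.MULTILINE),  # braces/semicolons at line ends
    re.compile(r"</?\w+>"),                  # HTML/XML tags
]
CODE_KEYWORDS = {"function", "return", "class", "public", "void", "printf"}

def looks_like_code(text: str) -> bool:
    """Deterministic check: any pattern match or keyword hit counts as code."""
    if any(p.search(text) for p in CODE_PATTERNS):
        return True
    return any(word in CODE_KEYWORDS for word in text.split())
```

Because the check is rule-based rather than model-based, the same input always produces the same result, but inputs outside the pattern set are missed.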
## Chunking input and batch processing

The language identification processor splits inputs and responses into chunks and performs inference on those chunks in batches.
**Note:** Always perform empirical tests on hardware with real or representative data. Profiling is the best way to see how changing chunk and/or batch sizes impacts performance.
### Chunking input
Chunk size indicates how much data from a single input is fed to the model at once. It’s driven by the underlying model constraint on maximum sequence length and task needs for context. It directly impacts memory usage per inference call and can affect latency if chunks are too large.
The maximum sequence length for the language identification processor is 512 tokens, so the chunk size must not exceed 512. The lowest possible value is 1, but chunks that small leave the underlying model unable to reliably classify the input. The default chunk size is 128 tokens; it should not be set lower than 32.
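As a sketch of the mechanism (not the processor's internal implementation), splitting a token sequence into fixed-size chunks might look like this, with token IDs standing in for the real tokenizer's output:

```python
def chunk_tokens(tokens: list, chunk_size: int = 128) -> list:
    """Split a token sequence into chunks of at most chunk_size tokens.

    The model's maximum sequence length is 512, so chunk_size must stay
    within 1..512; values below 32 give the model too little context.
    """
    if not 1 <= chunk_size <= 512:
        raise ValueError("chunk_size must be between 1 and 512")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]
```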
You can override the default chunk size by setting the environment variable `LANGUAGE_ID_PROCESSOR_CHUNK_SIZE: 256` in `processors.f5.env`.
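In the Helm values this might look like the following; the exact values-file layout depends on the chart version, so treat this as a sketch:

```yaml
processors:
  f5:
    env:
      LANGUAGE_ID_PROCESSOR_CHUNK_SIZE: 256
```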
### Batch processing

Batch size determines how many separate inputs (or chunks) are processed simultaneously. Larger batch sizes can improve performance by taking advantage of parallel processing, but can also saturate the GPU. The default batch size is 16. There is no upper limit, but the value must be greater than or equal to 1. You can override the default by setting the environment variable `LANGUAGE_ID_PROCESSOR_BATCH_SIZE: 32` in `processors.f5.env`.
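As an illustration of the mechanism (again, not the processor's internal implementation), grouping chunks into batches for inference can be sketched as:

```python
def batches(chunks: list, batch_size: int = 16):
    """Yield successive groups of chunks, one group per inference call.

    batch_size has no upper limit but must be >= 1; larger values trade
    GPU memory for more parallelism.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for i in range(0, len(chunks), batch_size):
        yield chunks[i:i + batch_size]
```

With the default batch size of 16, a 40-chunk input would be processed in three inference calls of 16, 16, and 8 chunks.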