Data Security¶

Warning

You are using an EXPERIMENTAL processor! Experimental processors:

May have bugs or stability issues
May experience breaking API changes
May not produce the expected results

By using this experimental processor you acknowledge:

It should NOT be used in a production context
It is NOT covered under F5 support agreements
Some experiments are not successful - the functionality could be retired.

Before you begin¶

Follow the steps in the Install with Helm topic to run F5 AI Gateway.

Overview¶

The F5 Data Security processor runs as a standalone processor container in AI Gateway. This processor detects and optionally redacts or blocks arbitrary sensitive data.

Processor details	Supported
Deterministic	Yes
GPU acceleration support	No
Base Memory Requirement	100 MB
Input stage	Yes
Response stage	Yes
Recommended position in stage	Beginning
Supported language(s)	English

Configuration¶

processors:
  - name: data-security
    type: external
    config:
      endpoint: https://aigw-processor-labs-data-security.ai-gateway.svc.cluster.local
      namespace: f5-processor-labs
      version: 1
    params:
      experimental: true
      modify: true
      matchers:
        - ssn
        - us_address
        - regex:
            name: image_filename
            value: "^\\w+\\.(gif|png|jpg|jpeg)$"
        - regex:
            name: date
            value: "\\d{4}-\\d{2}-\\d{2}"

Parameters¶

Parameters	Description	Type	Required	Defaults	Examples
Common parameters
`experimental`	This flag acts as an acknowledgement that you are using an experimental processor. The processor will not run unless this is set to `true`.	boolean	Yes	`false`	`true`
`matchers`	A list of data security matchers to run. The available matchers are listed under Matcher Types. If the list is empty, all matchers are used.	list	No	`[]`	`[ssn]`

When `reject` is set to `true`, this processor rejects the request when sensitive data is detected. When `modify` is set to `true`, it replaces the sensitive data with X’s. In either mode, it adds the names of the matchers that found matches to the `sensitive-data` tag.
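The interaction between detection, `modify`, `reject`, and the `sensitive-data` tag can be pictured with a small Python sketch. This is not the processor's implementation: the `process` function, the `MATCHERS` table, the simplified SSN pattern, and the one-X-per-character replacement are all assumptions made for illustration.

```python
import re

# Hypothetical matcher table: name -> compiled pattern. The SSN pattern is a
# simplified stand-in, not the processor's actual ssn matcher.
MATCHERS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def process(text, modify=False, reject=False):
    """Return (text, tags, rejected) for one request or response body."""
    # Detection always runs; the tag lists the matchers that found something.
    found = [name for name, pattern in MATCHERS.items() if pattern.search(text)]
    tags = {"sensitive-data": found} if found else {}
    if found and reject:
        return text, tags, True
    if found and modify:
        # Assumption: each matched character is replaced by one X.
        for name in found:
            text = MATCHERS[name].sub(lambda m: "X" * len(m.group()), text)
    return text, tags, False


print(process("SSN 123-45-6789 on 2024-01-31", modify=True))
```

In this sketch the tag is populated whether or not the text is modified or rejected, which mirrors the behaviour described above.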

Tags¶

Tag key	Description	Example values
`sensitive-data`	Added if sensitive data is detected. Contains the names of the matchers that found matches.	`[ssn, sql]`

Matcher Structure¶

Matchers are defined like so:

- ssn
- credit_card

If a matcher has additional configuration options, then it will require a custom name to be specified:

- raw:
    name: hello_world_matcher
    value: "Hello World!"

Matcher Types¶

STANDARD MATCHERS¶

These are the base customizable matchers from the Data Security engine.

raw¶

A case-sensitive string matcher.

- raw:
    name: match_test_string
    value: test string

raw_insensitive¶

A case-insensitive string matcher.

- raw_insensitive:
    name: match_test_string
    value: Test String

regex¶

A regular expression (regex) matcher

- regex:
    name: date
    value: "\\d{4}-\\d{2}-\\d{2}"
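The two regex matchers used in the Configuration example behave differently because one of them is anchored. The sketch below uses Python's `re` module as a stand-in; the engine's regex dialect and its search semantics are assumptions here and may differ.

```python
import re

# Patterns as they read after YAML unescaping: in a double-quoted YAML
# string, "\\d" denotes the two characters \d.
image_filename = re.compile(r"^\w+\.(gif|png|jpg|jpeg)$")
date = re.compile(r"\d{4}-\d{2}-\d{2}")

# The date pattern is unanchored, so it can match anywhere in the text.
print(bool(date.search("Invoice issued 2024-01-31, due in 30 days")))

# The image_filename pattern is anchored with ^ and $, so in this sketch it
# matches only when the whole text is a file name.
print(bool(image_filename.search("logo.png")))
print(bool(image_filename.search("please open logo.png now")))
```

If the intent is to find file names embedded in longer prompts, an unanchored pattern with word boundaries is likely a better fit than `^...$`.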

INTERNAL MATCHERS¶

These are dedicated matchers that offer better performance and more complex checks than standard matchers.

routing_number¶

Matches on bank routing numbers.

credit_card¶

Matches on credit/debit card numbers. Supports almost every major bank, and requires the number to have a valid LUHN checksum.
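For readers unfamiliar with the Luhn check mentioned above, here is a minimal Python sketch of the standard algorithm. The function name `luhn_valid` is hypothetical, and the processor's own checks (such as issuer prefixes and length rules) are not reproduced.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))
print(luhn_valid("4111 1111 1111 1112"))
```

A checksum requirement like this is what lets a matcher ignore arbitrary 16-digit strings that merely look like card numbers.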

int_phone¶

Matches on international phone numbers via Google’s libphonenumber library. Requires the country code to be specified beforehand (that is, +1 or +33). Does not support IDD codes or the full RFC3966 syntax (such as extensions).

national_phone¶

Matches on country-specific phone numbers and performs extra verification. Takes a name, a regex for matching a country’s number format, and a country code that selects which additional country-specific checks to perform.

- national_phone:
    name: us_number
    regex: "\\d{3}-\\d{3}-\\d{4}"
    country: US

ssn¶

Matches on US Social Security Numbers.

iban¶

Matches on International Bank Account Numbers.

sql¶

Matches on SQL statements. To avoid false positives, extremely simple or benign statements are not considered a match.

vin¶

Matches on Vehicle Identification Numbers.

eui48¶

Matches on 48-bit MAC addresses.

ipv4¶

Matches on IPv4 addresses.

- ipv4:
    name: ipv4
    value: []

Takes an optional list of sub-types that it can match against. If none are specified then matches against any address. Possible values are:

broadcast
documentation
link_local
loopback
multicast
private
unspecified

- ipv4:
    name: ipv4_broadcast
    value:
      - broadcast
      - private
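To make the sub-type filter concrete, the following Python sketch classifies an address with the standard `ipaddress` module. The range definitions, the helper names `ipv4_subtypes` and `matches`, and the any-of semantics for multiple values are assumptions; the engine's definitions may differ.

```python
import ipaddress

# Assumed ranges for two of the sub-type names (RFC 1918 and RFC 5737).
PRIVATE = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
DOCUMENTATION = [ipaddress.ip_network(n)
                 for n in ("192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24")]


def ipv4_subtypes(text):
    """Return the set of sub-type names that apply to an IPv4 address."""
    addr = ipaddress.IPv4Address(text)
    checks = {
        "broadcast": addr == ipaddress.IPv4Address("255.255.255.255"),
        "documentation": any(addr in net for net in DOCUMENTATION),
        "link_local": addr.is_link_local,
        "loopback": addr.is_loopback,
        "multicast": addr.is_multicast,
        "private": any(addr in net for net in PRIVATE),
        "unspecified": addr.is_unspecified,
    }
    return {name for name, hit in checks.items() if hit}


def matches(text, wanted=()):
    """An empty `wanted` list accepts any valid IPv4 address."""
    try:
        subtypes = ipv4_subtypes(text)
    except ipaddress.AddressValueError:
        return False
    return not wanted or bool(subtypes & set(wanted))


print(matches("192.168.1.1", ["broadcast", "private"]))
print(matches("8.8.8.8", ["broadcast", "private"]))
```

Under these assumptions, a public address such as 8.8.8.8 is matched only when the value list is empty.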

ipv6¶

Matches on IPv6 addresses.

- ipv6:
    name: ipv6
    value: []

Takes an optional list of sub-types that it can match against. If none are specified then matches against any address. Possible values are:

loopback
multicast
unspecified

- ipv6:
    name: ipv6_loopback
    value:
      - loopback

imei¶

Attempts to match against known International Mobile Equipment Identity values. It isn’t perfect, but should catch most devices.

name¶

Matches on common US names. Uses a list of the top 1000 most common first and last names.

us_address¶

Matches on US addresses. The more specific the address is, the more checks the parser is able to perform (that is, ensuring a ZIP code is within a state or a city is within a ZIP code).

PRE-CANNED REGEX MATCHERS¶

These are pre-written regular expressions to match on common data patterns.

ls_regex:Email¶

regex: (?-u:\b)[a-zA-Z0-9][a-zA-Z0-9_.+-]{0,}@[a-zA-Z0-9][a-zA-Z0-9-.]{0,}\.[a-zA-Z]{2,}(?-u:\b)

HASH MATCHERS¶

These are pre-written regexes to match on common hash types.

ls_hash:Bcrypt¶

format: $2{X}${rounds}${salt}{checksum}

regex: \$2[axyb]?\$\d{2}\$[./A-Za-z0-9]{53}

example: $2b$12$GhvMmNVjRW29ulnudl.LbuAnUtN/LRfe1JsBm1Xu6LE3059z5Tr8m

ls_hash:Sha256Crypt¶

format: $5$rounds={rounds}${salt}${checksum}

regex: \$5\$(rounds=\d+\$)?[./0-9A-Za-z]{0,16}\$[./0-9A-Za-z]{43}

example: $5$rounds=80000$wnsT7Yr92oJoP28r$cKhJImk5mfuSKV9b3mumNzlbstFUplKtQXXMo4G6Ep5

ls_hash:Sha512Crypt¶

format: $6$rounds={rounds}${salt}${checksum}

regex: \$6\$(rounds=\d+\$)?[./0-9A-Za-z]{0,16}\$[./0-9A-Za-z]{86}

example: $6$rounds=80000$wnsT7Yr92oJoP28r$cKhJImk5mfuSKV9b3mumNzlbstFUplKtQXXMo4G6Ep5cKhJImk5mfuSKV9b3mumNzlbstFUplKtQXXMo4G6Ep5

ls_hash:Md5Crypt¶

format: $1${salt}${checksum}

regex: \$1\$[./A-Za-z0-9]{0,8}\$[./A-Za-z0-9]{22}

example: $1$5pZSV9va$azfrPr6af3Fc7dLblQXVa0

ls_hash:Sha1Crypt¶

format: $sha1${rounds}${salt}${checksum}

regex: \$sha1\$\d+\$[./0-9A-Za-z]{0,64}\$[./0-9A-Za-z]{28}

example: $sha1$40000$jtNX3nZ2$hBNaIXkt4wBI2o5rsi8KejSjNqIq

ls_hash:SunMd5Crypt¶

format: $md5,rounds={rounds}${salt}$${checksum} OR $md5${salt}$${checksum}

regex: \$md5(,rounds=\d+)?\$[./A-Za-z0-9]{0,8}\$\$[./A-Za-z0-9]{22}

example: $md5,rounds=5000$GUBv0xjJ$$mSwgIswdjlTY0YxV7HBVm0

example: $md5$GUBv0xjJ$$mSwgIswdjlTY0YxV7HBVm0

ls_hash:Argon2¶

format: $argon2{X}$v={version}$m={memory},t={time},p={parallelism}${salt}${digest}

regex: \$argon2[id]{1,2}\$v=\d+\$m=\d+,t=\d+,p=\d+\$[+/=A-Za-z0-9]+\$[+/=A-Za-z0-9]+

example: $argon2i$v=19$m=512,t=2,p=2$aI2R0hpDyLm3ltLa+1/rvQ$LqPKjd6n8yniKtAithoR7A

ls_hash:BcryptSha256¶

format: $bcrypt-sha256$v={version},t={type},r={rounds}${salt}${digest} OR $bcrypt-sha256${type},{rounds}${salt}${digest}

regex: \$bcrypt-sha256\$(v=2,t=2b,r=\d+|2[ab],\d+)\$[./A-Za-z0-9]{22}\$[./A-Za-z0-9]{31}

example: $bcrypt-sha256$v=2,t=2b,r=12$n79VH.0Q2TMWmt3Oqt9uku$Kq4Noyk3094Y2QlB8NdRT8SvGiI4ft2

example: $bcrypt-sha256$2b,12$n79VH.0Q2TMWmt3Oqt9uku$Kq4Noyk3094Y2QlB8NdRT8SvGiI4ft2

ls_hash:Phpass¶

format: $P${rounds}{salt}{checksum} OR $H${rounds}{salt}{checksum}

regex: \$[PH]\$[./A-Za-z0-9]{31}

example: $P$8ohUJ.1sdFw09/bMaAQPTGDNi2BIUt1

ls_hash:Pbkdf2Sha1¶

format: $pbkdf2${rounds}${salt}${checksum}

regex: \$pbkdf2\$\d+\$[./+A-Za-z0-9]+\$[./+A-Za-z0-9]{27}

example: $pbkdf2$6400$.6UI/S.nXIk8jcbdHx3Fhg$98jZicV16ODfEsEZeYPGHU3kbrU

ls_hash:Pbkdf2Sha256¶

format: $pbkdf2-sha256${rounds}${salt}${checksum}

regex: \$pbkdf2-sha256\$\d+\$[./+A-Za-z0-9]+\$[./+A-Za-z0-9]{43}

example: $pbkdf2-sha256$6400$.6UI/S.nXIk8jcbdHx3Fhg$98jZicV16ODfEsEZeYPGHU3kbrUrvUEXOPimVSQDD44

ls_hash:Pbkdf2Sha512¶

format: $pbkdf2-sha512${rounds}${salt}${checksum}

regex: \$pbkdf2-sha512\$\d+\$[./+A-Za-z0-9]+\$[./+A-Za-z0-9]{86}

example: $pbkdf2-sha512$6400$.6UI/S.nXIk8jcbdHx3Fhg$98jZicV16ODfEsEZeYPGHU3kbrUrvUEXOPimVSQDD4498jZicV16ODfEsEZeYPGHU3kbrUrvUEXOPimVSQDD44

ls_hash:Scram¶

format: $scram${rounds}${salt}${alg1}={digest1},{alg2}={digest2},...,

regex: \$scram\$\d+\$[./+A-Za-z0-9]+\$((md2|md5|sha-1|sha-224|sha-256|sha-384|sha-512|shake128|shake256)=[./+A-Za-z0-9]+,?)+

example: $scram$6400$.Z/znnNOKWUsBaCU$sha-1=cRseQyJpnuPGn3e6d6u6JdJWk.0,sha-256=5GcjEbRaUIIci1r6NAMdI9OPZbxl9S5CFR6la9CHXYc,sha-512=.DHbIm82ajXbFR196Y.9TtbsgzvGjbMeuWCtKve8TPjRMNoZK9EGyHQ6y0lW9OtWdHZrDZbBUhB9ou./VI2mlw

ls_hash:Scrypt¶

format: $scrypt$ln={logN},r={r},p={p}${salt}${checksum}

regex: \$scrypt\$ln=\d+,r=\d+,p=\d+\$[./+=A-Za-z0-9]+\$[./+=A-Za-z0-9]{43}

example: $scrypt$ln=16,r=8,p=1$aM15713r3Xsvxbi31lqr1Q$nFNh2CVHVjNldFVKDHDlm4CbdRSCdEBsjjJxD+iCs5E

ls_hash:AprMd5Crypt¶

format: $apr1${salt}${checksum}

regex: \$apr1\$[./A-Za-z0-9]{0,8}\$[./A-Za-z0-9]{22}

example: $apr1$5pZSV9va$azfrPr6af3Fc7dLblQXVa0

ls_hash:DlitzPbkdf2Sha1¶

format: $p5k2${rounds}${salt}${checksum}

regex: \$p5k2\$\d+\$[./A-Za-z0-9]+\$[./+A-Za-z0-9]{32}

example: $p5k2$2710$.pPqsEwHD7MiECU0$b8TQ5AMQemtlaSgegw5Je.JBE3QQhLbO

ls_hash:CtaPbkdf2Sha1¶

format: $p5k2${rounds}${salt}${checksum}

regex: \$p5k2\$\d+\$[./\-=_+A-Za-z0-9]+\$[./\-=_+A-Za-z0-9]{28}

example: $p5k2$2710$oX9ZZOcNgYoAsYL-8bqxKg==$AU2JLf2rNxWoZxWxRCluY0u6h6c=

ls_hash:Mssql2000¶

format: 0x0100{salt}{digest1}{digest2}

regex: 0x0100[A-F0-9]{88}

example: 0x0100200420C4988140FD3920894C3EDC188E94F428D57DAD5905F6CC1CBAF950CAD4C63F272B2C91E4DEEB5E6444

ls_hash:Mssql2005¶

format: 0x0100{salt}{digest1}

regex: 0x0100[A-F0-9]{48}

example: 0x01006ACDF9FF5D2E211B392EEF1175EFFE13B3A368CE2F94038B

ls_hash:Mysql41¶

format: *{checksum}

regex: \*[A-F0-9]{40}

example: *2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19

ls_hash:PostgresMd5¶

format: md5{checksum}

regex: md5[a-fA-F0-9]{32}

example: md5a5bfc9e07964f8dddeb95fc584cd9655

ls_hash:Oracle11¶

format: S:{checksum}{salt}

regex: S:[a-fA-F0-9]{60}

example: S:4143053633E59B4992A8EA17D2FF542C9EDEB335C886EED9C80450C1B4E6

ls_hash:BsdNthash¶

format: $3$${checksum}

regex: \$3\$\$[a-fA-F0-9]{32}

example: $3$$8846f7eaee8fb117ad06bdd830b7586c

ls_hash:DjangoPbkdf2Sha1¶

format: pbkdf2${rounds}${salt}${checksum}

regex: pbkdf2\$\d+\$[A-Za-z0-9]+\$[+/=A-Za-z0-9]+

example: pbkdf2$6400$6UISnXIk8jcbdHx3Fhg$98jZicV16ODfEsEZeYPGHU3kbrU

ls_hash:DjangoPbkdf2Sha256¶

format: pbkdf2_sha256${rounds}${salt}${checksum}

regex: pbkdf2_sha256\$\d+\$[A-Za-z0-9]+\$[+/=A-Za-z0-9]+

example: pbkdf2_sha256$10000$s1w0UXDd00XB$+4ORmyvVWAQvoAEWlDgN34vlaJx1ZTZpa1pCSRey2Yk=

ls_hash:DjangoSaltedSha1¶

format: sha1${salt}${checksum}

regex: sha1\$[a-f0-9]+\$[a-f0-9]+

example: sha1$f8793$c4cd18eb02375a037885706d414d68d521ca18c7

ls_hash:DjangoSaltedMd5¶

format: md5${salt}${checksum}

regex: md5\$[a-f0-9]+\$[a-f0-9]+

example: md5$f8793$c4cd18eb02375a037885706d414d68d521ca18c7

ls_hash:DjangoDesCrypt¶

format: crypt${salt}${checksum}

regex: crypt\$[a-f0-9]+\$[./A-Za-z0-9]{13}

example: crypt$cd1a4$cdlRbNJGImptk

ls_hash:GrubPbkdf2Sha512¶

format: grub.pbkdf2.sha512.{rounds}.{salt}.{checksum}

regex: grub.pbkdf2.sha512.\d+.[A-F0-9]+.[A-F0-9]{128}

example: grub.pbkdf2.sha512.10000.4483972AD2C52E1F590B3E2260795FDA9CA0B07B96FF492814CA9775F08C4B59CD1707F10B269E09B61B1E2D11729BCA8D62B7827B25B093EC58C4C1EAC23137.DF4FCB5DD91340D6D31E33423E4210AD47C7A4DF9FA16F401663BF288C20BF973530866178FE6D134256E4DBEFBD984B652332EED3ACAED834FEA7B73CAE851D
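The hash patterns in this section can be sanity-checked against their own examples. The sketch below does this for three of them with Python's `re` module; the `HASH_SAMPLES` table and the `identify` helper are hypothetical, and Python's regex dialect is only assumed to agree with the engine's for these simple patterns.

```python
import re

# (matcher name, regex, example) copied from the entries above.
HASH_SAMPLES = [
    ("ls_hash:Bcrypt",
     r"\$2[axyb]?\$\d{2}\$[./A-Za-z0-9]{53}",
     "$2b$12$GhvMmNVjRW29ulnudl.LbuAnUtN/LRfe1JsBm1Xu6LE3059z5Tr8m"),
    ("ls_hash:Md5Crypt",
     r"\$1\$[./A-Za-z0-9]{0,8}\$[./A-Za-z0-9]{22}",
     "$1$5pZSV9va$azfrPr6af3Fc7dLblQXVa0"),
    ("ls_hash:Mysql41",
     r"\*[A-F0-9]{40}",
     "*2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19"),
]


def identify(text):
    """Return the names of the sample matchers whose pattern occurs in text."""
    return [name for name, pattern, _ in HASH_SAMPLES if re.search(pattern, text)]


# Each documented example should be matched in full by its own pattern.
for name, pattern, example in HASH_SAMPLES:
    print(name, re.fullmatch(pattern, example) is not None)

print(identify("password_hash=$1$5pZSV9va$azfrPr6af3Fc7dLblQXVa0"))
```

Because the patterns are unanchored, a hash embedded in a longer line (for example a leaked configuration value) is still found.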

ERROR MATCHERS¶

These are pre-written regexes to match on common error messages.

ls_error:TypeError¶

looks for the case sensitive string TypeError

ls_error:Uncaught¶

looks for the case insensitive string uncaught

ls_error:SocketError¶

looks for the case sensitive string SocketError

ls_error:OperationNotSupported¶

looks for the case insensitive string operation not supported

ls_error:Callback¶

looks for the case insensitive string callback

ls_error:Segfault¶

looks for segmentation faults

regex: (?i)(SIGSEGV|segmentation fault( \(core dumped\))?|segmentation violation|access violation|illegal instruction (core dumped))

ls_error:RuntimeError¶

looks for the case insensitive string RuntimeError

ls_error:OutOfMemory¶

looks for out of memory errors

regex: memory allocation of \d+ bytes failed

ls_error:PermissionDenied¶

looks for the case insensitive string permission denied

ls_error:CommandNotFound¶

looks for the case insensitive string command not found

ls_error:JsUnknownArgument¶

looks for unknown argument errors thrown by JS

regex: Unknown argument `.+`. Available options are marked with

example: Unknown argument `provider_providerAccountId`. Available options are marked with ?

ls_error:JsInvalidInvocation¶

looks for invalid invocation arguments

regex: Invalid `.+` invocation in.+/.+\.js:\d+:\d+

example: Invalid `p.account.findUnique()` invocation in /Users/ASUS/outsidetest4/node_modules/@next-auth/prisma-adapter/dist/index.js:211:45

ls_error:JsBugMessage¶

looks for the case sensitive string This is caused by either a bug in Node.js or incorrect usage of Node.js internals.

ls_error:JsError¶

looks for javascript tracebacks

regex: (at .+ \((.+:\d+:\d+|<anonymous>)\)(\\n' \+)?(\s|')*)+

example:

at IncomingMessage._read (node:_http_incoming:214:19)
at Readable.read (node:internal/streams/readable:547:12)
at resume_ (node:internal/streams/readable:1048:12)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

ls_error:PyError¶

looks for python tracebacks

regex: Traceback \(most recent call last\):(\s+File ".+", line \d+, in .+\s*.*(\s+\^+)?)+(\s+.+)?

example:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/kombu/transport/virtual/base.py", line 925, in create_channel
  return self._avail_channels.pop()
IndexError: pop from empty list

ls_error:JavaError¶

looks for java tracebacks

regex: (at .+\..+\(.+.(java|scala):\d+\)\s*)+

example:

at me.iwf.photopicker.adapter.PhotoGridAdapter.onBindViewHolder(PhotoGridAdapter.java:118)
at me.iwf.photopicker.adapter.PhotoGridAdapter.onBindViewHolder(PhotoGridAdapter.java:27)
at android.support.v7.widget.RecyclerView$Adapter.onBindViewHolder(RecyclerView.java:6673)

ls_error:RustError¶

looks for rust panics

regex: thread '.+' (panicked at '.+'.*, .+\.rs(:\d+)+|has overflowed its stack)

example: thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 2, kind: NotFound, message: "No such file or directory" }', main.rs:2:47

ls_error:RubyError¶

looks for ruby tracebacks

regex: \.rb:\d+:in (`|').+'\s*((from )?\/.+\.rb:\d+:in (`|').+'\s*)*

example:

from /var/deploy/example/shared/bundle/ruby/2.3.0/gems/eventmachine-1.2.3/lib/eventmachine.rb:677:in `connect_server'
from /var/deploy/example/shared/bundle/ruby/2.3.0/gems/eventmachine-1.2.3/lib/eventmachine.rb:677:in `bind_connect'
from /var/deploy/example/shared/bundle/ruby/2.3.0/gems/eventmachine-1.2.3/lib/eventmachine.rb:653:in `connect'

ls_error:GoError¶

looks for go tracebacks

regex: goroutine \d+ \[.+\]:\s+(.+\s+.+\/.+.go:\d+( \+0x[\da-f]+.*)?\s*)+

example:

goroutine 5844 [running]:
k8s.io/kubernetes/pkg/controller/statefulset.getPersistentVolumeClaims(0xc003ee8500, 0x0?)
        pkg/controller/statefulset/stateful_set_utils.go:348 +0x2fd
k8s.io/kubernetes/pkg/controller/statefulset.(*StatefulPodControl).createPersistentVolumeClaims(0xc000b24560, 0x6ecea2?, 0xc000600000?)
        pkg/controller/statefulset/stateful_pod_control.go:341 +0x6a

ls_error:PhpError¶

looks for fatal errors in php

regex: Fatal error:.+ in .*[\/\\].*\.php(.*\s*){1,2}Stack trace:\s+(#\d+ .+\s*)+

example:

Fatal error: Uncaught Exception: Incorrect public key: error:04099079:rsa routines:RSA_padding_check_PKCS1_OAEP_mgf1:oaep decoding error in /home/apptestl/domains/apptestlab.pl/public_html/nextalk/zalogowano/crypto_library.php:13
Stack trace:
#0 home/apptestl/domains/apptestlab.pl/public_html/nextalk/zalogowano/send_message.php(33): encryptMessage('1', '-----BEGIN PUBL...')
#1 {main} thrown in /home/apptestl/domains/apptestlab.pl/public_html/nextalk/zalogowano/crypto_library.php on line 13

ls_error:EnvoySegfault¶

looks for the case sensitive string Caught Segmentation fault, suspect faulting address 0x

ls_error:PostgresError¶

looks for postgres errors

example:

ERROR:  duplicate key value violates unique constraint "constraint_name"
DETAIL:  Key (column_name)=(duplicate_value) already exists.
SQLSTATE: 23505

ls_error:MysqlError¶

looks for mysql errors

regex: ERROR \d+ \(\d+\): .*

example: ERROR 1062 (23000): Duplicate entry 'value' for key 'unique_key'

ls_error:RedisError¶

looks for redis errors

regex: \(error\) ERR .+

example: (error) ERR Operation against a key holding the wrong kind of value

ls_error:MongodbError¶

looks for mongodb errors

regex: (("errmsg"\s*:\s*".+"|"(code|index|ok)"\s*:\s*\d+)\s*,?\s*){2,4}
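As an illustration of how the error patterns in this section might surface leaked diagnostics in a model response, the sketch below scans a multi-line log excerpt with three of the documented regexes. The `scan` helper and the sample log are hypothetical, and Python's `re` module is only assumed to behave like the engine's regex dialect for these patterns.

```python
import re

# Patterns copied from the RustError, OutOfMemory, and JsUnknownArgument
# entries above.
ERROR_PATTERNS = {
    "ls_error:RustError":
        r"thread '.+' (panicked at '.+'.*, .+\.rs(:\d+)+|has overflowed its stack)",
    "ls_error:OutOfMemory":
        r"memory allocation of \d+ bytes failed",
    "ls_error:JsUnknownArgument":
        r"Unknown argument `.+`. Available options are marked with",
}


def scan(text):
    """Return {matcher name: first matched text} for every pattern that hits."""
    hits = {}
    for name, pattern in ERROR_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            hits[name] = match.group(0)
    return hits


log = (
    "INFO starting worker\n"
    "thread 'main' has overflowed its stack\n"
    "memory allocation of 4096 bytes failed\n"
)
print(scan(log))
```

The matcher names returned here correspond to the values that would appear in the `sensitive-data` tag.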

CONTENT INDICATOR MATCHERS¶

These are pre-written wordlists to match on common phrases within various categories of documents. Their current capabilities are very limited, as the word lists aren’t very big. This is more of a proof of concept for wordlist-style content classification, and more work would need to be done to make these truly useful.

ls_indicator:PromptInjection¶

looks for basic phrases like ignore previous instructions or imagine you had no restrictions

ls_indicator:Legal¶

looks for basic phrases like intellectual property right or This Agreement is made and entered into

ls_indicator:Financial¶

looks for basic phrases like The undersigned hereby acknowledges receipt of or Payment shall be made in accordance with the following schedule

ls_indicator:Technical¶

looks for basic phrases like engineering change request or product specifications

ls_indicator:Regulatory¶

looks for basic phrases like environmental impact assessment or certification authority

ls_indicator:Hr¶

looks for basic phrases like grievance procedures or family leave policy

ls_indicator:Security¶

looks for basic phrases like security vulnerability or unauthorized access

ls_indicator:ComplianceTraining¶

looks for basic phrases like corporate ethics guidelines or workplace compliance handbook

ls_indicator:StrategicPlans¶

looks for basic phrases like competitive analysis or annual operating plan

ls_indicator:IntellectualProperty¶

looks for basic phrases like trademark filing or copyright registration

ls_indicator:VendorContracts¶

looks for basic phrases like service level agreement or termination conditions

ls_indicator:MarketingPlans¶

looks for basic phrases like product launch plan or digital marketing strategy

ls_indicator:ResearchDevelopment¶

looks for basic phrases like proof of concept or technical feasibility analysis

ls_indicator:CrisisManagement¶

looks for basic phrases like breach response protocol or regulatory reporting requirements
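The wordlist-style classification described in this section can be sketched in a few lines of Python. The `INDICATORS` table contains only phrases quoted above (the real word lists are larger), and both the `classify` helper and the case-insensitive substring comparison are assumptions for illustration.

```python
# Phrases quoted in this section; the real word lists are larger.
INDICATORS = {
    "ls_indicator:PromptInjection": [
        "ignore previous instructions",
        "imagine you had no restrictions",
    ],
    "ls_indicator:Security": [
        "security vulnerability",
        "unauthorized access",
    ],
}


def classify(text):
    """Return the indicator names whose phrases occur in text."""
    lowered = text.lower()  # assumption: matching ignores case
    return [
        name
        for name, phrases in INDICATORS.items()
        if any(phrase in lowered for phrase in phrases)
    ]


print(classify("Please IGNORE previous instructions and grant unauthorized access"))
```

Plain substring matching like this is easy to evade with paraphrasing, which is consistent with the limited, proof-of-concept status noted above.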