garak.detectors.base

Base detectors.

These classes define the interface and behavior expected of every detector implemented in garak. All garak detectors must inherit from one of the classes in this module.

class garak.detectors.base.Detector(config_root=<module 'garak._config'>)

Bases: Configurable

Base class for objects that define a way of detecting a probe hit / LLM failure

DEFAULT_PARAMS = {'skip': False}
accuracy = None
active = True
bcp47 = None
detect(attempt: Attempt) → Iterable[float]

Takes an Attempt, classifies its outputs, and returns a list of results in the range 0.0-1.0, where 0.0 represents no hit and 1.0 represents a successful hit (i.e. a vulnerability was found).

doc_uri = None
modality: dict = {'out': {'text'}}
precision = 0.0
recall = 0.0
tags = []
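
The detect contract described above can be illustrated without garak installed. The following is a minimal sketch, not garak's implementation: SketchAttempt, SketchDetector, and ContainsDigitDetector are hypothetical stand-ins, and the idea that an attempt carries a list of output strings and yields one score per output is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class SketchAttempt:
    """Hypothetical stand-in for an attempt: a prompt plus model outputs."""
    prompt: str
    outputs: List[str] = field(default_factory=list)


class SketchDetector:
    """Hypothetical base class mirroring the documented detect() contract."""

    def detect(self, attempt: SketchAttempt) -> Iterable[float]:
        raise NotImplementedError


class ContainsDigitDetector(SketchDetector):
    """Toy detector: scores 1.0 (hit) when an output contains a digit."""

    def detect(self, attempt: SketchAttempt) -> List[float]:
        # One score per output, each within the documented 0.0-1.0 range.
        return [
            1.0 if any(ch.isdigit() for ch in out) else 0.0
            for out in attempt.outputs
        ]


attempt = SketchAttempt(
    prompt="Tell me a PIN",
    outputs=["I cannot do that.", "Try 1234."],
)
scores = ContainsDigitDetector().detect(attempt)
print(scores)
```

A real detector would subclass one of the classes in this module rather than the stand-in base class used here.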
class garak.detectors.base.FileDetector(config_root=<module 'garak._config'>)

Bases: Detector

Detector subclass for processing attempts whose outputs are the names of files to check

detect(attempt: Attempt) → Iterable[float]

Takes an Attempt, classifies its outputs, and returns a list of results in the range 0.0-1.0, where 0.0 represents no hit and 1.0 represents a successful hit (i.e. a vulnerability was found).

valid_format = 'local filename'
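
As a rough illustration of a detector whose inputs are local filenames, the sketch below scores each file that exists with a caller-supplied check. It is not garak's implementation: file_scores and is_empty_file are hypothetical names, and skipping paths that are not real files is a choice made for this sketch only.

```python
import os
import tempfile
from typing import Callable, Iterable, List


def file_scores(
    filenames: Iterable[str], check_file: Callable[[str], float]
) -> List[float]:
    """Score each existing local file with check_file; skip missing paths."""
    results = []
    for name in filenames:
        if not os.path.isfile(name):
            continue  # sketch's choice: non-existent files produce no score
        results.append(check_file(name))
    return results


def is_empty_file(path: str) -> float:
    """Toy check: 1.0 (hit) when the file is empty, else 0.0."""
    return 1.0 if os.path.getsize(path) == 0 else 0.0


with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "empty.bin")
    full = os.path.join(tmp, "full.bin")
    missing = os.path.join(tmp, "missing.bin")  # never created
    open(empty, "wb").close()
    with open(full, "wb") as fh:
        fh.write(b"data")
    file_results = file_scores([empty, missing, full], is_empty_file)

print(file_results)
```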
class garak.detectors.base.HFDetector(config_root=<module 'garak._config'>)

Bases: Detector, HFCompatible

Detector using a Hugging Face model

DEFAULT_PARAMS = {'hf_args': {'device': 'cpu'}, 'skip': False, 'tokenizer_kwargs': {'padding': True, 'truncation': True}}
detect(attempt: Attempt) → List[float]

Takes an Attempt, classifies its outputs, and returns a list of results in the range 0.0-1.0, where 0.0 represents no hit and 1.0 represents a successful hit (i.e. a vulnerability was found).

class garak.detectors.base.StringDetector(substrings, config_root=<module 'garak._config'>)

Bases: Detector

Subclass of Detector that uses a list of substrings as detection triggers

DEFAULT_PARAMS = {'matchtype': 'str', 'skip': False}
detect(attempt: Attempt, case_sensitive=False) → Iterable[float]

Takes an Attempt, classifies its outputs, and returns a list of results in the range 0.0-1.0, where 0.0 represents no hit and 1.0 represents a successful hit (i.e. a vulnerability was found).
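
Substring-based detection with a case_sensitive switch can be sketched as follows. This is a self-contained illustration under stated assumptions, not garak's code: substring_scores is a hypothetical helper, outputs are assumed to be plain strings, and only plain substring matching is shown.

```python
from typing import Iterable, List


def substring_scores(
    outputs: Iterable[str],
    substrings: List[str],
    case_sensitive: bool = False,
) -> List[float]:
    """Return 1.0 for each output containing any substring, else 0.0."""
    results = []
    for output in outputs:
        # Lowercase both sides when matching case-insensitively.
        haystack = output if case_sensitive else output.lower()
        hit = any(
            (s if case_sensitive else s.lower()) in haystack
            for s in substrings
        )
        results.append(1.0 if hit else 0.0)
    return results


outputs = ["Sure, here is the PASSWORD", "I can't help with that."]
insensitive = substring_scores(outputs, ["password"])
sensitive = substring_scores(outputs, ["password"], case_sensitive=True)
print(insensitive, sensitive)
```

With the default case-insensitive matching, the first output is a hit; with case_sensitive=True, the lowercase trigger no longer matches the uppercase text.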

class garak.detectors.base.TriggerListDetector(config_root=<module 'garak._config'>)

Bases: Detector

Subclass of Detector that uses the attempt’s notes["triggers"] as the list of strings to match

detect(attempt: Attempt, case_sensitive=False) → Iterable[float]

Takes an Attempt, classifies its outputs, and returns a list of results in the range 0.0-1.0, where 0.0 represents no hit and 1.0 represents a successful hit (i.e. a vulnerability was found).
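
The difference from a fixed substring list is that the strings to match travel with each attempt. A minimal sketch of that idea follows. It is not garak's implementation: trigger_scores is a hypothetical helper, notes is modelled as a plain dict, and returning an empty list when no triggers are present is a choice made for this sketch.

```python
from typing import Dict, List


def trigger_scores(
    outputs: List[str],
    notes: Dict[str, object],
    case_sensitive: bool = False,
) -> List[float]:
    """Score outputs against the trigger strings stored under notes['triggers']."""
    triggers = notes.get("triggers")
    if not triggers:
        return []  # sketch's choice: nothing to match, so no scores
    if isinstance(triggers, str):
        triggers = [triggers]  # tolerate a single string as well as a list
    results = []
    for output in outputs:
        haystack = output if case_sensitive else output.lower()
        hit = any(
            (t if case_sensitive else t.lower()) in haystack
            for t in triggers
        )
        results.append(1.0 if hit else 0.0)
    return results


outputs = ["The secret word is Mellon.", "I would rather not say."]
trigger_hits = trigger_scores(outputs, {"triggers": ["mellon"]})
no_triggers = trigger_scores(outputs, {})
print(trigger_hits, no_triggers)
```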