garak.detectors.base
This module defines the basic structure of garak’s detectors. All detectors inherit from garak.detectors.base.Detector.
Attributes
1. doc_uri: URI for documentation of the detector (perhaps a paper).
2. lang_spec: Languages this detector supports. Format: a comma-separated list of BCP47 tags, or “*” for any or not applicable. Content returned by a target can be in more than one language, and a single detector can be capable of processing input in more than one language. This field tracks which ones are supported. NB this is different from a probe, which is monolingual and uses lang.
3. active: Should this detector be used by default?
4. tags: MISP-format taxonomy categories.
5. modality: Which modalities does this detector work on? garak supports mainstream any-to-any large models, but only assesses text output.
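As an illustration of the lang_spec format described above, the following is a minimal sketch of checking whether a given language is covered by a spec string. The helper name `supports_language` is hypothetical and not part of garak, and the exact-match comparison of tags (no region or prefix matching) is an assumption made for brevity.

```python
def supports_language(lang_spec: str, lang: str) -> bool:
    """Return True if `lang` is listed in a comma-separated spec, or the spec is "*".

    Hypothetical helper, not garak's own logic: tags are compared by
    exact, case-insensitive match only.
    """
    tags = [tag.strip().lower() for tag in lang_spec.split(",") if tag.strip()]
    if "*" in tags:
        # "*" means any language, or that language is not applicable
        return True
    return lang.strip().lower() in tags


print(supports_language("en,fr", "fr"))  # True
print(supports_language("*", "ja"))      # True
print(supports_language("en", "de"))     # False
```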
Base detectors.
These detectors show expectations and operations of any detector implemented in garak. garak detectors must inherit from a class in here.
- class Detector(config_root=garak._config)
Bases: Configurable
Base class for objects that define a way of detecting a probe hit / LLM failure
Configurable parameters:
DEFAULT_PARAMS contents: skip=False
Default values are listed
See also Configuring garak for how to set these values.
Other attributes:
- detect(attempt: Attempt) -> Iterable[float | None]
Takes an Attempt, classifies its outputs, and returns a list of results in the range 0.0-1.0, where 0.0 represents no hit and 1.0 represents a successful hit (i.e. vulnerability found).
- extra_dependency_names = []
- hit_desc = 'Detector identified matching indicators of risk'
- pass_desc = 'Detector found no issue'
- tags = []
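The detect() contract above can be sketched without garak installed. In the sketch below, `StubAttempt` and `StubDetector` are hypothetical stand-ins for garak's Attempt and Detector, and `ContainsDigitDetector` is a toy subclass invented for illustration. Returning None for an output that cannot be scored is an assumption suggested by the `float | None` return type, not something this page states.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class StubAttempt:
    """Hypothetical stand-in for garak's Attempt: only holds output strings."""
    outputs: List[Optional[str]] = field(default_factory=list)


class StubDetector:
    """Hypothetical stand-in base class mirroring the detect() signature."""

    def detect(self, attempt: StubAttempt) -> Iterable[Optional[float]]:
        raise NotImplementedError


class ContainsDigitDetector(StubDetector):
    """Toy detector: an output containing any digit counts as a hit."""

    def detect(self, attempt: StubAttempt) -> List[Optional[float]]:
        results: List[Optional[float]] = []
        for output in attempt.outputs:
            if output is None:
                # Assumption: nothing to score, so report None instead of a score
                results.append(None)
            else:
                has_digit = any(ch.isdigit() for ch in output)
                results.append(1.0 if has_digit else 0.0)
        return results


scores = ContainsDigitDetector().detect(
    StubAttempt(outputs=["call 555", "hello", None])
)
print(scores)  # [1.0, 0.0, None]
```

A real detector would subclass garak.detectors.base.Detector rather than the stub, and would receive garak's own Attempt objects.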
- class FileDetector(config_root=garak._config)
Bases: Detector
Detector subclass for processing attempts whose outputs are filenames for checking
- detect(attempt: Attempt) -> Iterable[float | None]
Takes an Attempt, classifies its outputs, and returns a list of results in the range 0.0-1.0, where 0.0 represents no hit and 1.0 represents a successful hit (i.e. vulnerability found).
- valid_format = 'local filename'
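To make the idea of filename outputs concrete, here is a minimal standalone sketch that scores local files by content. It does not use garak: `score_files` and `MARKER` are hypothetical names, the byte-marker test is an arbitrary example check, and returning None for a path that is not a readable local file is an assumption.

```python
import os
import tempfile
from typing import List, Optional

MARKER = b"EVIL"  # hypothetical byte sequence that counts as a hit


def score_files(filenames: List[Optional[str]]) -> List[Optional[float]]:
    """Score each local file: 1.0 if it contains MARKER, 0.0 if not, None if missing."""
    results: List[Optional[float]] = []
    for name in filenames:
        if name is None or not os.path.isfile(name):
            # Assumption: a missing file cannot be scored, so report None
            results.append(None)
            continue
        with open(name, "rb") as handle:
            results.append(1.0 if MARKER in handle.read() else 0.0)
    return results


with tempfile.TemporaryDirectory() as tmp:
    flagged = os.path.join(tmp, "flagged.bin")
    clean = os.path.join(tmp, "clean.bin")
    with open(flagged, "wb") as handle:
        handle.write(b"header EVIL payload")
    with open(clean, "wb") as handle:
        handle.write(b"nothing to see")
    demo_scores = score_files([flagged, clean, os.path.join(tmp, "missing.bin")])

print(demo_scores)  # [1.0, 0.0, None]
```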
- class HFDetector(config_root=garak._config)
Bases: Detector, HFCompatible
Detector using a Hugging Face model
Configurable parameters:
DEFAULT_PARAMS contents: skip=False, hf_args={'device': 'cpu'}, tokenizer_kwargs={'padding': True, 'truncation': True}, graceful_fail=True
Default values are listed
See also Configuring garak for how to set these values.
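The defaults listed above extend the base detector's single skip parameter. As a rough sketch of how such defaults and user overrides might combine, the snippet below uses plain dictionaries. `resolve_params` is a hypothetical helper, and the one-level merge of nested dictionaries is an assumption; garak's actual configuration loading is described in Configuring garak and may behave differently.

```python
BASE_DEFAULTS = {"skip": False}

# Defaults as listed for HFDetector, built on top of the base defaults
HF_DEFAULTS = {
    **BASE_DEFAULTS,
    "hf_args": {"device": "cpu"},
    "tokenizer_kwargs": {"padding": True, "truncation": True},
    "graceful_fail": True,
}


def resolve_params(defaults: dict, overrides: dict) -> dict:
    """Hypothetical helper: overlay overrides on defaults without mutating either.

    Dict-valued entries are merged one level deep; other values are replaced.
    """
    resolved = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(resolved.get(key), dict):
            resolved[key] = {**resolved[key], **value}
        else:
            resolved[key] = value
    return resolved


params = resolve_params(HF_DEFAULTS, {"hf_args": {"device": "cuda"}})
print(params["hf_args"])        # {'device': 'cuda'}
print(params["graceful_fail"])  # True
```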
- class StringDetector(substrings, config_root=garak._config)
Bases: Detector
Subclass of Detector using list of substrings as detection triggers
Configurable parameters:
DEFAULT_PARAMS contents: skip=False, matchtype='str', case_sensitive=False
Default values are listed
See also Configuring garak for how to set these values.
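The substring matching that StringDetector describes can be sketched as a standalone function. This is not garak's implementation: `substring_scores` is a hypothetical name, matchtype='str' is assumed to mean plain substring containment, and None outputs are assumed to pass through unscored.

```python
from typing import List, Optional


def substring_scores(
    outputs: List[Optional[str]],
    substrings: List[str],
    case_sensitive: bool = False,
) -> List[Optional[float]]:
    """Score 1.0 for each output containing any of the substrings, else 0.0."""
    results: List[Optional[float]] = []
    for output in outputs:
        if output is None:
            # Assumption: an absent output is reported as None, not scored
            results.append(None)
            continue
        haystack = output if case_sensitive else output.lower()
        hit = any(
            (s if case_sensitive else s.lower()) in haystack for s in substrings
        )
        results.append(1.0 if hit else 0.0)
    return results


outputs = ["Sure, here is the PASSWORD", "I cannot help"]
print(substring_scores(outputs, ["password"]))                       # [1.0, 0.0]
print(substring_scores(outputs, ["password"], case_sensitive=True))  # [0.0, 0.0]
```

With the default case_sensitive=False, both the output and the substrings are lowercased before comparison, which is why the uppercase "PASSWORD" still registers as a hit in the first call.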
- class TriggerListDetector(config_root=garak._config)
Bases: Detector
Subclass of Detector using attempt’s notes[triggers] as list of strings to match
Configurable parameters:
DEFAULT_PARAMS contents: skip=False, case_sensitive=False
Default values are listed
See also Configuring garak for how to set these values.
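Unlike StringDetector, the strings to match here travel with each attempt rather than being fixed at construction time. A minimal sketch of that idea follows. `StubAttempt` is a hypothetical stand-in for garak's Attempt, `trigger_scores` is an invented helper, and accepting a single string as well as a list under notes["triggers"] is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StubAttempt:
    """Hypothetical stand-in for garak's Attempt with outputs and notes."""
    outputs: List[Optional[str]] = field(default_factory=list)
    notes: Dict[str, object] = field(default_factory=dict)


def trigger_scores(
    attempt: StubAttempt, case_sensitive: bool = False
) -> List[Optional[float]]:
    """Score 1.0 for each output containing any string in notes["triggers"]."""
    triggers = attempt.notes.get("triggers", [])
    if isinstance(triggers, str):
        # Assumption: a lone string is treated as a one-item trigger list
        triggers = [triggers]
    results: List[Optional[float]] = []
    for output in attempt.outputs:
        if output is None:
            results.append(None)
            continue
        text = output if case_sensitive else output.lower()
        hit = any((t if case_sensitive else t.lower()) in text for t in triggers)
        results.append(1.0 if hit else 0.0)
    return results


attempt = StubAttempt(
    outputs=["The code word is Tangerine.", "No idea."],
    notes={"triggers": ["tangerine"]},
)
print(trigger_scores(attempt))  # [1.0, 0.0]
```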