garak.evaluators.base

Base evaluators

These describe evaluators for assessing detector results.

class Evaluator

Bases: object

Class to be subclassed by evaluators.

Provides evaluation and CLI output based on detector assessments of the generator outputs produced by probe calls.

SYMBOL_SET = {1: '🟥', 2: '🟧', 3: '🟨', 4: '🟩', 5: '🟦'}
evaluate(attempts: Iterable[Attempt]) → None

Evaluate feedback from detectors.

Expects a list of attempts that correspond to one probe. Outputs results once per detector.
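
The per-detector reporting described here can be sketched without garak installed. This is a minimal illustration, not garak's implementation: the plain-dict stand-in for an attempt, the `summarise` helper, the detector names, and the 0.5 cut-off are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def summarise(
    attempts: Iterable[Dict[str, List[float]]],
    test: Callable[[float], bool],
) -> Dict[str, Tuple[int, int]]:
    """Return {detector_name: (passes, evals)} for one probe's attempts.

    Each attempt is modelled as a plain dict mapping a detector name to the
    scores it produced (a hypothetical stand-in for a real attempt object).
    """
    tally = defaultdict(lambda: [0, 0])
    for attempt in attempts:
        for detector_name, scores in attempt.items():
            for score in scores:
                tally[detector_name][1] += 1      # one more evaluation
                if test(score):
                    tally[detector_name][0] += 1  # counted as a pass
    return {name: (passes, evals) for name, (passes, evals) in tally.items()}


# Two attempts from the same probe, scored by two hypothetical detectors.
attempts = [
    {"detector_a": [0.0, 1.0], "detector_b": [0.2]},
    {"detector_a": [0.0], "detector_b": [0.9]},
]
summary = summarise(attempts, test=lambda score: score < 0.5)

# One result line per detector.
for name, (passes, evals) in summary.items():
    print(f"{name}: {passes}/{evals} passed")
```

Passing the pass/fail rule in as a callable keeps the tallying separate from the judgement, which is roughly the role `test()` plays in the classes documented on this page.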

get_z_rating(probe_name, detector_name, asr_pct) → str
print_results_narrow(detector_name, passes, evals, messages: List | None = None, ci_lower: float | None = None, ci_upper: float | None = None)

Print the evaluator’s summary

print_results_wide(detector_name, passes, evals, messages: List | None = None, ci_lower: float | None = None, ci_upper: float | None = None)

Print the evaluator’s summary

test(test_value: float) → bool

Function for converting the detector result to a boolean, where True means pass. Override this to customise how the evaluator judges.

Parameters:

test_value (float) – the value from the detector in 0.0-1.0

class ThresholdEvaluator(threshold=0.5)

Bases: Evaluator

Evaluator class where a threshold for pass/fail is set

test(test_value)

Function for converting the detector result to a boolean, where True means pass. Override this to customise how the evaluator judges.

Parameters:

test_value (float) – the value from the detector in 0.0-1.0
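
A threshold-based rule of this kind might look like the sketch below. `ThresholdSketch` is a hypothetical stand-in rather than garak's class, and the choice that a score strictly below the threshold counts as a pass is an assumption. Check the library source for the exact comparison.

```python
class ThresholdSketch:
    """Stand-in illustrating a threshold-based pass/fail rule."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold

    def test(self, test_value: float) -> bool:
        # Assumption: a detector score strictly below the threshold passes.
        return test_value < self.threshold


default = ThresholdSketch()             # uses the documented default of 0.5
strict = ThresholdSketch(threshold=0.1)

print(default.test(0.3), strict.test(0.3))
```

Lowering the threshold makes the evaluator stricter, since fewer detector scores fall below it.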

class ZeroToleranceEvaluator

Bases: Evaluator

Evaluator class that only lets items pass if the ASR is exactly 0.0

test(test_value)

Function for converting the detector result to a boolean, where True means pass. Override this to customise how the evaluator judges.

Parameters:

test_value (float) – the value from the detector in 0.0-1.0
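
The zero-tolerance rule can be sketched in the same way. `ZeroToleranceSketch` is a hypothetical stand-in, and it assumes the value handed to `test()` is the score compared against 0.0.

```python
class ZeroToleranceSketch:
    """Stand-in illustrating a rule where only an exact 0.0 passes."""

    def test(self, test_value: float) -> bool:
        # Any non-zero score, however small, is treated as a failure.
        return test_value == 0.0


evaluator = ZeroToleranceSketch()
scores = [0.0, 0.0, 0.01]
results = [evaluator.test(score) for score in scores]
print(results)
```

Because the comparison is exact, this rule suits cases where even a single low-scoring detection should count as a failure.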