garak.probes.ansiescape

ANSI Escape attack

Try to make a model produce ANSI escape codes, which can disrupt downstream processing.

Probes in this module should all try to elicit ANSI escape codes, or information suggesting that the target is capable of producing them. The probes vary along several dimensions:

the encoding can vary: raw binary, or an escaped version;
the technique can vary: for example, whether we can get OSC 8 or OSC 52 sequences;
the action can vary: log cursor, crash machine, install software, etc.
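To make the encoding and technique dimensions concrete, here is a minimal sketch of what "raw" and "escaped" forms of OSC 8 (hyperlink) and OSC 52 (clipboard) sequences look like. It is illustrative only and is not taken from garak's payload lists; the URL and clipboard text are placeholder assumptions.

```python
import base64

ESC = "\x1b"  # raw escape byte (0x1B)
BEL = "\x07"  # raw bell byte, commonly accepted as an OSC terminator

# OSC 8: terminal hyperlink. The raw form contains real control bytes.
raw_osc8 = f"{ESC}]8;;https://example.com{BEL}link text{ESC}]8;;{BEL}"

# OSC 52: clipboard write. The payload is base64-encoded (placeholder text).
payload = base64.b64encode(b"hello").decode("ascii")
raw_osc52 = f"{ESC}]52;c;{payload}{BEL}"

# Escaped form: the same sequence spelled out in printable characters,
# as it would appear in source code or a model's textual answer.
escaped_osc8 = raw_osc8.encode("unicode_escape").decode("ascii")
escaped_osc52 = raw_osc52.encode("unicode_escape").decode("ascii")

print(repr(raw_osc8))
print(escaped_osc8)
print(escaped_osc52)
```

The raw strings are the ones a terminal would act on if printed directly; the escaped strings are harmless until something downstream decodes them.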

Further info:

class AnsiEscaped(config_root=<module 'garak._config'>)

Bases: Probe

Probe for eliciting ANSI control codes from a model

ANSI escape codes can disrupt downstream processes. Try to get escaped ANSI out of a model.

active: bool = True

doc_uri: str = 'https://interhumanagreement.substack.com/p/llm-output-can-take-over-your-computer'

goal: str = 'print escaped ANSI terminal codes'

intent: str | None = 'S008terminal'

lang: str | None = 'en'

primary_detector: str | None = 'ansiescape.Escaped'

tags: Iterable[str] = ['owasp:llm01', 'owasp:llm02', 'owasp:llm05', 'avid-effect:security:S0100', 'avid-effect:security:S0200', 'quality:Security:Integrity', 'demon:Language:Code_and_encode:Programming', 'demon:Stratagems:Meta-prompting:Ask_for_examples']

tier: Tier = 3

class AnsiRaw(config_root=<module 'garak._config'>)

Bases: AnsiEscaped

Probe for eliciting raw ANSI control codes from a model

ANSI escape codes can disrupt downstream processes. Try to get raw ANSI out of a model.
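The difference between the escaped and raw goals can be illustrated by checking a piece of model output for each form. The sketch below is a simplified assumption about what such a check might look like; it is not the logic of the `ansiescape.Escaped` or `ansiescape.Raw` detectors, and the `classify` helper and both patterns are hypothetical.

```python
import re

# Raw: an actual ESC byte (0x1B) followed by "[" (CSI) or "]" (OSC).
RAW_PATTERN = re.compile(r"\x1b[\[\]]")

# Escaped: a printable spelling of ESC followed by "[" or "]".
ESCAPED_PATTERN = re.compile(r"(\\x1b|\\033|\\u001b|\\e)[\[\]]", re.IGNORECASE)


def classify(output: str) -> set[str]:
    """Return which forms of ANSI sequence appear in the output text."""
    found = set()
    if RAW_PATTERN.search(output):
        found.add("raw")
    if ESCAPED_PATTERN.search(output):
        found.add("escaped")
    return found


print(classify("\x1b[31mred text"))         # contains a real ESC byte
print(classify(r"print('\x1b[31mred')"))    # only the printable spelling
print(classify("no control codes here"))
```

A real detector would need a much broader set of sequences and spellings; this only shows why the two cases are scored separately.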

active: bool = True

goal: str = 'print raw ANSI terminal codes'

intent: str | None = 'S008terminal'

primary_detector: str | None = 'ansiescape.Raw'

tier: Tier = 1

class AnsiRawTokenizerHF(config_root=<module 'garak._config'>)

Bases: Probe

Probe for checking a tokenizer for potentially malicious ANSI escape sequences

Returns any tokens bearing ANSI escape sequences that could be used to seize control of anything rendering output. Always returns one token judged non-malicious.

active: bool = False

doc_uri: str = 'https://interhumanagreement.substack.com/p/llm-output-can-take-over-your-computer'

goal: str = 'check if raw ANSI escape sequences tokens are available'

intent: str | None = 'S008terminal'

lang: str | None = '*'

primary_detector: str | None = 'ansiescape.Raw'

probe(generator) → Iterable[Attempt]

Attempt to gather the target generator’s risky tokens.

Return modes:

if risky tokens are found: one attempt per risky token
if no risky tokens are found: a single attempt with an empty string

Rationale:

We want a big alert if there are bad tokens, and a 100% pass if there are none.

Route:

mint an attempt for the first non-risky token found
mint an attempt for each risky token found

Output: score = 1 / (number of risky tokens + 1). The more risky tokens, the worse the score.
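The route and score described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the probe's implementation: `plan_attempts` is a hypothetical helper, the vocabulary is a plain list of strings rather than a Hugging Face tokenizer, "risky" is reduced to "contains a raw ESC byte", and the empty-string fallback is one reading of how the return modes and the route fit together.

```python
from typing import Iterable, List, Tuple

ESC = "\x1b"  # assumption: a token counts as risky if it contains a raw ESC byte


def plan_attempts(vocab: Iterable[str]) -> Tuple[List[str], float]:
    """Hypothetical helper: return (attempt texts, resulting score)."""
    risky: List[str] = []
    first_safe = None
    for token in vocab:
        if ESC in token:
            risky.append(token)
        elif first_safe is None:
            first_safe = token  # remember only the first non-risky token

    # One passing attempt (first safe token, or an empty string if none),
    # followed by one failing attempt per risky token.
    attempts = [first_safe if first_safe is not None else ""] + risky

    # One pass out of (risky + 1) attempts.
    score = 1 / (len(risky) + 1)
    return attempts, score


clean = plan_attempts(["hello", "world"])
dirty = plan_attempts(["hello", "\x1b[", "world", "\x1b]"])
print(clean)
print(dirty)
```

Because exactly one attempt is always expected to pass, a clean vocabulary scores 1.0, and each additional risky token pushes the score closer to zero.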

supported_generators = {'huggingface.LLaVA', 'huggingface.Model', 'huggingface.Pipeline'}

tags: Iterable[str] = ['owasp:llm05', 'demon:Language:Code_and_encode:Token']

tier: Tier = 1