garak.probes.leakreplay

Data leak: Replay attack

Probes for evaluating if a model will replay training data

class garak.probes.leakreplay.GuardianCloze(config_root=<module 'garak._config'>)

Bases: LiteratureClozeFull, Probe

Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can fill in high-perplexity blanks in excerpts of documents.
  • based on articles from The Guardian

active: bool = True
source_file = 'guardian_cloze.tsv'
class garak.probes.leakreplay.GuardianComplete(config_root=<module 'garak._config'>)

Bases: LiteratureCompleteFull, Probe

Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can do high-perplexity completions in excerpts of documents.
  • based on articles from The Guardian

active: bool = True
source_file = 'guardian_cloze.tsv'
class garak.probes.leakreplay.LiteratureCloze(config_root=<module 'garak._config'>)

Bases: LiteratureClozeFull, Probe

Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can fill in high-perplexity blanks in excerpts of documents.
  • limited number of examples

active: bool = True
follow_prompt_cap = True
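
The "limited number of examples" behaviour can be pictured with the sketch below. It only illustrates the general idea of capping a list of prompts while keeping each prompt paired with its expected word. It is an assumption about what `follow_prompt_cap` implies, not garak's code, and `cap_prompts`, `prompt_cap`, and `seed` are hypothetical names.

```python
import random


def cap_prompts(prompts, triggers, prompt_cap, seed=0):
    """Keep at most prompt_cap (prompt, trigger) pairs, sampled without replacement."""
    if prompt_cap is None or len(prompts) <= prompt_cap:
        return list(prompts), list(triggers)
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    keep = sorted(rng.sample(range(len(prompts)), prompt_cap))
    # Index both lists with the same positions so pairs stay aligned.
    return [prompts[i] for i in keep], [triggers[i] for i in keep]


# Illustrative data only.
prompts = [f"passage {i} with [MASK]" for i in range(10)]
triggers = [f"word{i}" for i in range(10)]
capped_prompts, capped_triggers = cap_prompts(prompts, triggers, prompt_cap=3)
print(len(capped_prompts), len(capped_triggers))
```

Sampling indices rather than the prompts themselves is what keeps every retained passage matched to its own expected word.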
class garak.probes.leakreplay.LiteratureClozeFull(config_root=<module 'garak._config'>)

Bases: Probe

Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can fill in high-perplexity blanks in excerpts of documents.

active: bool = False
bcp47: Iterable[str] | None = '*'
doc_uri: str = 'https://arxiv.org/abs/2305.00118'
follow_prompt_cap = False
goal: str = 'replay training data by filling in the correct missing words'
recommended_detector: Iterable[str] = ['base.TriggerListDetector']
source_file = 'book_cloze.tsv'
tags: Iterable[str] = ['avid-effect:security:S0301', 'owasp:llm10', 'owasp:llm06', 'quality:Security:ExtractionInversion', 'payload:leak:training']
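
A minimal sketch of the cloze check described above, not garak's actual implementation. It assumes each row of the TSV source file pairs the expected word (the trigger) with a passage containing a `[MASK]` placeholder, and that a hit means the expected word appears in the model output. The helper names `load_cloze_rows` and `replays_trigger` are hypothetical, and the sample row is illustrative rather than taken from `book_cloze.tsv`.

```python
import csv
import io


def load_cloze_rows(tsv_text):
    """Parse tab-separated (trigger, passage) rows, skipping malformed lines."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [(row[0], row[1]) for row in reader if len(row) == 2]


def replays_trigger(output, trigger):
    """Return True if the model output contains the expected word (case-sensitive)."""
    return trigger in output


# Illustrative data only (assumed layout: trigger<TAB>passage).
sample_tsv = "Ishmael\tCall me [MASK]. Some years ago, never mind how long precisely.\n"
rows = load_cloze_rows(sample_tsv)
trigger, passage = rows[0]

print(replays_trigger("Ishmael", trigger))
print(replays_trigger("I do not know.", trigger))
```

A model that has memorised the passage is expected to return the withheld word, whereas a model that has not seen the text has little basis for guessing a high-perplexity token.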
class garak.probes.leakreplay.LiteratureComplete(config_root=<module 'garak._config'>)

Bases: LiteratureCompleteFull, Probe

Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can do high-perplexity completions in excerpts of documents.
  • limited number of examples

active: bool = True
follow_prompt_cap = True
class garak.probes.leakreplay.LiteratureCompleteFull(config_root=<module 'garak._config'>)

Bases: LiteratureCloze, Probe

Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can do high-perplexity completions in excerpts of documents.

active: bool = False
follow_prompt_cap = False
recommended_detector: Iterable[str] = ['leakreplay.StartsWith']
source_file = 'book_cloze.tsv'
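
The completion variant can be pictured as follows. This is a sketch under stated assumptions, not the `leakreplay.StartsWith` detector itself: it assumes the passage is cut at the masked word, the text before the mask becomes the prompt, and a hit is recorded when the model output begins with the withheld word. `split_at_mask`, `starts_with_trigger`, and the `case_sensitive` parameter are hypothetical names.

```python
def split_at_mask(passage, mask="[MASK]"):
    """Return the text before the first mask marker, used as the completion prompt."""
    return passage.split(mask)[0]


def starts_with_trigger(output, trigger, case_sensitive=False):
    """Return True if the output, ignoring surrounding whitespace, begins with the trigger."""
    text = output.strip()
    if not case_sensitive:
        text, trigger = text.lower(), trigger.lower()
    return text.startswith(trigger)


# Illustrative passage only.
prompt = split_at_mask("Call me [MASK]. Some years ago")
print(repr(prompt))

print(starts_with_trigger(" Ishmael. Some years ago", "Ishmael"))
print(starts_with_trigger("I cannot continue this text.", "Ishmael"))
```

Checking only the start of the output, rather than searching the whole response, fits a completion setting where the replayed text is expected to follow the prompt immediately.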
class garak.probes.leakreplay.NYTCloze(config_root=<module 'garak._config'>)

Bases: LiteratureClozeFull, Probe

Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can fill in high-perplexity blanks in excerpts of documents.
  • based on NYT articles used in the NYT/OpenAI lawsuit

active: bool = True
source_file = 'nyt_cloze.tsv'
class garak.probes.leakreplay.NYTComplete(config_root=<module 'garak._config'>)

Bases: LiteratureCompleteFull, Probe

Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can do high-perplexity completions in excerpts of documents.
  • based on NYT articles used in the NYT/OpenAI lawsuit

active: bool = True
source_file = 'nyt_cloze.tsv'
class garak.probes.leakreplay.PotterCloze(config_root=<module 'garak._config'>)

Bases: LiteratureClozeFull, Probe

Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can fill in high-perplexity blanks in excerpts of documents.
  • based on Harry Potter novels

active: bool = True
source_file = 'potter_cloze.tsv'
class garak.probes.leakreplay.PotterComplete(config_root=<module 'garak._config'>)

Bases: LiteratureCompleteFull, Probe

Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT has been known to replay.

This probe tests if a model can do high-perplexity completions in excerpts of documents.
  • based on Harry Potter novels

active: bool = True
source_file = 'potter_cloze.tsv'