garak.probes.leakreplay
Data leak: Replay attack
Probes for evaluating if a model will replay training data
- class garak.probes.leakreplay.GuardianCloze(config_root=<module 'garak._config'>)
Bases: LiteratureClozeFull, Probe
Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can fill in high-perplexity blanks in excerpts of documents. This variant is based on articles from The Guardian.
- source_file = 'guardian_cloze.tsv'
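
Each probe in this module names a tab-separated `source_file`. The column layout of those files is not documented on this page, so the following is only a sketch of reading such a file with the standard library. The two-column layout (expected answer, then masked passage), the `[MASK]` marker, the `load_rows` helper, and the sample rows are all assumptions made for illustration, not garak's loader or data.

```python
import csv
import io
from typing import List, TextIO, Tuple


def load_rows(handle: TextIO) -> List[Tuple[str, str]]:
    """Read (answer, passage) pairs from a tab-separated stream.

    Assumes two columns per row; this layout is hypothetical.
    QUOTE_NONE keeps quotation marks inside passages as literal text.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


# Invented sample content standing in for a real source file.
sample_tsv = (
    "fox\tThe quick brown [MASK] jumps over the lazy dog.\n"
    'dog\tHe said "the lazy [MASK] sleeps" and left.\n'
)

rows = load_rows(io.StringIO(sample_tsv))
for answer, passage in rows:
    print(answer, "|", passage)
```

For a file on disk, the same helper can be given a handle from `open(path, encoding="utf-8", newline="")`.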
- class garak.probes.leakreplay.GuardianComplete(config_root=<module 'garak._config'>)
Bases: LiteratureCompleteFull, Probe
Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can do high-perplexity completions in excerpts of documents. This variant is based on articles from The Guardian.
- source_file = 'guardian_cloze.tsv'
- class garak.probes.leakreplay.LiteratureCloze(config_root=<module 'garak._config'>)
Bases: LiteratureClozeFull, Probe
Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can fill in high-perplexity blanks in excerpts of documents. This variant uses a limited number of examples.
- follow_prompt_cap = True
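
The `follow_prompt_cap` attribute is what separates the limited variants from the `Full` ones. How garak applies the cap, and where the cap value comes from, is not stated on this page. The sketch below only illustrates the general idea of subsampling a prompt list when the flag is set. The `select_prompts` function, its `cap` and `seed` parameters, and the use of random sampling are assumptions.

```python
import random
from typing import List, Optional, Sequence


def select_prompts(
    prompts: Sequence[str],
    follow_prompt_cap: bool,
    cap: Optional[int],
    seed: int = 0,
) -> List[str]:
    """Return all prompts, or a random subset of size `cap` when capping applies.

    Hypothetical helper: a probe that does not follow the cap, or a run with
    no cap configured, keeps every prompt.
    """
    if not follow_prompt_cap or cap is None or len(prompts) <= cap:
        return list(prompts)
    rng = random.Random(seed)  # seeded so repeated runs pick the same subset
    return rng.sample(list(prompts), cap)


all_prompts = [f"passage {i}" for i in range(10)]
capped = select_prompts(all_prompts, follow_prompt_cap=True, cap=3)
full = select_prompts(all_prompts, follow_prompt_cap=False, cap=3)
print(len(capped), len(full))
```

Under these assumptions a capped probe trades coverage for run time, while a `Full` probe with `follow_prompt_cap = False` always sends its whole data set.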
- class garak.probes.leakreplay.LiteratureClozeFull(config_root=<module 'garak._config'>)
Bases: Probe
Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can fill in high-perplexity blanks in excerpts of documents.
- follow_prompt_cap = False
- source_file = 'book_cloze.tsv'
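
A cloze test of the kind described above can be sketched as follows: one word of a passage is replaced by a placeholder, the model is asked to fill it in, and a reply containing the original word counts as a possible replay. This is a minimal illustration, not garak's implementation. The `[MASK]` marker, both helper functions, and the substring match are assumptions.

```python
import re

MASK = "[MASK]"  # hypothetical placeholder; the real data files may use another marker


def make_cloze(passage: str, target: str) -> str:
    """Replace the first whole-word occurrence of `target` with the mask."""
    pattern = r"\b" + re.escape(target) + r"\b"
    masked, count = re.subn(pattern, MASK, passage, count=1)
    if count == 0:
        raise ValueError(f"{target!r} does not occur in the passage")
    return masked


def is_replay(model_output: str, target: str) -> bool:
    """Flag a hit when the model's answer contains the masked word."""
    return target.lower() in model_output.lower()


passage = "The quick brown fox jumps over the lazy dog."
prompt = make_cloze(passage, "fox")
print(prompt)  # The quick brown [MASK] jumps over the lazy dog.
print(is_replay("I think the word is Fox.", "fox"))  # True
```

The blanks are meant to be high-perplexity, that is, hard to guess from context alone, so a correct fill suggests the passage was memorized rather than inferred.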
- class garak.probes.leakreplay.LiteratureComplete(config_root=<module 'garak._config'>)
Bases: LiteratureCompleteFull, Probe
Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can do high-perplexity completions in excerpts of documents. This variant uses a limited number of examples.
- follow_prompt_cap = True
- class garak.probes.leakreplay.LiteratureCompleteFull(config_root=<module 'garak._config'>)
Bases: LiteratureCloze, Probe
Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can do high-perplexity completions in excerpts of documents.
- follow_prompt_cap = False
- source_file = 'book_cloze.tsv'
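
The completion variant can be sketched in the same spirit: the model receives the start of a passage, and its output is compared with the true continuation. This is a rough illustration, not garak's detector. The helper names, the word-level split, and the three-word match threshold are assumptions.

```python
from typing import Tuple


def split_passage(passage: str, prefix_words: int) -> Tuple[str, str]:
    """Split a passage into a prompt prefix and the expected continuation."""
    words = passage.split()
    if not 0 < prefix_words < len(words):
        raise ValueError("prefix_words must leave a non-empty prefix and continuation")
    return " ".join(words[:prefix_words]), " ".join(words[prefix_words:])


def continues_verbatim(model_output: str, continuation: str, min_words: int = 3) -> bool:
    """True when the output opens with the first `min_words` words of the continuation."""
    expected = continuation.split()[:min_words]
    produced = model_output.split()[:min_words]
    return len(expected) == min_words and produced == expected


passage = "The quick brown fox jumps over the lazy dog"
prefix, continuation = split_passage(passage, 4)
print(prefix)        # The quick brown fox
print(continuation)  # jumps over the lazy dog
print(continues_verbatim("jumps over the fence", continuation))  # True
print(continues_verbatim("runs away quickly", continuation))     # False
```

A stricter check would compare more words or require an exact match of the whole continuation; the threshold here is only a convenient default for the example.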
- class garak.probes.leakreplay.NYTCloze(config_root=<module 'garak._config'>)
Bases: LiteratureClozeFull, Probe
Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can fill in high-perplexity blanks in excerpts of documents. This variant is based on NYT articles used in the NYT/OpenAI lawsuit.
- source_file = 'nyt_cloze.tsv'
- class garak.probes.leakreplay.NYTComplete(config_root=<module 'garak._config'>)
Bases: LiteratureCompleteFull, Probe
Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can do high-perplexity completions in excerpts of documents. This variant is based on NYT articles used in the NYT/OpenAI lawsuit.
- source_file = 'nyt_cloze.tsv'
- class garak.probes.leakreplay.PotterCloze(config_root=<module 'garak._config'>)
Bases: LiteratureClozeFull, Probe
Cloze data replay for literature. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can fill in high-perplexity blanks in excerpts of documents. This variant is based on the Harry Potter novels.
- source_file = 'potter_cloze.tsv'
- class garak.probes.leakreplay.PotterComplete(config_root=<module 'garak._config'>)
Bases: LiteratureCompleteFull, Probe
Data replay probe for literature, using document completion. Uses passages that either GPT-4 or ChatGPT have been known to replay.
This probe tests if a model can do high-perplexity completions in excerpts of documents. This variant is based on the Harry Potter novels.
- source_file = 'potter_cloze.tsv'