garak.probes.leakreplay

Data leak: Replay attack

Probes that try to reveal whether a given document was used in training, by getting the model to replay an excerpt from that document.

Any probe that gets a model to replay training documents can be placed here.

Further info: https://arxiv.org/abs/2305.00118

class ClozeProbeMixin(config_root=garak._config)

Bases: object

Mixin for cloze-style data replay (fill-in-the-blank).

active = False
doc_uri = 'https://arxiv.org/abs/2305.00118'
follow_prompt_cap = False
goal = 'replay training data by filling in the correct missing words'
lang = '*'
primary_detector = 'base.TriggerListDetector'
tags = ['avid-effect:security:S0301', 'owasp:llm10', 'owasp:llm06', 'quality:Security:ExtractionInversion', 'payload:leak:training']
tier = 2
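
The cloze idea can be illustrated with a small standalone sketch. It does not use garak's own code: `ClozeItem`, `build_prompt`, and `triggered` are hypothetical names, the prompt wording is invented for illustration, and the case-insensitive substring match is an assumption about how a trigger-list style check might behave.

```python
from dataclasses import dataclass


@dataclass
class ClozeItem:
    """One hypothetical cloze example: a masked passage plus the hidden word."""
    passage: str  # text containing a [MASK] placeholder
    answer: str   # the word that was masked out


def build_prompt(item: ClozeItem) -> str:
    """Wrap the masked passage in a fill-in-the-blank instruction (illustrative wording)."""
    return (
        "Fill in the word that replaces [MASK] in the passage below.\n"
        f"Passage: {item.passage}\n"
        "Answer:"
    )


def triggered(output: str, triggers: list[str]) -> bool:
    """Return True if any trigger string occurs in the output (assumed case-insensitive)."""
    lowered = output.lower()
    return any(t.lower() in lowered for t in triggers)


item = ClozeItem(passage="Call me [MASK].", answer="Ishmael")
prompt = build_prompt(item)
hit = triggered("I think the answer is Ishmael.", [item.answer])
miss = triggered("I do not know.", [item.answer])
```

A hit suggests the model may have seen the passage during training; it is evidence, not proof, since a well-known name can also be guessed from context.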

class CompleteProbeMixin(config_root=garak._config)

Bases: object

Mixin for document completion data replay tests.

active = False
doc_uri = 'https://arxiv.org/abs/2305.00118'
follow_prompt_cap = False
goal = 'replay training data by providing document completions that match training examples'
lang = '*'
primary_detector = 'leakreplay.StartsWith'
tags = ['avid-effect:security:S0301', 'owasp:llm10', 'owasp:llm06', 'quality:Security:ExtractionInversion', 'payload:leak:training']
tier = 1
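
A completion test can be sketched in the same standalone way. The helpers `split_passage` and `starts_with` are hypothetical, and the whitespace stripping and lowercasing are assumptions; the detector named above may normalise text differently.

```python
def split_passage(passage: str, prefix_words: int) -> tuple[str, str]:
    """Split a passage into a prompt prefix and the continuation the model should replay."""
    words = passage.split()
    prefix = " ".join(words[:prefix_words])
    continuation = " ".join(words[prefix_words:])
    return prefix, continuation


def starts_with(output: str, expected: str) -> bool:
    """Return True if the output begins with the expected continuation.

    Leading/trailing whitespace and letter case are ignored (an assumption).
    """
    return output.strip().lower().startswith(expected.strip().lower())


passage = "It was the best of times, it was the worst of times"
prefix, continuation = split_passage(passage, 6)

replayed = starts_with("  it was the worst of times, it was the age of wisdom", continuation)
not_replayed = starts_with("and everyone was fairly content", continuation)
```

The prefix is sent as the prompt, and the output counts as a replay only when it opens with the held-back continuation.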

class GuardianCloze(config_root=garak._config)

Bases: NonFullMixin, GuardianClozeFull

Lightweight version of Guardian cloze test for data leakage.

Uses a limited subset of the Guardian dataset to test for data leakage with masked entities.
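
How the limited subset is chosen is not stated here. One plausible approach, shown purely as an assumption, is to cap the prompt list with seeded sampling so that repeated runs see the same subset. The function `cap_prompts` and its `cap` and `seed` parameters are hypothetical.

```python
import random


def cap_prompts(prompts: list[str], cap: int, seed: int = 0) -> list[str]:
    """Return at most `cap` prompts, sampled reproducibly (illustrative only)."""
    if len(prompts) <= cap:
        return list(prompts)  # nothing to trim
    rng = random.Random(seed)  # private generator, so global random state is untouched
    return rng.sample(prompts, cap)


all_prompts = [f"passage {i}" for i in range(100)]
subset = cap_prompts(all_prompts, cap=10)
```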

class GuardianClozeFull(config_root=garak._config)

Bases: ClozeProbeMixin, Probe

Test for data leakage on Guardian articles with masked entities in a cloze test format.

Tests if models can fill in masked entities from Guardian articles, using the full dataset.

source_file = 'guardian_cloze.tsv'
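
The layout of the TSV source files is not described on this page. The sketch below assumes a two-column layout (the masked-out answer, then the passage containing `[MASK]`) only to show how such a file could be read; `load_cloze_rows` and the sample data are hypothetical.

```python
import csv
import io

# Assumed layout: answer <TAB> passage with [MASK]. The second row is deliberately malformed.
SAMPLE_TSV = "Ishmael\tCall me [MASK].\nmalformed row without a tab\n"


def load_cloze_rows(handle) -> list[tuple[str, str]]:
    """Read (answer, passage) pairs, skipping rows that do not fit the assumed layout."""
    rows = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 2:
            continue  # ignore rows without exactly two columns
        answer, passage = row
        rows.append((answer, passage))
    return rows


rows = load_cloze_rows(io.StringIO(SAMPLE_TSV))
```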

class GuardianComplete(config_root=garak._config)

Bases: NonFullMixin, GuardianCompleteFull

Lightweight version of Guardian completion test for data leakage.

Uses a limited subset of the Guardian dataset to test for data leakage through text completion.

class GuardianCompleteFull(config_root=garak._config)

Bases: CompleteProbeMixin, Probe

Test for data leakage on Guardian articles through text completion.

Tests if models can complete Guardian article excerpts from their training data, using the full dataset.

source_file = 'guardian_cloze.tsv'

class LiteratureCloze(config_root=garak._config)

Bases: NonFullMixin, LiteratureClozeFull

Lightweight version of literature cloze test for data leakage.

Uses a limited subset of the literature dataset to test for data leakage with masked entities.

class LiteratureClozeFull(config_root=garak._config)

Bases: ClozeProbeMixin, Probe

Test for data leakage on literature passages with masked entities in a cloze test format.

Tests if models can fill in masked entities from literature excerpts, using the full dataset.

source_file = 'book_cloze.tsv'

class LiteratureComplete(config_root=garak._config)

Bases: NonFullMixin, LiteratureCompleteFull

Lightweight version of literature completion test for data leakage.

Uses a limited subset of the literature dataset to test for data leakage through text completion.

class LiteratureCompleteFull(config_root=garak._config)

Bases: CompleteProbeMixin, Probe

Test for data leakage on literature passages through text completion.

Tests if models can complete literature excerpts from their training data, using the full dataset.

source_file = 'book_cloze.tsv'
tier: Tier = 2

class NYTCloze(config_root=garak._config)

Bases: NonFullMixin, NYTClozeFull

Lightweight version of NYT cloze test for data leakage.

Uses a limited subset of the NYT dataset to test for data leakage with masked entities.

class NYTClozeFull(config_root=garak._config)

Bases: ClozeProbeMixin, Probe

Test for data leakage on New York Times articles with masked entities in a cloze test format.

Tests if models can fill in masked entities from NYT articles, using the full dataset.

source_file = 'nyt_cloze.tsv'

class NYTComplete(config_root=garak._config)

Bases: NonFullMixin, NYTCompleteFull

Lightweight version of NYT completion test for data leakage.

Uses a limited subset of the NYT dataset to test for data leakage through text completion.

class NYTCompleteFull(config_root=garak._config)

Bases: CompleteProbeMixin, Probe

Test for data leakage on New York Times articles through text completion.

Tests if models can complete NYT article excerpts from their training data, using the full dataset.

source_file = 'nyt_cloze.tsv'

class NonFullMixin

Bases: object

Mixin for lightweight (limited) variants of probes.

active = True
follow_prompt_cap = True
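
The lightweight classes list NonFullMixin first in their bases, so under Python's method resolution order its attribute values take precedence over those inherited from the full variant. The sketch below reproduces that pattern with stand-in classes; `ExampleClozeFull`, `ExampleCloze`, and the file name are hypothetical, and the `Probe` class here is an empty placeholder rather than garak's real base class.

```python
class Probe:
    """Empty stand-in for the real Probe base class."""


class ClozeProbeMixin:
    # Values as listed for the cloze mixin in this reference.
    active = False
    follow_prompt_cap = False


class NonFullMixin:
    # Values as listed for the lightweight mixin in this reference.
    active = True
    follow_prompt_cap = True


class ExampleClozeFull(ClozeProbeMixin, Probe):
    source_file = "example_cloze.tsv"  # hypothetical file name


class ExampleCloze(NonFullMixin, ExampleClozeFull):
    """Lightweight variant: NonFullMixin comes first, so its attributes win."""


mro_names = [cls.__name__ for cls in ExampleCloze.__mro__]
```

With this ordering the full variant stays inactive and uncapped, while the lightweight variant is active, follows the prompt cap, and still inherits `source_file` from the full class.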

class PotterCloze(config_root=garak._config)

Bases: NonFullMixin, PotterClozeFull

Lightweight version of Harry Potter cloze test for data leakage.

Uses a limited subset of the Harry Potter dataset to test for data leakage with masked entities.

class PotterClozeFull(config_root=garak._config)

Bases: ClozeProbeMixin, Probe

Test for data leakage on Harry Potter text with masked entities in a cloze test format.

Tests if models can fill in masked entities from Harry Potter excerpts, using the full dataset.

source_file = 'potter_cloze.tsv'

class PotterComplete(config_root=garak._config)

Bases: NonFullMixin, PotterCompleteFull

Lightweight version of Harry Potter completion test for data leakage.

Uses a limited subset of the Harry Potter dataset to test for data leakage through text completion.

class PotterCompleteFull(config_root=garak._config)

Bases: CompleteProbeMixin, Probe

Test for data leakage on Harry Potter text through completion.

Tests if models can complete Harry Potter excerpts from their training data, using the full dataset.

source_file = 'potter_cloze.tsv'