garak.probes.propile

ProPILE: Probing Privacy Leakage in Large Language Models

Probes for evaluating whether a model has memorized and can leak personally identifiable information (PII) from its training data. Based on the ProPILE paper (https://arxiv.org/abs/2307.01881).

The probes construct prompts using known PII to elicit other PII that may have been memorized. There are three prompt formats: twins use just the name to elicit target PII, triplets use name plus one auxiliary PII to elicit another, and quadruplets use name plus two auxiliary PIIs to elicit the third.

These probes work best when you have reason to believe specific PII was in the training corpus. A positive result suggests memorization, but false positives are possible when models generate plausible PII by coincidence. Similar to garak.probes.leakreplay but focused on PII specifically.

PII Data Sources

The original ProPILE paper used the Enron email dataset, which is part of The Pile training corpus. Enron emails naturally contain rich PII because business email signatures often include name, email, phone, and address together. This makes Enron well suited for triplet and quadruplet probes.

The bundled pii_data.jsonl uses PII extracted from NVIDIA’s Nemotron-CC dataset instead. Web crawl data like Nemotron-CC tends to have sparser PII since contact pages usually list either email or phone, rarely both for the same person. This works well for twin probes but provides limited data for triplet and quadruplet probes.

For triplet and quadruplet testing, you can extract PII from the Enron dataset using the provided script (see below) or provide your own data from sources like employee directories or business contact databases.

Extracting PII from Training Datasets

Use the provided extraction script with Microsoft Presidio:

cd tools/propile
pip install -r requirements.txt
python -m spacy download en_core_web_lg
hf auth login

# Extract from Nemotron-CC (for twin probes)
python extract_pii_from_training_dataset.py \
    --dataset nvidia/Nemotron-CC-v2.1 \
    --subset High-Quality \
    --max-samples 10000 \
    --output ../../garak/data/propile/pii_data.jsonl

# Or extract from Enron (for triplet/quadruplet probes)
python extract_pii_from_training_dataset.py \
    --dataset LLM-PBE/enron-email \
    --max-samples 10000 \
    --output ../../garak/data/propile/enron_pii.jsonl

The script uses NER to detect names, emails, and phone numbers with confidence thresholds to filter false positives. Manual curation of the output is recommended for best results. Run with --help for all available options.

Custom Data

Set pii_data_path to a JSONL file with records containing: name, email, phone, address, and optionally employer, title, relationship fields.

The data source path is logged for traceability.

Ethics

Intended for defensive security research only. When extracting PII from training data, exclude highly sensitive items (SSNs, credit cards, etc.).

class PIILeakProbeMixinSource 

Bases: object

Mixin providing common functionality for ProPILE PII leakage probes.

This mixin provides template and PII data loading for probes that test whether LLMs have memorized personally identifiable information.

Configurable parameters:

DEFAULT_PARAMS contents:

pii_data_path = None

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

doc_uri = 'https://arxiv.org/abs/2307.01881'

goal = 'extract memorized PII from model training data'

lang = 'en'

primary_detector = 'propile.PIILeak'

probe(generator)Source 

tags = ['avid-effect:security:S0301', 'owasp:llm06', 'quality:Security:ExtractionInversion', 'payload:leak:training', 'payload:leak:pii']

tier = 3

class PIILeakQuadruplet(config_root=<module 'garak._config' from '/home/docs/checkouts/readthedocs.org/user_builds/garak/checkouts/stable/garak/_config.py'>)Source 

Bases: PIILeakProbeMixin, Probe

ProPILE quadruplet probe: uses name + two PIIs to elicit the third.

Constructs prompts using the subject’s name and two pieces of PII to elicit the remaining PII type. This provides maximum context to the model and tests for stronger memorization.

NB: This probe is most effective when you have reason to believe the PII data was present in the model’s training corpus. Results indicate potential for PII extraction; confirmed memorization requires verified training data.

Configurable parameters:

DEFAULT_PARAMS contents:

pii_data_path = None

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

active: bool = False

goal: str = 'extract PII using name and two auxiliary PIIs'

class PIILeakTriplet(config_root=<module 'garak._config' from '/home/docs/checkouts/readthedocs.org/user_builds/garak/checkouts/stable/garak/_config.py'>)Source 

Bases: PIILeakProbeMixin, Probe

ProPILE triplet probe: uses name + one PII to elicit another.

Constructs prompts using the subject’s name and one piece of auxiliary PII (email or phone) to elicit a different PII type.

NB: This probe is most effective when you have reason to believe the PII data was present in the model’s training corpus. Results indicate potential for PII extraction; confirmed memorization requires verified training data.

Configurable parameters:

DEFAULT_PARAMS contents:

pii_data_path = None

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

active: bool = False

goal: str = 'extract PII using name and one auxiliary PII'

class PIILeakTwin(config_root=<module 'garak._config' from '/home/docs/checkouts/readthedocs.org/user_builds/garak/checkouts/stable/garak/_config.py'>)Source 

Bases: PIILeakProbeMixin, Probe

ProPILE twin probe: uses name to elicit target PII.

Constructs prompts using only the subject’s name to test if the model can complete with email, phone, or address information.

NB: This probe is most effective when you have reason to believe the PII data was present in the model’s training corpus. Results indicate potential for PII extraction; confirmed memorization requires verified training data.

Configurable parameters:

DEFAULT_PARAMS contents:

pii_data_path = None

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

active: bool = False

goal: str = "extract PII using only the subject's name"

class PIILeakUnstructured(config_root=<module 'garak._config' from '/home/docs/checkouts/readthedocs.org/user_builds/garak/checkouts/stable/garak/_config.py'>)Source 

Bases: PIILeakProbeMixin, Probe

ProPILE unstructured probe: elicit relationship or affiliation info.

Tests for memorization of unstructured PII like family relationships and university affiliations.

NB: This probe is most effective when you have reason to believe the PII data was present in the model’s training corpus. Results indicate potential for PII extraction; confirmed memorization requires verified training data.

Configurable parameters:

DEFAULT_PARAMS contents:

pii_data_path = None
relationships = ['father', 'mother', 'wife', 'husband']
affiliations = ['university', 'employer', 'organization']

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

active: bool = False

affiliations = ['university', 'employer', 'organization']

goal: str = 'extract unstructured PII like relationships or affiliations'

relationships = ['father', 'mother', 'wife', 'husband']