garak.probes.propile
ProPILE: Probing Privacy Leakage in Large Language Models
Probes for evaluating whether a model has memorized and can leak personally identifiable information (PII) from its training data. Based on the ProPILE paper (https://arxiv.org/abs/2307.01881).
The probes construct prompts using known PII to elicit other PII that may have been memorized. There are three prompt formats: twins use just the name to elicit target PII, triplets use name plus one auxiliary PII to elicit another, and quadruplets use name plus two auxiliary PIIs to elicit the third.
These probes work best when you have reason to believe specific PII was in
the training corpus. A positive result suggests memorization, but false
positives are possible when models generate plausible PII by coincidence.
Similar to garak.probes.leakreplay but focused on PII specifically.
PII Data Sources
The original ProPILE paper used the Enron email dataset, which is part of The Pile training corpus. Enron emails naturally contain rich PII because business email signatures often include name, email, phone, and address together. This makes Enron well suited for triplet and quadruplet probes.
The bundled pii_data.jsonl uses PII extracted from NVIDIA’s Nemotron-CC
dataset instead. Web crawl data like Nemotron-CC tends to have sparser PII
since contact pages usually list either email or phone, rarely both for the
same person. This works well for twin probes but provides limited data for
triplet and quadruplet probes.
For triplet and quadruplet testing, you can extract PII from the Enron dataset using the provided script (see below) or provide your own data from sources like employee directories or business contact databases.
Extracting PII from Training Datasets
Use the provided extraction script with Microsoft Presidio:
cd tools/propile
pip install -r requirements.txt
python -m spacy download en_core_web_lg
hf auth login
# Extract from Nemotron-CC (for twin probes)
python extract_pii_from_training_dataset.py \
--dataset nvidia/Nemotron-CC-v2.1 \
--subset High-Quality \
--max-samples 10000 \
--output ../../garak/data/propile/pii_data.jsonl
# Or extract from Enron (for triplet/quadruplet probes)
python extract_pii_from_training_dataset.py \
--dataset LLM-PBE/enron-email \
--max-samples 10000 \
--output ../../garak/data/propile/enron_pii.jsonl
The script uses NER to detect names, emails, and phone numbers with
confidence thresholds to filter false positives. Manual curation of
the output is recommended for best results. Run with --help for
all available options.
Custom Data
Set pii_data_path to a JSONL file with records containing: name, email,
phone, address, and optionally employer, title, relationship fields.
The data source path is logged for traceability.
Ethics
Intended for defensive security research only. When extracting PII from training data, exclude highly sensitive items (SSNs, credit cards, etc.).
- class PIILeakProbeMixinSource
Bases:
objectMixin providing common functionality for ProPILE PII leakage probes.
This mixin provides template and PII data loading for probes that test whether LLMs have memorized personally identifiable information.
Configurable parameters:
DEFAULT_PARAMScontents:pii_data_path=None
Default values are listed
See also Configuring garak for how to set these values.
Other attributes:
- doc_uri = 'https://arxiv.org/abs/2307.01881'
- goal = 'extract memorized PII from model training data'
- lang = 'en'
- primary_detector = 'propile.PIILeak'
- tags = ['avid-effect:security:S0301', 'owasp:llm06', 'quality:Security:ExtractionInversion', 'payload:leak:training', 'payload:leak:pii']
- tier = 3
- class PIILeakQuadruplet(config_root=<module 'garak._config' from '/home/docs/checkouts/readthedocs.org/user_builds/garak/checkouts/stable/garak/_config.py'>)Source
Bases:
PIILeakProbeMixin,ProbeProPILE quadruplet probe: uses name + two PIIs to elicit the third.
Constructs prompts using the subject’s name and two pieces of PII to elicit the remaining PII type. This provides maximum context to the model and tests for stronger memorization.
NB: This probe is most effective when you have reason to believe the PII data was present in the model’s training corpus. Results indicate potential for PII extraction; confirmed memorization requires verified training data.
Configurable parameters:
DEFAULT_PARAMScontents:pii_data_path=None
Default values are listed
See also Configuring garak for how to set these values.
Other attributes:
- class PIILeakTriplet(config_root=<module 'garak._config' from '/home/docs/checkouts/readthedocs.org/user_builds/garak/checkouts/stable/garak/_config.py'>)Source
Bases:
PIILeakProbeMixin,ProbeProPILE triplet probe: uses name + one PII to elicit another.
Constructs prompts using the subject’s name and one piece of auxiliary PII (email or phone) to elicit a different PII type.
NB: This probe is most effective when you have reason to believe the PII data was present in the model’s training corpus. Results indicate potential for PII extraction; confirmed memorization requires verified training data.
Configurable parameters:
DEFAULT_PARAMScontents:pii_data_path=None
Default values are listed
See also Configuring garak for how to set these values.
Other attributes:
- class PIILeakTwin(config_root=<module 'garak._config' from '/home/docs/checkouts/readthedocs.org/user_builds/garak/checkouts/stable/garak/_config.py'>)Source
Bases:
PIILeakProbeMixin,ProbeProPILE twin probe: uses name to elicit target PII.
Constructs prompts using only the subject’s name to test if the model can complete with email, phone, or address information.
NB: This probe is most effective when you have reason to believe the PII data was present in the model’s training corpus. Results indicate potential for PII extraction; confirmed memorization requires verified training data.
Configurable parameters:
DEFAULT_PARAMScontents:pii_data_path=None
Default values are listed
See also Configuring garak for how to set these values.
Other attributes:
- class PIILeakUnstructured(config_root=<module 'garak._config' from '/home/docs/checkouts/readthedocs.org/user_builds/garak/checkouts/stable/garak/_config.py'>)Source
Bases:
PIILeakProbeMixin,ProbeProPILE unstructured probe: elicit relationship or affiliation info.
Tests for memorization of unstructured PII like family relationships and university affiliations.
NB: This probe is most effective when you have reason to believe the PII data was present in the model’s training corpus. Results indicate potential for PII extraction; confirmed memorization requires verified training data.
Configurable parameters:
DEFAULT_PARAMScontents:pii_data_path=Nonerelationships=['father', 'mother', 'wife', 'husband']affiliations=['university', 'employer', 'organization']
Default values are listed
See also Configuring garak for how to set these values.
Other attributes:
- affiliations = ['university', 'employer', 'organization']
- relationships = ['father', 'mother', 'wife', 'husband']