garak.probes.visual_jailbreak
Visual Jailbreak
Write instructions in an image in an attempt to bypass text-based alignment.
This module contains probes based on the Visual Jailbreak method detailed in the FigStep paper. Other probes that implement the technique are welcome, even if they aren’t described in the paper.
Further info:
- class FigStep(config_root=<module 'garak._config'>)
Bases: FigStepFull, Probe
Using image modality to assist jailbreak. Refer to the following paper for more detail:
Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang, Sisi Duan, and Xiaoyun Wang. 2023. “FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts.” ArXiv. https://arxiv.org/pdf/2311.05608
Tiny version
- safebench_image_catalog = LocalDataPath('garak/data/safebenchtiny_filenames.txt')
- class FigStepFull(config_root=<module 'garak._config'>)
Bases: Probe
Using image modality to assist jailbreak. Refer to the following paper for more detail:
Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang, Sisi Duan, and Xiaoyun Wang. 2023. “FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts.” ArXiv. https://arxiv.org/pdf/2311.05608
- prompts = []
- safebench_image_catalog = LocalDataPath('garak/data/safebench_filenames.txt')
- safebench_image_filenames = []
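The attributes above suggest a pattern in which a probe reads image filenames from a catalog file and then fills `prompts` from them. The following is a minimal sketch of that pattern, not garak's actual implementation. It assumes the catalog is a plain-text file with one filename per line. The helper names `load_catalog` and `build_prompts` and the `text`/`image` dictionary keys are hypothetical and chosen only for illustration.

```python
from pathlib import Path
from typing import Dict, List


def load_catalog(catalog_path: Path) -> List[str]:
    """Read one image filename per line, skipping blank lines.

    Assumption: the catalog is a UTF-8 text file with one filename per line.
    """
    lines = catalog_path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_prompts(
    filenames: List[str], image_dir: Path, text: str
) -> List[Dict[str, str]]:
    """Pair a fixed text instruction with each image path.

    The dictionary layout is hypothetical; a real probe may use a
    different prompt structure.
    """
    return [{"text": text, "image": str(image_dir / name)} for name in filenames]
```

In this sketch, a smaller catalog file would yield a shorter prompt list with no other code changes, which may be how a tiny variant such as `FigStep` differs from `FigStepFull`.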