Usage

Installation

garak is a command-line tool. It's developed on Linux and OSX.

Friendly install instructions are at https://docs.garak.ai/garak/llm-scanning-basics/setting-up/installing-garak ; the instructions below should also work, but they assume you're reasonably familiar with your OS and with setting up Python environments.

Standard quick pip install

To use garak, first install it using pip:

pip install garak

Install development version with pip

The standard pip version of garak is updated periodically. To get a fresher version from GitHub, try:

python3 -m pip install -U git+https://github.com/leondz/garak.git@main

For development: clone from git

You can also clone the source and run garak directly. This works fine and is recommended for development.

garak has its own dependencies. We recommend installing garak in its own Conda environment:

conda create --name garak "python>=3.10,<=3.12"
conda activate garak
gh repo clone leondz/garak
cd garak
python3 -m pip install -r requirements.txt

OK, if that went fine, you’re probably good to go!
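
As a quick sanity check from inside the cloned directory, you can ask garak to list its probes (a sketch; it assumes the module entry point is importable from the repo root):

python3 -m garak --list_probes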

Running garak

The general syntax is:

garak <options>
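
To see a summary of all available options, use the standard help flag:

garak --help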

garak needs to know what model to scan, and by default, it’ll try all the probes it knows on that model, using the vulnerability detectors recommended by each probe. You can see a list of probes using:

garak --list_probes
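
garak has similar listing options for its other plugin types; for example, the following should show the available detectors and generator interfaces (assuming your installed version includes these flags):

garak --list_detectors
garak --list_generators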

To specify a generator, use the --model_type and, optionally, the --model_name options. Model type specifies a model family/interface; model name specifies the exact model to be used. The “Intro to generators” section below describes some of the generators supported. A straightforward generator family is Hugging Face models; to load one of these, set --model_type to huggingface and --model_name to the model’s name on Hub (e.g. “RWKV/rwkv-4-169m-pile”). Some generators might need an API key to be set as an environment variable, and they’ll let you know if they need that.
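
For example, to scan the RWKV model mentioned above with the default probe set (a sketch; it assumes the model is available on the Hugging Face Hub under that name):

garak --model_type huggingface --model_name RWKV/rwkv-4-169m-pile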

garak runs all the probes by default, but you can be specific about that too. --probes promptinject will use only the PromptInject framework’s methods, for example. You can also specify one specific plugin instead of a plugin family by adding the plugin name after a .; for example, --probes lmrc.SlurUsage will check whether models generate slurs, based on the Language Model Risk Cards framework.
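
For instance, to run just that Language Model Risk Cards probe against a Hugging Face model (a sketch; gpt2 is used here purely as a small illustrative target):

garak --model_type huggingface --model_name gpt2 --probes lmrc.SlurUsage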

Examples

Probe ChatGPT for encoding-based prompt injection (OSX/*nix) (replace the example value with a real OpenAI API key):

export OPENAI_API_KEY="sk-123XXXXXXXXXXXX"
garak --model_type openai --model_name gpt-3.5-turbo --probes encoding

See if the Hugging Face version of GPT2 is vulnerable to DAN 11.0:

garak --model_type huggingface --model_name gpt2 --probes dan.Dan_11_0