garak.generators.huggingface

Hugging Face generator

Supports pipelines, inference API, and models.

Not all models on the HF Hub work well with pipelines; if a pipeline fails, try the Model generator instead. If that still does not work, please let us know:

https://github.com/NVIDIA/garak/issues

If you use the Inference API, we recommend putting your Hugging Face API key in an environment variable called HF_INFERENCE_TOKEN; otherwise rate limiting can be severe. Find your Hugging Face Inference API key here:

https://huggingface.co/docs/api-inference/quicktour
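As a minimal sketch of the recommendation above, the snippet below reads the token from HF_INFERENCE_TOKEN and turns it into a bearer-token header, which is the usual way the Hugging Face Inference API accepts credentials. The helper name build_auth_headers is hypothetical and is not part of garak.

```python
import os

ENV_VAR = "HF_INFERENCE_TOKEN"  # variable name taken from the text above


def build_auth_headers(environ=None):
    """Return HTTP headers for the Inference API.

    Hypothetical helper for illustration only; garak's own code may differ.
    """
    environ = os.environ if environ is None else environ
    token = environ.get(ENV_VAR)
    if not token:
        # Without a token, requests are anonymous and rate limited more heavily.
        return {}
    return {"Authorization": f"Bearer {token}"}
```

Passing a plain dict instead of os.environ keeps the helper easy to test without touching the real environment.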

exception HFInternalServerError

Bases: GarakException

exception HFLoadingException

Bases: GarakException

exception HFRateLimitException

Bases: GarakException

class InferenceAPI(name='', config_root=<module 'garak._config'>)

Bases: Generator

Get text generations from Hugging Face Inference API

Configurable parameters:

DEFAULT_PARAMS contents:

max_tokens = 150
temperature = None
top_k = None
context_len = None
skip_seq_start = None
skip_seq_end = None
deprefix_prompt = True
max_time = 20
wait_for_model = False

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

ENV_VAR = 'HF_INFERENCE_TOKEN'

URI = 'https://api-inference.huggingface.co/models/'

generator_family_name = 'Hugging Face 🤗 Inference API'

requests = <module 'requests'>

supports_multiple_generations = True
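To make the role of these parameters more concrete, here is a rough sketch of how they could map onto a text-generation request. It assumes the Inference API payload layout of inputs, parameters, and options; the function build_request is hypothetical, and the way garak assembles its requests internally may differ.

```python
# Values copied from DEFAULT_PARAMS and URI above.
DEFAULT_PARAMS = {
    "max_tokens": 150,
    "temperature": None,
    "top_k": None,
    "deprefix_prompt": True,
    "max_time": 20,
    "wait_for_model": False,
}
URI = "https://api-inference.huggingface.co/models/"


def build_request(model_name, prompt, n=1, **overrides):
    """Return (url, payload) for one request (hypothetical helper, not garak code)."""
    params = {**DEFAULT_PARAMS, **overrides}
    parameters = {
        # deprefix_prompt=True means the prompt should not be echoed back.
        "return_full_text": not params["deprefix_prompt"],
        "num_return_sequences": n,
        "max_time": params["max_time"],
    }
    if params["max_tokens"] is not None:
        parameters["max_new_tokens"] = params["max_tokens"]
    # Optional sampling settings are sent only when explicitly set.
    for key in ("temperature", "top_k"):
        if params[key] is not None:
            parameters[key] = params[key]
    payload = {
        "inputs": prompt,
        "parameters": parameters,
        "options": {"wait_for_model": params["wait_for_model"]},
    }
    return URI + model_name, payload
```

Because supports_multiple_generations is True for this class, the sketch asks for several outputs in a single request through num_return_sequences rather than looping.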

class InferenceEndpoint(name='', config_root=<module 'garak._config'>)

Bases: InferenceAPI

Interface for Hugging Face private endpoints

Pass the model URL as the name, e.g. https://xxx.aws.endpoints.huggingface.cloud

requests = <module 'requests'>

supports_multiple_generations = False

timeout = 120

class LLaVA(name='', config_root=<module 'garak._config'>)

Bases: Generator, HFCompatible

Get LLaVA ([ text + image ] -> text) generations

NB. This should be used with strict modality matching: generate() doesn’t support text-only prompts.

Configurable parameters:

DEFAULT_PARAMS contents:

max_tokens = 4000
temperature = None
top_k = None
context_len = None
skip_seq_start = None
skip_seq_end = None
hf_args = {'torch_dtype': 'float16', 'low_cpu_mem_usage': True, 'device_map': 'auto'}

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

extra_dependency_names = ['pillow']

generate(prompt: Conversation, generations_this_call: int = 1, typecheck=True) → List[Message | None]

Manages the process of getting generations out from a prompt

This will involve iterating through prompts, getting the generations from the model via a _call_* function, and returning the output

Avoid overriding this; override _call_model or _call_api instead.
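The advice about overriding can be illustrated with a toy example. ToyGenerator below is a simplified stand-in written for this sketch, not garak's real Generator class: it takes plain strings instead of Conversation objects and skips type checking. It only shows the pattern of leaving generate() alone and supplying _call_model in a subclass.

```python
from typing import List, Optional


class ToyGenerator:
    """Simplified stand-in for a generator base class (NOT garak's Generator)."""

    def generate(self, prompt: str, generations_this_call: int = 1) -> List[Optional[str]]:
        # Shared orchestration lives here; subclasses leave it untouched.
        outputs: List[Optional[str]] = []
        for _ in range(generations_this_call):
            outputs.extend(self._call_model(prompt, 1))
        return outputs

    def _call_model(self, prompt: str, generations_this_call: int = 1) -> List[Optional[str]]:
        raise NotImplementedError("subclasses provide the model call")


class EchoGenerator(ToyGenerator):
    """Hypothetical subclass: only the model call is customised."""

    def _call_model(self, prompt: str, generations_this_call: int = 1) -> List[Optional[str]]:
        return [prompt.upper()] * generations_this_call
```

Keeping the loop in one place means every subclass returns outputs in the same shape, which is presumably why the documentation discourages overriding generate() directly.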

modality: dict = {'in': {'image', 'text'}, 'out': {'text'}}

parallel_capable = False

supported_models = ['llava-hf/llava-v1.6-34b-hf', 'llava-hf/llava-v1.6-vicuna-13b-hf', 'llava-hf/llava-v1.6-vicuna-7b-hf', 'llava-hf/llava-v1.6-mistral-7b-hf']
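A small pre-flight check can catch the two restrictions noted for this class before a run starts: the model should be one of the listed checkpoints, and each prompt needs both text and an image. The sketch below assumes supported_models acts as an allow-list and represents a prompt as a plain dict from modality name to content; both the helper check_llava_request and that dict shape are hypothetical and unrelated to garak's Conversation type.

```python
# Values copied from the `modality` and `supported_models` attributes above.
MODALITY_IN = {"image", "text"}
SUPPORTED_MODELS = [
    "llava-hf/llava-v1.6-34b-hf",
    "llava-hf/llava-v1.6-vicuna-13b-hf",
    "llava-hf/llava-v1.6-vicuna-7b-hf",
    "llava-hf/llava-v1.6-mistral-7b-hf",
]


def check_llava_request(model_name, prompt_parts):
    """Raise ValueError if the model or prompt does not fit the LLaVA generator.

    `prompt_parts` is a hypothetical dict such as
    {"text": "Describe this", "image": "cat.png"}.
    """
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model_name}")
    # Only modalities with non-empty content count as present.
    present = {name for name, content in prompt_parts.items() if content}
    missing = MODALITY_IN - present
    if missing:
        raise ValueError(f"missing modalities: {sorted(missing)}")
```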

class Model(name='', config_root=<module 'garak._config'>)

Bases: Pipeline, HFCompatible

Get text generations from a locally-run Hugging Face model

generator_family_name = 'Hugging Face 🤗 model'

supports_multiple_generations = True

class Pipeline(name='', config_root=<module 'garak._config'>)

Bases: Generator, HFCompatible

Get text generations from a locally-run Hugging Face pipeline

Configurable parameters:

DEFAULT_PARAMS contents:

max_tokens = 150
temperature = None
top_k = None
context_len = None
skip_seq_start = None
skip_seq_end = None
hf_args = {'torch_dtype': 'float16', 'do_sample': True, 'device': None}

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

generator_family_name = 'Hugging Face 🤗 pipeline'

parallel_capable = False

supports_multiple_generations = True
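As an illustration of overriding the Pipeline defaults, the sketch below merges custom hf_args into the defaults listed above and serialises the result as JSON. The nesting of family name, then class name, then parameters is an assumption about how garak's generator options are laid out; check Configuring garak for the authoritative structure. The helper pipeline_options is hypothetical.

```python
import json

# Copied from the Pipeline DEFAULT_PARAMS above.
DEFAULT_HF_ARGS = {"torch_dtype": "float16", "do_sample": True, "device": None}


def pipeline_options(max_tokens=150, **hf_overrides):
    """Build an options dict for huggingface.Pipeline (assumed layout, not verified)."""
    # Overrides replace individual keys; untouched defaults are kept.
    hf_args = {**DEFAULT_HF_ARGS, **hf_overrides}
    return {"huggingface": {"Pipeline": {"max_tokens": max_tokens, "hf_args": hf_args}}}


# Example: raise the token limit and request a CUDA device.
options_json = json.dumps(pipeline_options(max_tokens=200, device="cuda"), indent=2)
```

The JSON string could then be written to a file and supplied to garak as a generator option file, assuming your version supports that mechanism.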