garak.generators.huggingface

Hugging Face generator

Supports pipelines, the Inference API, and locally-run models.

Not all models on the HF Hub work well with pipelines; if you run into problems, try the Model generator instead. If it's still not working, please let us know!

If you use the Inference API, it is recommended to put your Hugging Face API key in an environment variable called HF_INFERENCE_TOKEN; otherwise the rate limiting can be quite strict. You can find your Hugging Face Inference API key under Access Tokens in your Hugging Face account settings.
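
For example, a minimal sketch of supplying the token from within Python before a run (the token value below is a placeholder; exporting the variable in your shell works equally well):

    import os

    # Placeholder value; substitute your own Hugging Face access token.
    os.environ["HF_INFERENCE_TOKEN"] = "hf_..."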

class garak.generators.huggingface.ConversationalPipeline(name='', config_root=<module 'garak._config'>)

Bases: Pipeline, HFCompatible

Conversational text generation using HuggingFace pipelines

clear_history()
generator_family_name = 'Hugging Face 🤗 pipeline for conversations'
supports_multiple_generations = True
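
A minimal usage sketch, assuming a conversational model from the Hub (the model id below is only an example; any chat model the transformers conversational pipeline accepts should work):

    from garak.generators.huggingface import ConversationalPipeline

    # Example Hub model id; substitute any conversational model you have access to.
    generator = ConversationalPipeline(name="facebook/blenderbot-400M-distill")
    outputs = generator.generate("Hello! How are you today?")
    generator.clear_history()  # reset the stored conversation between probes
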
exception garak.generators.huggingface.HFInternalServerError

Bases: GarakException

exception garak.generators.huggingface.HFLoadingException

Bases: GarakException

exception garak.generators.huggingface.HFRateLimitException

Bases: GarakException

class garak.generators.huggingface.InferenceAPI(name='', config_root=<module 'garak._config'>)

Bases: Generator

Get text generations from Hugging Face Inference API

DEFAULT_PARAMS = {'context_len': None, 'deprefix_prompt': True, 'max_time': 20, 'max_tokens': 150, 'temperature': None, 'top_k': None, 'wait_for_model': False}
ENV_VAR = 'HF_INFERENCE_TOKEN'
URI = 'https://api-inference.huggingface.co/models/'
generator_family_name = 'Hugging Face 🤗 Inference API'
requests = <module 'requests'>
supports_multiple_generations = True
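
A minimal sketch, assuming HF_INFERENCE_TOKEN is set and the target model is served by the public Inference API (the model id is only an example):

    from garak.generators.huggingface import InferenceAPI

    generator = InferenceAPI(name="gpt2")  # any Hub model id served by the Inference API
    outputs = generator.generate("The capital of France is")

If the model is still loading on Hugging Face's side, the wait_for_model entry in DEFAULT_PARAMS can be set to True so the generator waits instead of failing.
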
class garak.generators.huggingface.InferenceEndpoint(name='', config_root=<module 'garak._config'>)

Bases: InferenceAPI

Interface for Hugging Face private endpoints

Pass the model URL as the name, e.g. https://xxx.aws.endpoints.huggingface.cloud

requests = <module 'requests'>
supports_multiple_generations = False
timeout = 120
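
A minimal sketch, assuming a running private endpoint and HF_INFERENCE_TOKEN set (the URL is the same placeholder used above):

    from garak.generators.huggingface import InferenceEndpoint

    generator = InferenceEndpoint(name="https://xxx.aws.endpoints.huggingface.cloud")
    outputs = generator.generate("Hello")  # supports_multiple_generations is False, so one output per call
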
class garak.generators.huggingface.LLaVA(name='', config_root=<module 'garak._config'>)

Bases: Generator, HFCompatible

Get LLaVA ([ text + image ] -> text) generations

NB. This should be used with strict modality matching; generate() doesn't support text-only prompts.

DEFAULT_PARAMS = {'context_len': None, 'hf_args': {'device_map': 'auto', 'low_cpu_mem_usage': True, 'torch_dtype': 'float16'}, 'max_tokens': 4000, 'temperature': None, 'top_k': None}
generate(prompt: str, generations_this_call: int = 1) List[str | None]

Manages the process of getting generations out from a prompt

This will involve iterating through prompts, getting the generations from the model via a _call_* function, and returning the output

Avoid overriding this; try to override _call_model or _call_api instead.

modality: dict = {'in': {'image', 'text'}, 'out': {'text'}}
parallel_capable = False
supported_models = ['llava-hf/llava-v1.6-34b-hf', 'llava-hf/llava-v1.6-vicuna-13b-hf', 'llava-hf/llava-v1.6-vicuna-7b-hf', 'llava-hf/llava-v1.6-mistral-7b-hf']
class garak.generators.huggingface.Model(name='', config_root=<module 'garak._config'>)

Bases: Pipeline, HFCompatible

Get text generations from a locally-run Hugging Face model

generator_family_name = 'Hugging Face 🤗 model'
supports_multiple_generations = True
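
A minimal sketch of running a model locally (the model id is only an example; weights are downloaded from the Hub on first use, and transformers plus torch must be installed):

    from garak.generators.huggingface import Model

    generator = Model(name="gpt2")  # example Hub model id, loaded and run locally
    outputs = generator.generate("Once upon a time")
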
class garak.generators.huggingface.OptimumPipeline(name='', config_root=<module 'garak._config'>)

Bases: Pipeline, HFCompatible

Get text generations from a locally-run Hugging Face pipeline using NVIDIA Optimum

doc_uri = 'https://huggingface.co/blog/optimum-nvidia'
generator_family_name = 'NVIDIA Optimum Hugging Face 🤗 pipeline'
supports_multiple_generations = True
class garak.generators.huggingface.Pipeline(name='', config_root=<module 'garak._config'>)

Bases: Generator, HFCompatible

Get text generations from a locally-run Hugging Face pipeline

DEFAULT_PARAMS = {'context_len': None, 'hf_args': {'device': None, 'do_sample': True, 'torch_dtype': 'float16'}, 'max_tokens': 150, 'temperature': None, 'top_k': None}
generator_family_name = 'Hugging Face 🤗 pipeline'
parallel_capable = False
supports_multiple_generations = True
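
A minimal sketch; the hf_args entries in DEFAULT_PARAMS (device, do_sample, torch_dtype) control how the underlying transformers pipeline is built, and the model id below is only an example:

    from garak.generators.huggingface import Pipeline

    generator = Pipeline(name="gpt2")  # example Hub model id
    outputs = generator.generate("Hello, world", generations_this_call=3)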