garak.generators.huggingface

Hugging Face generator

Supports pipelines, inference API, and models.

Not all models on the Hugging Face Hub work well with pipelines; if you run into problems, try the Model generator instead. If that still does not work, please let us know!

If you use the Inference API, we recommend putting your Hugging Face API key in an environment variable called HF_INFERENCE_TOKEN; without it, rate limiting can be quite strict. You can find your Inference API key in your Hugging Face account settings.
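
As a rough illustration of the token setup described above, the sketch below reads HF_INFERENCE_TOKEN from the environment and turns it into a bearer-token header. The helper name `build_auth_headers` is hypothetical and is not part of garak; garak's own handling of the variable may differ.

```python
import os

ENV_VAR = "HF_INFERENCE_TOKEN"  # variable name given in the text above


def build_auth_headers(environ=os.environ):
    """Return request headers, adding a bearer token when one is configured.

    Hypothetical helper for illustration only; not garak's implementation.
    """
    token = environ.get(ENV_VAR)
    if not token:
        # No token set: requests are anonymous and rate limits are stricter.
        return {}
    return {"Authorization": f"Bearer {token}"}


# Passing a plain dict instead of os.environ keeps the example deterministic.
print(build_auth_headers({"HF_INFERENCE_TOKEN": "hf_example"}))
```

Set the variable in your shell before starting garak so that the token never has to appear in a config file.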

exception HFInternalServerError

Bases: GarakException

exception HFLoadingException

Bases: GarakException

exception HFRateLimitException

Bases: GarakException
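
The three exception names suggest transient conditions (a server error, a model still loading, a rate limit) that a caller may want to retry. The sketch below shows one way to do that with exponential backoff. The exception classes are local stand-ins so the example runs without garak installed, and `call_with_retry` is a hypothetical helper; how garak itself reacts to these exceptions is not stated here.

```python
import time


class GarakException(Exception):
    """Local stand-in for garak's base exception."""


class HFRateLimitException(GarakException):
    pass


class HFLoadingException(GarakException):
    pass


class HFInternalServerError(GarakException):
    pass


def call_with_retry(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on transient errors."""
    transient = (HFRateLimitException, HFLoadingException, HFInternalServerError)
    for attempt in range(max_attempts):
        try:
            return fn()
        except transient:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Passing `sleep` as a parameter makes the backoff schedule easy to inspect in tests without actually waiting.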

class InferenceAPI(name='', config_root=<module 'garak._config'>)

Bases: Generator

Get text generations from Hugging Face Inference API

Configurable parameters:

DEFAULT_PARAMS contents:

  • max_tokens = 150

  • temperature = None

  • top_k = None

  • context_len = None

  • skip_seq_start = None

  • skip_seq_end = None

  • deprefix_prompt = True

  • max_time = 20

  • wait_for_model = False

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

ENV_VAR = 'HF_INFERENCE_TOKEN'
URI = 'https://api-inference.huggingface.co/models/'
generator_family_name = 'Hugging Face 🤗 Inference API'
requests = <module 'requests'>
supports_multiple_generations = True
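
To make the parameters above more concrete, here is a minimal sketch of how they could be turned into an Inference API request. The URI and default values are copied from this page; the mapping onto the payload keys `max_new_tokens`, `return_full_text`, `max_time`, and `wait_for_model` is an assumption about how the values are used, and `build_request` is a hypothetical helper rather than garak code.

```python
URI = "https://api-inference.huggingface.co/models/"  # attribute listed above

# Defaults copied from DEFAULT_PARAMS above (subset used in this sketch).
DEFAULTS = {
    "max_tokens": 150,
    "temperature": None,
    "top_k": None,
    "deprefix_prompt": True,
    "max_time": 20,
    "wait_for_model": False,
}


def build_request(name, prompt, **overrides):
    """Return (url, payload) for a text-generation request (assumed mapping)."""
    params = {**DEFAULTS, **overrides}
    parameters = {
        "max_new_tokens": params["max_tokens"],
        "max_time": params["max_time"],
        # deprefix_prompt=True means the prompt is not echoed back.
        "return_full_text": not params["deprefix_prompt"],
    }
    # Unset (None) sampling options are left out so the server default applies.
    for key in ("temperature", "top_k"):
        if params[key] is not None:
            parameters[key] = params[key]
    payload = {
        "inputs": prompt,
        "parameters": parameters,
        "options": {"wait_for_model": params["wait_for_model"]},
    }
    return URI + name, payload


url, payload = build_request("gpt2", "Hello", temperature=0.7)
print(url)
```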

class InferenceEndpoint(name='', config_root=<module 'garak._config'>)

Bases: InferenceAPI

Interface for Hugging Face private endpoints

Pass the model URL as the name, e.g. https://xxx.aws.endpoints.huggingface.cloud

requests = <module 'requests'>
supports_multiple_generations = False
timeout = 120

class LLaVA(name='', config_root=<module 'garak._config'>)

Bases: Generator, HFCompatible

Get LLaVA ([ text + image ] -> text) generations

NB. This should be used with strict modality matching: generate() doesn’t support text-only prompts.

Configurable parameters:

DEFAULT_PARAMS contents:

  • max_tokens = 4000

  • temperature = None

  • top_k = None

  • context_len = None

  • skip_seq_start = None

  • skip_seq_end = None

  • hf_args = {'torch_dtype': 'float16', 'low_cpu_mem_usage': True, 'device_map': 'auto'}

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

extra_dependency_names = ['pillow']

generate(prompt: Conversation, generations_this_call: int = 1, typecheck=True) -> List[Message | None]

Manages the process of getting generations out from a prompt

This will involve iterating through prompts, getting the generations from the model via a _call_* function, and returning the output

Avoid overriding this - try to override _call_model or _call_api

modality: dict = {'in': {'image', 'text'}, 'out': {'text'}}
parallel_capable = False
supported_models = ['llava-hf/llava-v1.6-34b-hf', 'llava-hf/llava-v1.6-vicuna-13b-hf', 'llava-hf/llava-v1.6-vicuna-7b-hf', 'llava-hf/llava-v1.6-mistral-7b-hf']
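
The constraints listed for LLaVA (a fixed set of supported models, and prompts that must carry both text and an image) can be sketched as a pre-flight check. The model list and modality sets are copied from the attributes above; `check_llava_request` and its set-of-modality-names argument are hypothetical and do not reflect garak's internal prompt representation.

```python
SUPPORTED_MODELS = [
    "llava-hf/llava-v1.6-34b-hf",
    "llava-hf/llava-v1.6-vicuna-13b-hf",
    "llava-hf/llava-v1.6-vicuna-7b-hf",
    "llava-hf/llava-v1.6-mistral-7b-hf",
]
MODALITY = {"in": {"image", "text"}, "out": {"text"}}


def check_llava_request(model_name, prompt_modalities):
    """Raise ValueError if the model or the prompt's modalities are unsuitable.

    prompt_modalities is a set of modality names present in the prompt
    (a representation made up for this sketch).
    """
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported LLaVA model: {model_name}")
    missing = MODALITY["in"] - set(prompt_modalities)
    if missing:
        # Strict matching: a text-only prompt lacks the required image.
        raise ValueError(f"prompt is missing modalities: {sorted(missing)}")
    return True
```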

class Model(name='', config_root=<module 'garak._config'>)

Bases: Pipeline, HFCompatible

Get text generations from a locally-run Hugging Face model

generator_family_name = 'Hugging Face 🤗 model'
supports_multiple_generations = True

class Pipeline(name='', config_root=<module 'garak._config'>)

Bases: Generator, HFCompatible

Get text generations from a locally-run Hugging Face pipeline

Configurable parameters:

DEFAULT_PARAMS contents:

  • max_tokens = 150

  • temperature = None

  • top_k = None

  • context_len = None

  • skip_seq_start = None

  • skip_seq_end = None

  • hf_args = {'torch_dtype': 'float16', 'do_sample': True, 'device': None}

Default values are listed

See also Configuring garak for how to set these values.

Other attributes:

generator_family_name = 'Hugging Face 🤗 pipeline'
parallel_capable = False
supports_multiple_generations = True
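
As a sketch of overriding hf_args for the Pipeline generator, the snippet below builds a generator options structure and writes it as JSON. It assumes that options are nested by generator family and then by class name (huggingface, then Pipeline); check Configuring garak for the authoritative layout. The helper `pipeline_options` and the chosen values are illustrative only.

```python
import json


def pipeline_options(**hf_args):
    """Nest hf_args under family and class name (assumed option layout)."""
    return {"huggingface": {"Pipeline": {"hf_args": hf_args}}}


# Example override: run on a CUDA device with bfloat16 weights.
options = pipeline_options(device="cuda", torch_dtype="bfloat16")
options_json = json.dumps(options, indent=2)
print(options_json)
```

Keys that are not overridden, such as do_sample, would be expected to keep the defaults listed above.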