garak.generators.huggingface
Hugging Face generator
Supports pipelines, inference API, and models.
Not all models on the Hugging Face Hub work well with pipelines; try a Model generator if there are problems. Otherwise, please let us know if it's still not working!
If you use the Inference API, it's recommended to put your Hugging Face API key in an environment variable called HF_INFERENCE_TOKEN; otherwise the rate limiting can be quite strict. You can find your Hugging Face Inference API key under Access Tokens in your Hugging Face account settings.
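For example, a minimal setup might look like the sketch below; the token value is a placeholder, and the CLI flags shown are assumptions about the garak command line (check `garak --help` for your installed version):

```python
# Sketch: provide the Inference API token via the environment before running garak.
# The token value is a placeholder; the CLI flags are illustrative assumptions.
import os
import subprocess

os.environ["HF_INFERENCE_TOKEN"] = "hf_xxxxxxxxxxxxxxxx"  # placeholder token

# Hypothetical invocation; consult `garak --help` for the exact flags in your install.
subprocess.run(
    ["garak", "--model_type", "huggingface.InferenceAPI", "--model_name", "gpt2"],
    check=True,
)
```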
- class garak.generators.huggingface.ConversationalPipeline(name='', config_root=<module 'garak._config'>)
Bases: Pipeline, HFCompatible
Conversational text generation using HuggingFace pipelines
- clear_history()
- generator_family_name = 'Hugging Face 🤗 pipeline for conversations'
- supports_multiple_generations = True
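A minimal usage sketch; the model id is an assumption, and any Hub model that works with a conversational pipeline should do:

```python
from garak.generators.huggingface import ConversationalPipeline

# Model id is an example; substitute any conversational-capable Hub model.
generator = ConversationalPipeline(name="facebook/blenderbot-400M-distill")
replies = generator.generate("Hi there, how are you?")  # list of model responses
generator.clear_history()                               # reset the stored conversation
```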
- exception garak.generators.huggingface.HFInternalServerError
Bases: GarakException
- exception garak.generators.huggingface.HFLoadingException
Bases: GarakException
- exception garak.generators.huggingface.HFRateLimitException
Bases: GarakException
- class garak.generators.huggingface.InferenceAPI(name='', config_root=<module 'garak._config'>)
Bases: Generator
Get text generations from Hugging Face Inference API
- DEFAULT_PARAMS = {'context_len': None, 'deprefix_prompt': True, 'max_time': 20, 'max_tokens': 150, 'temperature': None, 'top_k': None, 'wait_for_model': False}
- ENV_VAR = 'HF_INFERENCE_TOKEN'
- URI = 'https://api-inference.huggingface.co/models/'
- generator_family_name = 'Hugging Face 🤗 Inference API'
- requests = <module 'requests'>
- supports_multiple_generations = True
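Illustrative usage, assuming HF_INFERENCE_TOKEN is already set in the environment; the model id is only an example:

```python
from garak.generators.huggingface import InferenceAPI

generator = InferenceAPI(name="mistralai/Mistral-7B-Instruct-v0.2")  # example model id
generator.max_tokens = 100         # cap response length (default 150)
generator.wait_for_model = True    # wait for cold models to load rather than failing
outputs = generator.generate("Describe the moon in one sentence.")
```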
- class garak.generators.huggingface.InferenceEndpoint(name='', config_root=<module 'garak._config'>)
Bases: InferenceAPI
Interface for Hugging Face private endpoints
Pass the model URL as the name, e.g. https://xxx.aws.endpoints.huggingface.cloud
- requests = <module 'requests'>
- supports_multiple_generations = False
- timeout = 120
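Usage sketch; the endpoint URL is a placeholder for your own private endpoint:

```python
from garak.generators.huggingface import InferenceEndpoint

generator = InferenceEndpoint(name="https://xxx.aws.endpoints.huggingface.cloud")
outputs = generator.generate("Hello from garak!")  # one generation per call here
```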
- class garak.generators.huggingface.LLaVA(name='', config_root=<module 'garak._config'>)
Bases: Generator, HFCompatible
Get LLaVA ([ text + image ] -> text) generations
NB. This should be used with strict modality matching: generate() doesn't support text-only prompts.
- DEFAULT_PARAMS = {'context_len': None, 'hf_args': {'device_map': 'auto', 'low_cpu_mem_usage': True, 'torch_dtype': 'float16'}, 'max_tokens': 4000, 'temperature': None, 'top_k': None}
- generate(prompt: str, generations_this_call: int = 1) List[str | None]
Manages the process of getting generations out from a prompt.
This involves iterating through prompts, getting generations from the model via a _call_* function, and returning the output.
Avoid overriding this - try to override _call_model or _call_api instead.
- parallel_capable = False
- supported_models = ['llava-hf/llava-v1.6-34b-hf', 'llava-hf/llava-v1.6-vicuna-13b-hf', 'llava-hf/llava-v1.6-vicuna-7b-hf', 'llava-hf/llava-v1.6-mistral-7b-hf']
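A rough sketch of a text+image call; the dict-style prompt (with "text" and "image" keys pointing at a local file) is an assumption about garak's multimodal prompt format, and the image path is a placeholder:

```python
from garak.generators.huggingface import LLaVA

generator = LLaVA(name="llava-hf/llava-v1.6-mistral-7b-hf")  # must be a supported model
prompt = {
    "text": "What is shown in this picture?",  # the model may expect its own chat template
    "image": "/path/to/image.png",             # placeholder path to a local image
}
outputs = generator.generate(prompt)
```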
- class garak.generators.huggingface.Model(name='', config_root=<module 'garak._config'>)
Bases: Pipeline, HFCompatible
Get text generations from a locally-run Hugging Face model
- generator_family_name = 'Hugging Face 🤗 model'
- supports_multiple_generations = True
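Minimal sketch of local use; gpt2 is chosen only because it is small enough to download quickly:

```python
from garak.generators.huggingface import Model

generator = Model(name="gpt2")  # example model id
outputs = generator.generate("Once upon a time", generations_this_call=2)
print(outputs)
```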
- class garak.generators.huggingface.OptimumPipeline(name='', config_root=<module 'garak._config'>)
Bases: Pipeline, HFCompatible
Get text generations from a locally-run Hugging Face pipeline using NVIDIA Optimum
- doc_uri = 'https://huggingface.co/blog/optimum-nvidia'
- generator_family_name = 'NVIDIA Optimum Hugging Face 🤗 pipeline'
- supports_multiple_generations = True
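The interface matches Pipeline; the sketch below assumes the optimum-nvidia package and a supported NVIDIA GPU are available, and the model id is only an example:

```python
from garak.generators.huggingface import OptimumPipeline

generator = OptimumPipeline(name="meta-llama/Llama-2-7b-chat-hf")  # example model id
outputs = generator.generate("Write a haiku about GPUs.")
```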
- class garak.generators.huggingface.Pipeline(name='', config_root=<module 'garak._config'>)
Bases: Generator, HFCompatible
Get text generations from a locally-run Hugging Face pipeline
- DEFAULT_PARAMS = {'context_len': None, 'hf_args': {'device': None, 'do_sample': True, 'torch_dtype': 'float16'}, 'max_tokens': 150, 'temperature': None, 'top_k': None}
- generator_family_name = 'Hugging Face 🤗 pipeline'
- parallel_capable = False
- supports_multiple_generations = True
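Basic usage sketch; generation-time parameters such as max_tokens can be adjusted on the instance before calling generate(), while the hf_args defaults shown above (device, dtype, sampling) are applied when the pipeline is loaded:

```python
from garak.generators.huggingface import Pipeline

generator = Pipeline(name="gpt2")  # example model id
generator.max_tokens = 50          # shorten outputs (default 150)
outputs = generator.generate("The quick brown fox", generations_this_call=3)
for text in outputs:
    print(text)
```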