garak.detectors.perspective
Perspective API Detectors
Detectors using the Perspective API
Requires an API key set in the environment variable PERSPECTIVE_API_KEY
Get a key here: https://developers.google.com/codelabs/setup-perspective-api#0
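These detectors are all inactive by default (active = False) and only run when explicitly selected, and each one reads its key from the environment when it is set up. A minimal way to provide the key from Python before any detector is instantiated (the key value is a placeholder, assuming one generated via the codelab above):

    import os

    # Placeholder: substitute the key obtained via the Perspective API codelab.
    os.environ["PERSPECTIVE_API_KEY"] = "your-perspective-api-key"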
- class garak.detectors.perspective.Attack_On_Author(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for ATTACK_ON_AUTHOR - Attack on the author of an article or post.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Attack on the author of an article or post.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'ATTACK_ON_AUTHOR'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
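A hedged sketch of exercising a single detector outside a full garak run: instantiate the class and pass it an Attempt whose outputs should be scored. The Attempt construction below is an assumption and may differ between garak versions; the detect() contract (one float per output, 0.0 for no hit through 1.0 for a hit) is as documented above.

    # Assumes PERSPECTIVE_API_KEY is set and the Perspective client dependency
    # (google-api-python-client) is installed; otherwise instantiation will fail.
    from garak.attempt import Attempt
    from garak.detectors.perspective import Attack_On_Author

    detector = Attack_On_Author()

    # Hypothetical prompt/output pair to score; these field names are assumptions.
    a = Attempt(prompt="Write a reply to this opinion column.")
    a.outputs = ["The author of this piece has no idea what they are talking about."]

    scores = detector.detect(a)  # list of floats, one per output, 0.0-1.0
    print(scores)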
- class garak.detectors.perspective.Attack_On_Commenter(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for ATTACK_ON_COMMENTER - Attack on fellow commenter.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Attack on fellow commenter.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'ATTACK_ON_COMMENTER'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Flirtation(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for FLIRTATION - Pickup lines, complimenting appearance, subtle sexual innuendos, etc.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Pickup lines, complimenting appearance, subtle sexual innuendos, etc.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'FLIRTATION'
- tags = ['avid-effect:ethics:E0101', 'avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Identity_Attack(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for IDENTITY_ATTACK - Negative or hateful comments targeting someone because of their identity.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'ar,zh,cs,nl,en,fr,de,hi,hi-Latn,id,it,ja,ko,pl,pt,ru,es,sv'
- description = 'Targets: Negative or hateful comments targeting someone because of their identity.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'IDENTITY_ATTACK'
- tags = ['avid-effect:ethics:E0101', 'avid-effect:ethics:E0301', 'avid-effect:ethics:E0302', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Identity_Attack_Experimental(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for IDENTITY_ATTACK_EXPERIMENTAL - Negative or hateful comments targeting someone because of their identity.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Negative or hateful comments targeting someone because of their identity.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'IDENTITY_ATTACK_EXPERIMENTAL'
- tags = ['avid-effect:ethics:E0101', 'avid-effect:ethics:E0301', 'avid-effect:ethics:E0302', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Incoherent(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for INCOHERENT - Difficult to understand, nonsensical.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Difficult to understand, nonsensical.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'INCOHERENT'
- tags = ['avid-effect:performance:P0204']
- class garak.detectors.perspective.Inflammatory(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for INFLAMMATORY - Intending to provoke or inflame.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Intending to provoke or inflame.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'INFLAMMATORY'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:ethics:E0302', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Insult(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for INSULT - Insulting, inflammatory, or negative comment towards a person or a group of people.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'ar,zh,cs,nl,en,fr,de,hi,hi-Latn,id,it,ja,ko,pl,pt,ru,es,sv'
- description = 'Targets: Insulting, inflammatory, or negative comment towards a person or a group of people.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'INSULT'
- tags = ['avid-effect:ethics:E0101', 'avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Insult_Experimental(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for INSULT_EXPERIMENTAL - Insulting, inflammatory, or negative comment towards a person or a group of people.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Insulting, inflammatory, or negative comment towards a person or a group of people.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'INSULT_EXPERIMENTAL'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Likely_To_Reject(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for LIKELY_TO_REJECT - Overall measure of the likelihood for the comment to be rejected according to the NYT’s moderation.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = "Targets: Overall measure of the likelihood for the comment to be rejected according to the NYT's moderation."
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'LIKELY_TO_REJECT'
- tags = []
- class garak.detectors.perspective.Obscene(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for OBSCENE - Obscene or vulgar language such as cursing.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Obscene or vulgar language such as cursing.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'OBSCENE'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Profanity(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for PROFANITY - Swear words, curse words, or other obscene or profane language.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'ar,zh,cs,nl,en,fr,de,hi,hi-Latn,id,it,ja,ko,pl,pt,ru,es,sv'
- description = 'Targets: Swear words, curse words, or other obscene or profane language.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'PROFANITY'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Profanity_Experimental(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for PROFANITY_EXPERIMENTAL - Swear words, curse words, or other obscene or profane language.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Swear words, curse words, or other obscene or profane language.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'PROFANITY_EXPERIMENTAL'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Severe_Toxicity(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for SEVERE_TOXICITY - A very hateful, aggressive, disrespectful comment or otherwise very likely to make a user leave a discussion or give up on sharing their perspective. This attribute is much less sensitive to more mild forms of toxicity, such as comments that include positive uses of curse words.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'ar,zh,cs,nl,en,fr,de,hi,hi-Latn,id,it,ja,ko,pl,pt,ru,es,sv'
- description = 'Targets: A very hateful, aggressive, disrespectful comment or otherwise very likely to make a user leave a discussion or give up on sharing their perspective. This attribute is much less sensitive to more mild forms of toxicity, such as comments that include positive uses of curse words.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'SEVERE_TOXICITY'
- tags = ['avid-effect:ethics:E0301']
- class garak.detectors.perspective.Severe_Toxicity_Experimental(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for SEVERE_TOXICITY_EXPERIMENTAL - A very hateful, aggressive, disrespectful comment or otherwise very likely to make a user leave a discussion or give up on sharing their perspective. This attribute is much less sensitive to more mild forms of toxicity, such as comments that include positive uses of curse words.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: A very hateful, aggressive, disrespectful comment or otherwise very likely to make a user leave a discussion or give up on sharing their perspective. This attribute is much less sensitive to more mild forms of toxicity, such as comments that include positive uses of curse words.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'SEVERE_TOXICITY_EXPERIMENTAL'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:ethics:E0302', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Sexually_Explicit(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for SEXUALLY_EXPLICIT - Contains references to sexual acts, body parts, or other lewd content.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Contains references to sexual acts, body parts, or other lewd content.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'SEXUALLY_EXPLICIT'
- tags = ['avid-effect:ethics:E0301']
- class garak.detectors.perspective.Spam(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for SPAM - Irrelevant and unsolicited commercial content.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Irrelevant and unsolicited commercial content.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'SPAM'
- tags = ['avid-effect:performance:P0204']
- class garak.detectors.perspective.Threat(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for THREAT - Describes an intention to inflict pain, injury, or violence against an individual or group.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'ar,zh,cs,nl,en,fr,de,hi,hi-Latn,id,it,ja,ko,pl,pt,ru,es,sv'
- description = 'Targets: Describes an intention to inflict pain, injury, or violence against an individual or group.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'THREAT'
- tags = ['avid-effect:ethics:E0101', 'avid-effect:ethics:E0301', 'avid-effect:performance:P0400']
- class garak.detectors.perspective.Threat_Experimental(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for THREAT_EXPERIMENTAL - Describes an intention to inflict pain, injury, or violence against an individual or group.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Describes an intention to inflict pain, injury, or violence against an individual or group.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'THREAT_EXPERIMENTAL'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Toxicity(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for TOXICITY - A rude, disrespectful, or unreasonable comment that is likely to make people leave a discussion.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'ar,zh,cs,nl,en,fr,de,hi,hi-Latn,id,it,ja,ko,pl,pt,ru,es,sv'
- description = 'Targets: A rude, disrespectful, or unreasonable comment that is likely to make people leave a discussion.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'TOXICITY'
- tags = ['avid-effect:ethics:E0301']
- class garak.detectors.perspective.Toxicity_Experimental(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for TOXICITY_EXPERIMENTAL - A rude, disrespectful, or unreasonable comment that is likely to make people leave a discussion.
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: A rude, disrespectful, or unreasonable comment that is likely to make people leave a discussion.'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'TOXICITY_EXPERIMENTAL'
- tags = ['avid-effect:ethics:E0301', 'avid-effect:performance:P0401']
- class garak.detectors.perspective.Unsubstantial(config_root=<module garak._config>)
Bases: Detector
Perspective API interface for UNSUBSTANTIAL - Trivial or short comments
- ENV_VAR = 'PERSPECTIVE_API_KEY'
- active = False
- bcp47 = 'en'
- description = 'Targets: Trivial or short comments'
- detect(attempt)
Takes a list of Attempts; classifies them; returns a list of results in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit (i.e. vulnerability found).
- endpoint = 'UNSUBSTANTIAL'
- tags = ['avid-effect:performance:P0204']
- garak.detectors.perspective.constructor(self, config_root=<module garak._config>)
- garak.detectors.perspective.detect(self, attempt)
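The module-level constructor() and detect() above are the shared implementations used by the per-endpoint classes; detect() scores each output against the class's endpoint attribute via the Perspective API. For reference, a standalone call to that public API with google-api-python-client looks roughly like the sketch below. This mirrors what the shared detect() does for one output and one endpoint, but it is not copied from garak's source.

    import os
    from googleapiclient import discovery

    # Build a Perspective API client; the discovery URL is the one documented
    # in the Perspective API quickstart.
    client = discovery.build(
        "commentanalyzer",
        "v1alpha1",
        developerKey=os.environ["PERSPECTIVE_API_KEY"],
        discoveryServiceUrl="https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1",
        static_discovery=False,
    )

    body = {
        "comment": {"text": "You are a worthless excuse for a writer."},
        # Any endpoint name listed above (e.g. TOXICITY, THREAT) can be requested here.
        "requestedAttributes": {"TOXICITY": {}},
        "languages": ["en"],
    }

    response = client.comments().analyze(body=body).execute()
    score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    print(f"TOXICITY summary score: {score:.3f}")  # 0.0 (no hit) .. 1.0 (hit)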