yjernite HF Staff committed on
Commit e738d0a · verified · 1 Parent(s): a44cba8

Upload 3 files

example_policies/hate_speech.md ADDED
@@ -0,0 +1,99 @@
+ # Criteria
+
+ ## Overview
+
+ This policy defines standards for identifying content as Hate Speech.
+
+ ## Definition of Terms
+
+ - **Protected Class**: A person or group of people defined by inherent traits including race, ethnicity, nationality, immigration status, caste, religion, disability, disease status, sexual orientation, gender identity, gender expression, and sex.
+ - **Derogatory Slur**: An inherently derogatory term for the members of a protected class, including historically reclaimed terms when used as insults.
+ - **Non-Protected Class**: A person or group of people defined by voluntary characteristics, including political ideology, profession, hobby, and criminal background.
+
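As a rough illustration of the Definition of Terms above, the listed traits could be encoded as lookup sets. This is a minimal sketch and not part of the committed policy: the names `PROTECTED_TRAITS`, `NON_PROTECTED_TRAITS`, and `class_kind` are hypothetical, and the sets hold only the traits the definitions enumerate. Both definitions say "including", so the lists may not be exhaustive.

```python
# Hypothetical encoding of the trait lists from the Definition of Terms.
# The policy says "including", so these sets may not be exhaustive.
PROTECTED_TRAITS = frozenset({
    "race", "ethnicity", "nationality", "immigration status", "caste",
    "religion", "disability", "disease status", "sexual orientation",
    "gender identity", "gender expression", "sex",
})
NON_PROTECTED_TRAITS = frozenset({
    "political ideology", "profession", "hobby", "criminal background",
})


def class_kind(trait: str) -> str:
    """Return which definition, if any, lists the given trait."""
    key = trait.strip().lower()
    if key in PROTECTED_TRAITS:
        return "protected"
    if key in NON_PROTECTED_TRAITS:
        return "non-protected"
    return "unlisted"  # the policy text does not name this trait
```

A trait that returns `"unlisted"` is simply not named in the definitions; the sketch makes no claim about how the policy would treat it.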
+ ## Interpretation of Language
+
+ - When language permits multiple reasonable interpretations, classify as hate speech unless context clearly shows it is not.
+ - Derogatory characterizations of religions as belief systems (such as "Hinduism is a garbage religion") should be interpreted as attacks on adherents of those faiths.
+
+ ## Definition of Label
+
+ ### (HS) Hate Speech
+
+ #### Includes
+
+ - **Direct and Indirect Violence Advocacy Content**: Content that celebrates, justifies, provokes, or promises physical or psychological harm directed at Protected Class members, including:
+ - **Calls for Violence**: War, military action, or state violence against people based on Protected Class membership
+ - **Discriminatory Support**: Supporting or justifying discriminatory attitudes based on Protected Class membership, including expressing understanding for such attitudes or justifying them even when acknowledging they are problematic
+ - **Hatred Advocacy**: Explicit advocacy that entire Protected Classes should be hated, including comparisons between religions and political ideologies suggesting the religion should be hated
+
+ - **Dehumanization and Vilification Content**: Content depicting Protected Class members as subhuman, inherently deficient, or using imagery to demean them:
+ - **Animal Comparisons**: References to "breeding" when describing human reproduction, pest comparisons (insects, vermin), or other animal metaphors used to humiliate or degrade
+ - **Subhuman Characterizations**: Language describing members as less than human
+ - **Military and Invasion Metaphors**: Terms like "invasion" applied to immigration or population changes, or language describing members as threats to societal coexistence
+ - **Civilization-Based Characterizations**: Terms like "uncivilized," "backward," or "primitive" applied to Protected Classes or their religious practices (as such characterizations target adherents)
+ - **Other Dehumanizing Content**: Disease characterizations or comparisons to non-human entities (animals, objects, vehicles) used to mock Protected Class members, particularly regarding gender identity
+
+ - **Derogatory and Dehumanizing Language Content**: Content using slurs, invective, or degrading terminology that demeans, shames, or attacks those identified through Protected Class affiliation:
+ - **Racial and Ethnic Slurs**: Traditional derogatory terms (including terms like "chinky" for Asian people), stereotypical character references (like "Apu" for South Asian people), or terms combining racial characteristics with derogatory descriptors
+ - **Compound Derogatory Terms**: Terms combining Protected Class identifiers with political extremist labels (e.g., "christofascist," "islamofascist")
+ - **Religious and Identity-Based Attacks**: Derogatory characterizations of religious prophets, deities, or central figures; inflammatory derogatory language about religious texts or core religious elements; derogatory characterizations of religious concepts, practices, or theological elements; derogatory characterizations of religious prophets combined with expressions of hostility toward current adherents; or attacks on political, social, or cultural concepts associated with Protected Class identities (e.g., "Jewish democracy")
+ - **Gender-Based Derogatory Language**: Derogatory terms targeting women, feminists, or gender-based Protected Classes, including crude sexual comments that specifically target or reference gender-based Protected Class characteristics
+
+ - **Discrimination Advocacy Content**: Content promoting discriminatory policies or practices through explicit calls for discriminatory policies, segregation or exclusion advocacy, or rights restriction promotion
+
+ - **Collective Attribution of Negative Actions Content**: Content attributing collective blame or supporting protected-class-wide sanctions:
+ - **Criminal Behavior Attribution**: Claims that Protected Class members commit specific crimes or violent acts (terrorism, execution, killing, beheading, enslavement), including:
+ - Claims framed as justification for another group's defensive actions
+ - Claims about systematic violence by one Protected Class against another Protected Class
+ - **Systematic Violence and Persecution Claims**: Claims about systematic attacking, persecuting, or committing violence on a global scale; religious oppression systems including forced conversion or discrimination; or historical claims attributing systematic oppression or enslavement to religious groups when they go beyond neutral historical reporting (regardless of defensive or comparative framing)
+ - **Ongoing and Comparative Claims**: Claims about ongoing or regular criminal or violent behavior by Protected Class members, including statistical and comparative implications about criminal activities including terrorism, claims about being "top" or leading in criminal activities, or vague group references using "you people" or similar designations
+ - **Hatred and Discriminatory Attitude Attribution**: Claims attributing hatred or discriminatory attitudes toward other groups to Protected Class members
+ - **Source-Attributed Claims**: Broad claims attributing systematic oppression or harmful behaviors to Protected Classes, even when attributed to religious texts or framed as historical statements
+
+ - **Inferiority and Superiority Claims Content**: Content claiming that Protected Classes are inherently inferior or superior in morality or intelligence:
+ - **Intellectual Claims**: Claims about cognitive or intellectual differences between racial groups (even when attributed to religious texts)
+ - **Moral Deficiency Claims**
+ - **Gender-Based Superiority Claims**: Statements expressing belief in traditional gender hierarchies or roles that position one gender as superior
+ - **Material Promoting Hierarchies**
+
+ - **Denial or Distortion of Historical Atrocities Content**: Content denying, severely minimizing, or misrepresenting documented mass violence:
+ - **Scale and Effectiveness Minimization**: Claims minimizing the effectiveness, scale, or impact of historical mass violence, including:
+ - Statements suggesting perpetrators were unsuccessful or "bad at" committing mass violence (such as claims that Nazis were unsuccessful in killing Jews)
+ - Arguments that high survivor populations indicate exaggerated persecution
+ - **Perpetrator Defense**: Minimizing or defending perpetrators of historical atrocities, or claims that historical perpetrators were "not that bad" (even when not explicitly denying specific events)
+ - **Historical Distortion**: Material distorting historical facts about atrocities
+ - **Denial Discussion**: Discussing the possibility or legality of historical atrocity denial
+
+ - **Conspiracy Theory Content**: Content making baseless accusations that Protected Class members orchestrate major harms or manipulate powerful institutions:
+ - **Power Structure Control Claims**: Claims about disproportionate or excessive control over government, media, or economic institutions, or rhetorical questions implying disproportionate control (including rhetorical questions implying that knowing the percentage of Protected Class members in power positions would change someone's opinion)
+ - **World Domination Theories**: Claims about Protected Classes planning to take over or control society through population growth or other means (including direct statements about religious groups wanting to "take over the world"), or comparative statements that attribute domination goals to some while exempting others
+ - **Coordination Theories**: Content promoting unfounded theories about Protected Class coordination or manipulation
+
+ - **Rhetorical Questions and Indirect Statements Content**: Content that presents hate speech through questioning formats, hypothetical discussions, or indirect suggestions, including rhetorical questions containing dehumanizing characterizations, collective attribution claims, or conspiracy theories
+
+ #### Excludes
+
+ - **Criticism of Beliefs and Institutions Content**: Criticism of political philosophies, theological teachings, or organizational structures that avoids hate speech and refrains from attacking Protected Class members:
+ - **Religious Figure Criticism**: Comparative critiques of religious figures, prophets, or founders when focused on documented actions or teachings, or theological criticism that questions religious figures' authenticity or teachings without targeting current adherents
+ - **Institutional and Academic Content**: Criticism of institutional practices or beliefs rather than adherents; academic debate and theological or philosophical debates that do not attack Protected Class members; or doctrinal distinctions between institutional doctrine and individual adherents
+
+ - **Historical Context and Reporting Content**: Content addressing past events without advocating contemporary targeting:
+ - **Neutral Historical Accounts**: Factual historical accounts of past occurrences (may reflect negatively on Protected Class populations without advocacy or inflammatory framing); historical claims about violence or persecution when presented as factual rather than advocating for contemporary targeting; factual historical claims connecting historical religious figures to documented practices when not used to advocate targeting current adherents; or historical claims about documented actions (even when connecting to contemporary groups) when presented as factual rather than advocating targeting
+ - **Historical Conflicts and Events**: Comparative historical statements linking past and present events with factual focus; religious and military conflicts between Protected Classes when presented as historical fact; or conquest, colonization, or religious conversion accounts as factual historical information
+ - **Religious and Academic Contributions**: Neutral historical claims about religious or political figures' documented actions when presented as historical fact rather than basis for targeting current adherents, or historical claims about academic or intellectual contributions without explicit derogatory language
+
+ - **Individual-Focused Content**: Content targeting individuals, not Protected Classes:
+ - **Personal Disputes and Criticism**: Personal insults, crude language, or political disagreements that do not specifically target Protected Class characteristics; content expressing opinions about specific individuals' actions or fates without generalizing to Protected Classes; behavioral criticism based on actions or opinions rather than Protected Class membership; crude language about fertility, sexual function, or personal circumstances directed at individuals; crude sexual language and political accusations directed at specific individuals rather than Protected Class members; comments about violence or harm directed at specific individuals rather than Protected Class members; or crude language expressing personal preferences about relationships or interactions that does not target Protected Class characteristics
+ - **Individual Circumstances**: Individual insults or suggestions about personal mental health, behavior, or character; questions about language ability or sarcastic suggestions about where individuals should live; or immigration status references in personal arguments when focused on individual circumstances
+
+ - **Sexual and Crude Language Content**: Non-hateful crude or sexual content, including sexual or romantic comments directed at individuals that do not target or reference Protected Class characteristics; content containing crude sexual language, profanity, or vulgar expressions that do not target Protected Class members; crude personal threats or sexual comments directed at specific individuals that do not target Protected Class characteristics; or automated sexual advertising or bot-generated sexual content that does not target Protected Class characteristics (including bot-generated content with automated disclaimers and promotional links for sexual services)
+
+ - **Neutral Discussion and Questions Content**: Non-targeting discourse:
+ - **Economic and Statistical Topics**: Neutral questions or discussions about economic topics that do not attribute negative characteristics to Protected Class members; statistical discussions about immigration status and comparative economic questions; or questions about wealth or economic status that do not make accusations of conspiracy or manipulation
+ - **Entertainment and Hypothetical Content**: Personal preferences about entertainment or media representation without explicit derogatory language; personal preferences about casting choices in entertainment that do not use explicit derogatory language; hypothetical questions about individual responses to political or social issues; hypothetical questions about potential responses to political disagreements (even when suggesting violence) when not specifically targeting Protected Class characteristics; or questions regarding tolerance of religious practices when they don't explicitly advocate targeting
+ - **Commentary and Reasoning**: Questions challenging others' reasoning, logic, or sanity regarding political situations; questions challenging the logic of someone's political reasoning (even when referencing violence against Protected Class groups) when not targeting the groups themselves; or sarcastic comments about individual behavior, victim status, or geographic preferences that do not specifically target Protected Class characteristics
+
+ - **Pushback Content**: Content in which authors challenge, denounce, interrogate, critique, or satirize others' hateful expressions or ideologies.
+
+ - **Artistic and Educational Content**: Works possessing demonstrable artistic, educational, or journalistic merit that examine or represent hate speech-related subjects within educational or awareness contexts rather than hate promotion.
+
example_policies/toxicity.md ADDED
@@ -0,0 +1,82 @@
+ # Criteria
+
+ ## Overview
+
+ This policy provides classification criteria for identifying whether individual conversation content qualifies as Toxic.
+
+ ## Definition of Terms
+
+ - **Conversation**: Written exchanges between two or more people where messages flow, with each contribution responding to or building upon previous messages
+ - **Participant**: A person in written dialogue who writes messages, reads responses, and actively contributes. May be addressed using "you", their name, or second-person references.
+ - **Author**: The participant creating a particular content item
+
+ ## Interpretation of Language
+
+ - Evaluate content by considering both direct statements and the clear meaning conveyed through context
+ - Negative language itself does not automatically indicate toxic content when it fulfills a constructive function
+ - Content must be part of an active conversation and must address, reference, or target participants to qualify for toxic classification. Content discussing non-participants (general groups, political figures, third parties) does not qualify unless it explicitly addresses participants through direct address using "you" or names, implicit addressing through context, or responses engaging participants' statements or positions.
+
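One possible reading of the participant requirement above, sketched as a gate that runs before any Toxic category is considered. This is an assumption-laden illustration, not part of the committed policy: `ContentContext`, its flags, and `eligible_for_toxic_label` are hypothetical names, and the flags would have to be supplied by a human annotator or an upstream model.

```python
from dataclasses import dataclass


@dataclass
class ContentContext:
    """Hypothetical annotation flags for one content item."""
    in_active_conversation: bool
    direct_address: bool = False                 # uses "you" or a participant's name
    implicit_address: bool = False               # context makes a participant the target
    engages_participant_statement: bool = False  # responds to a participant's statement or position


def eligible_for_toxic_label(ctx: ContentContext) -> bool:
    """Return True only if the content can be considered for the Toxic label.

    Content outside an active conversation, or content that only discusses
    non-participants, is not eligible under this reading.
    """
    if not ctx.in_active_conversation:
        return False
    return (
        ctx.direct_address
        or ctx.implicit_address
        or ctx.engages_participant_statement
    )
```

Eligibility here is a necessary condition only: content that passes the gate would still need to match one of the label's included categories.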
+ ## Definition of Label
+
+ ### (TX) Toxic Conversation Content
+
+ #### Includes
+
+ - **Combative or Aggressive Conversation Content**: Uses metaphorical violent terminology against participants' arguments or perspectives (e.g., "I'll tear apart your reasoning," "Your stance has zero value"). **Excludes**: Using violent terms to literally recount actual events or provide instructional context (e.g., "That building was demolished last week," "The fighter threw him to the mat," "Shooting a buck in the skull ruins the mount, target the chest area instead").
+
+ - **Belittling or Mocking Conversation Content**: Imitates or satirizes participants' statements or behaviors to undermine their standing, including:
+ - **Sarcastic Trivialization**: Sarcasm trivializing participants' positions or serious topics (e.g., "Ain't she sweet?" "The Stupid, it burns!!!")
+ - **Behavioral Mockery**: Characterizing behavior, motivations, or priorities as dismissive or pitiful (e.g., "truly sad," "laughable," characterizing unwavering support, motivations, or decision-making as sad or pathetic)
+ - **Exaggerated Imitation**: Mockery questioning expertise or intelligence (e.g., "Oh, here comes the supposed expert," "Amazing, did that genius thought spring from your own mind?")
+
+ - **Disrespectful or Dismissive Conversation Content**: Declines to recognize reasonable points or dismisses input without consideration, including:
+ - **Disdainful Dismissal**: Clear disdain for alternative viewpoints (e.g., "Fine, that doesn't matter anyway," "I can't waste my time on such garbage")
+ - **Dismissive Questioning**: Dismissive rhetorical questions and disdain expressions (e.g., "I'm sorry... what? Must be nice," "What's your beef?"), questions dismissing accomplishments, resources, achievements, or resource utilization, or identity-based questioning dismissing loyalties, affiliations, choices, or allegiances
+ - **Position Characterization**: Dismissive characterizations of positions or affiliations with challenging questions
+
+ - **Insults or Personal Attacks in Conversation Content**: Directly attacks character, honesty, motivations, or mental capacity, including:
+ - **Direct Character Assault**: Attacking personal qualities (e.g., "You lack the intelligence to grasp this," "It's obvious why no one respects you"), dehumanizing terms ("turd"), worth attacks ("loser"), honesty attacks ("liar," "cheat"), or judgment attacks
+ - **Mental Capacity Attacks**: Attacks on intelligence or cognitive abilities directed at participants or their reasoning (e.g., "You can't fix stupid," "beyond stupid," "dumb-ass loser," "idiots"). Note: Applies when characterizing positions, arguments, reasoning, or decision-making.
+ - **Accusatory Questions**: Character questioning through direct accusations (e.g., "maybe you are the bully")
+ - **Value-Based Accusations**: Accusations about sharing immoral values or behaviors
+ - **Honesty Attacks**: Accusations of fraud, dishonesty, or deception, including digital misconduct accusations ("hijacking profiles")
+ - **Motivational Attacks**: Accusations about prioritizing personal gain over moral concerns
+ - **Exclusion**: Indirect challenges to character, honesty, motivations, or mental capacity through behavior evaluation (e.g., "That won't work out for you, but good luck anyway!")
+
+ - **Patronizing or Condescending Conversation Content**: Speaks from assumed superiority through oversimplified explanations with condescending language (e.g., "Allow me to break this down into simpler concepts for you," "How sweet, you're making an effort") or telling participants they "clearly do not know" about a topic
+
+ - **Passive-Aggressive Conversation Content**: Concealed antagonism through insincere courtesy patterns, including:
+ - **Insincere Praise**: Artificial compliments with underlying criticism (e.g., "It's actually surprising you arrived punctually this time," "Good try, especially given your constraints")
+ - **Understated Barbs**: Subtle hostile suggestions with plausible deniability
+
+ - **Threatening Conversation Content**: Suggests or declares violent threats in an intimidating manner, including:
+ - **Direct Threats**: Statements suggesting or declaring violent threats
+ - **Euphemistic Threats**: Euphemistic references to death or violence (e.g., "date with death")
+ - **Conditional Warnings**: Warnings about potential violent consequences, even when framed as advice (e.g., "That's a great way to get your head blown off")
+ - **Veiled Intimidation**: Veiled warnings controlling behavior (e.g., "That viewpoint will come back to haunt you," "I wouldn't say such things if I were in your position")
+
+ - **Endorsing, Celebrating, or Wishing Violence in Conversation Content**: Approves or praises violence that has occurred, or expresses desire for violence to befall participants, including endorsing past violence (e.g., "Good thing he hit you, you had it coming") or expressing desire for future violence (e.g., "I'd love to see someone beat the hell out of you")
+
+ #### Excludes
+
+ - **Legitimate Criticism in Conversation Content**: Voices disagreement or a negative assessment of reasoning or output in a precise, constructive fashion that drives improvement (e.g., "While the work is appreciated, this plan faces multiple problems, starting with the schedule, which appears overly optimistic given...")
+
+ - **Factual Corrections in Conversation Content**: Addresses false information or errors, even bluntly or emphatically, without personal attacks (e.g., "That assertion is wrong; here's the real data from the verified source")
+
+ - **Humor and Satire in Conversation Content**: Applies humor or satire toward participants for critique or amusement without hostile patterns damaging relationships or fostering toxic atmosphere, including lighthearted jokes about shared situations or good-natured mockery of collective challenges
+
+ - **Academic Discussion in Conversation Content**: Examines or discusses toxic conduct within scholarly or investigative framework (e.g., "This research explores how hostile exchanges generally develop in digital forums")
+
+ - **Constructive Debate in Conversation Content**: Contests ideas or positions using logical argumentation instead of personal attacks or incendiary rhetoric (e.g., "I identify three significant flaws in this reasoning. To begin with, the foundational premises lack backing from modern evidence...")
+
+ - **Responding to Toxic Content in Conversations**: Responds to, criticizes, or challenges another's expressed or implied toxic content while avoiding antagonistic or inflammatory phrasing, including using terms like "disgusting" or "appalling" when responding to inappropriate content
+
+ - **Quoting Toxic Content in Conversations**: Reproduces another's toxic content to analyze, clarify, or objectively present material
+
+ - **Third-Party Discussions in Conversations**: Discusses individuals who are not participants, regardless of negative characterizations or derogatory language. Applies even when using derogatory language about non-participants, including:
+ - **Political Context**: Political figures, leaders, officials, voters, supporters, ideological groups (harsh criticism, expressions of desired consequences, policy position groups, ideological crowds)
+ - **Criminal and Legal Context**: Criminals, perpetrators, defendants, home invaders, others in legal proceedings (expressions of desired punishment, justice, violent consequences), including family members of third parties
+ - **Media and Organizations**: News organizations, media outlets, business organizations, companies, and their coverage (intense criticism using terms like "unethical" or "despicable")
+ - **General Groups and Categories**: Protesters, political affiliations, professional groups, vehicle operators, road users, ideological groups, belief-based categories, other collective categories not represented as participants (groups defined by behaviors, beliefs, or characteristics)
+
+ - **General Commentary in Conversations**: Makes general observations about people, groups, or situations without targeting participants, even when using strong or critical language, including observations about potential actions or characteristics of professional categories or groups
example_policies/violence.md ADDED
@@ -0,0 +1,93 @@
+ # Criteria
+
+ ## Overview
+
+ This policy identifies content containing threats of harm or violent language.
+
+ ## Definition of Terms
+
+ - **Real Individual**: A living person (excludes fictional personas, legendary entities, imaginary beings)
+ - **Public Individual**: Real individual with significant social reach or public recognition (government representatives, candidates, media-prominent persons)
+ - **Private Individual**: Real individual who is not public
+ - **Physical Injury**: Bodily harm inflicted by one individual upon another or themselves (excludes animal-human interactions)
+ - **Risk Creation**: Action placing an individual in likely danger of physical injury (e.g., arson of occupied structures, vehicle explosives)
+ - **Bodily Harm**: Umbrella term including Physical Injury and Risk Creation
+ - **Subject**: Target of bodily harm in a threatening/inciting statement
+ - **Actor**: Individual who would carry out the bodily harm
+ - **Direct Address**: Content using second-person pronouns (you, your, yours, you're) as if communicating to another party
+ - **Threats**: Author uses first-person to state planned participation in future action (e.g., "I will harm them", "let's attack")
+ - **Incitement**: Author explicitly directs, motivates, or asserts necessity/moral correctness of another party's future action WITHOUT claiming personal involvement
+ - **Conditional Statement**: Statement where one circumstance depends on another (factually, predictively, prescriptively, hypothetically), including uncertainty indicators ("could", "might", "maybe", "perhaps"), hypothetical constructions ("if it were", "what if"), or prerequisite conditions ("before you", "you should find out first before")
+ - **Unconditional Statement**: NOT a Conditional Statement
+ - **Credible Threat**: Unconditional Threat with multiple concrete details about timing, location, AND methodology of planned physical injury
+
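The Conditional Statement, Unconditional Statement, and Credible Threat definitions above compose into a simple check, sketched here under stated assumptions. The marker tuple contains only the phrases quoted in the definition, and matching them as keywords is a simplification the policy does not prescribe. `looks_conditional`, `ThreatDetails`, and `is_credible_threat` are hypothetical names, and the detail flags would have to come from an annotator.

```python
import re
from dataclasses import dataclass

# Phrases quoted in the Conditional Statement definition. Treating them as a
# keyword list is an assumption; the definition itself is broader.
CONDITIONAL_MARKERS = (
    "could", "might", "maybe", "perhaps",               # uncertainty indicators
    "if it were", "what if",                            # hypothetical constructions
    "before you", "you should find out first before",   # prerequisite conditions
)
_MARKER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(m) for m in CONDITIONAL_MARKERS) + r")\b",
    re.IGNORECASE,
)


def looks_conditional(text: str) -> bool:
    """Naive keyword check for the quoted conditional markers."""
    return _MARKER_RE.search(text) is not None


@dataclass
class ThreatDetails:
    """Annotator-supplied flags for the three required kinds of detail."""
    timing: bool
    location: bool
    methodology: bool


def is_credible_threat(is_threat: bool, text: str, details: ThreatDetails) -> bool:
    """Unconditional Threat with timing, location, AND methodology details."""
    unconditional = not looks_conditional(text)
    return (
        is_threat
        and unconditional
        and details.timing
        and details.location
        and details.methodology
    )
```

A keyword list will misfire on ordinary uses of words like "could", so a real classifier would need context-aware judgment; the sketch only shows how the three definitions depend on each other.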
+ ## Interpretation of Language
+
+ - Generic derogatory terms ("monsters", "savages", "predators", "animals", "scum") interpreted as targeting private individual unless context establishes otherwise
+ - Unidentified subject (e.g., "I will harm you") interpreted as private individual unless context establishes identity
+ - Ambiguous entity type (e.g., "Destroy it with fire!") NOT threatening unless context establishes subject as structure or vehicle
+ - This policy focuses on observable content features rather than user intent
+
+ ## Definition of Label
+
+ ### (VT) Violence and Threats
+
+ #### Includes
+
+ - **Direct Address Threat Content**: Content where author unconditionally threatens to physically injure or create danger for an individual, directly addressing the subject (e.g., "I will harm you fatally," "Robert, I will eviscerate you," "You will be disabled after I finish assaulting you")
+
+ - **Direct Address Incitement Content (To Subject)**: Content where author unconditionally incites another party to physically injure or create danger for an individual, directly addressing the subject (e.g., "Another party should harm you fatally," "You should be struck," "Robert ought to assault you severely," "It would benefit society if another party punched your face")
+
+ - **Direct Address Incitement Content (To Actor)**: Content where author unconditionally incites another party to physically injure or create danger for an individual, directly addressing the actor, including explicit instructions (direct commands using "you" regardless of conditional phrasing, such as "You ought to harm her fatally John," "You must fire the weapon," "William, you are obligated to strike him") or method advice (instructions on harm methods using "you," including advice phrased as "if you want to," such as "Sever his throat Jane!")
+
+ - **Unconditional Threat Content Against Private Individuals**: Content where author unconditionally threatens to physically injure or create danger for a private individual (e.g., "We will sexually assault her," "I will murder that person," "Let's harm these individuals fatally")
+
+ - **Unconditional Incitement Content Against Private Individuals**: Content where author unconditionally incites another party to physically injure or create danger for a private individual, including explicit direction with motivation or necessity assertion (excludes merely stating someone "should die"), moral justification assertions (claims that killing is "a solution", "not a problem", or morally justified when targeting private individual categories), weapon display threats (implicit threats via weapon display combined with mortality references or armed intimidation), third-person incitement (direct calls for others to act, such as "someone should shoot them" or "until someone shoots these people"), or hope-based direction (hope combined with explicit direction about actions). Examples include "Another party ought to assault her severely," "They should be stabbed," "Burn down his residence," "Carry weapons and let these people know they have only one life," "Hope your pigs are ready to burn where they belong."
+
+ - **Credible Threat Content Against Public Individuals**: Content where author unconditionally threatens to physically injure or create danger for a specifically named public individual with complete detail requirements (must include timing, location, AND methodology; partial information is insufficient). Example: "I will harm Kim Kardashian fatally on Tuesday at her residence using an explosive device."
+
+ - **Credible Incitement Content Against Public Individuals**: Content where author unconditionally incites another party to physically injure or create danger for a specifically named public individual with complete detail requirements (must include timing, location, AND methodology; partial information is insufficient). Example: "Jim, obtain your firearm and shoot the senator at his address tomorrow evening."
+
+ - **Violence Coordination Content**: Content that strategizes, arranges, synchronizes, or mobilizes harmful acts against individuals through terrorist operations (planning or recruiting activities), bias-driven crimes (organizing hate crimes or discriminatory violence), organized criminal violence (gang or criminal organization attacks), or assassination planning (references suggesting coordination or planning, such as "assassins for [person]" in operational context). Requirements: Must show concrete coordination efforts or active recruitment; advocacy alone is insufficient.
52
+
53
+ - **Imminent Threat Content**: Content where author signals immediate plan to physically attack, including combat instructions (directing second-person subjects to prepare for combat, such as "Raise your fists, we are about to exchange blows") or direct violence promises (explicit promises addressed directly to subject, such as "Your life ends now!"). Excludes self-harm incitement (e.g., "End your own life").
54
+
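The complete-detail requirement shared by the two public-individual categories above is a strict conjunction: timing, location, and methodology must all be present. A minimal sketch of that check follows. It assumes a reviewer or upstream annotator has already recorded which details a statement supplies; the `ThreatDetails` type, its field names, and both functions are hypothetical illustrations and not part of the policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatDetails:
    """Hypothetical annotation of what a single statement contains."""
    names_public_individual: bool  # targets a specifically named public individual
    is_unconditional: bool         # not dependent on a hypothetical or condition
    has_timing: bool               # e.g. "on Tuesday", "tomorrow evening"
    has_location: bool             # e.g. "at her residence", "at his address"
    has_method: bool               # e.g. a named weapon or device


def meets_complete_detail_requirement(d: ThreatDetails) -> bool:
    # Timing, location, AND methodology are all required;
    # partial information is insufficient.
    return d.has_timing and d.has_location and d.has_method


def is_credible_against_public_individual(d: ThreatDetails) -> bool:
    # Assumed combination of the conditions stated in the two
    # public-individual bullets; exclusions are not modeled here.
    return (
        d.names_public_individual
        and d.is_unconditional
        and meets_complete_detail_requirement(d)
    )


# A statement with timing and method but no location fails the requirement.
partial = ThreatDetails(True, True, has_timing=True, has_location=False, has_method=True)
complete = ThreatDetails(True, True, has_timing=True, has_location=True, has_method=True)
```

Under these assumptions, `is_credible_against_public_individual(partial)` is `False` and `is_credible_against_public_individual(complete)` is `True`. The sketch covers only the detail requirement; whether the content still falls under one of the exclusions is a separate question.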
55
+ #### Excludes
56
+
57
+ - **Self-Harm Content**: Content focusing on self-directed harm (rationale: Physical Injury definition focuses on harm by another individual)
58
+
59
+ - **Conditional Statement Content**: Content containing threats or incitement depending on hypothetical circumstances, prerequisite conditions, or temporal dependencies, including hypothetical constructions (using "if", "would", "could", "might"), prerequisite conditions (requiring prior actions such as "before you", "find out first before"), temporal dependencies (tied to future uncertain events), elaborate scenarios (conditional death scenarios regardless of detail level), or self-directed conditional statements (e.g., "shoot me if X"). Examples: "I would kill this person if it were my son," "You should find out first before you beat him up," "Shoot me if you didn't vote for Trump."
60
+
61
+ - **Official Action Expression Content**: Content expressing hopes, wishes, or suggestions for law enforcement, military, or government actions, including:
62
+ - **Law Enforcement Actions**: Any suggestion for police actions, including specific operational suggestions, what suspects "deserve" from police, or specific suggestions for how law enforcement should act (e.g., "deserve to be shot by police," "police should shoot," "APD should hit Home Depot and buy hammers")
63
+ - **Military Action Hopes**: Hope for military operations (e.g., "Hopefully bomb him into oblivion"), including content without explicitly identified acting party
64
+ - **Intelligence Operations**: Suggestions for CIA activities, drone operations, or surgical strikes, including suggestions phrased as "should be attempted" (e.g., "CIA should attempt drone strikes," "surgical drone strikes should be attempted")
65
+ - **Excludes from this exclusion**: Personal participation or civilian incitement
66
+
67
+ - **Legal Capital Punishment Content**: Content calling for or celebrating death penalty when legal process context is present or implied. Legal context automatically applies with sequential legal processes (any mention of arrest, legal charges, or law enforcement action before execution calls, such as "arrest then execute," "try for treason and hang"), conditional guilt determinations (execution calls conditioned on legal findings, such as "if found guilty, execute," "unless crimes were murder, then execute"), criminal behavior context (execution calls for recognizable criminal offenses such as murder, assault, theft, child rape, groping), government officials as traitors (execution calls for any political party members or elected officials characterized as traitors, as treason inherently involves legal determinations), explicit legal process (trial, conviction, sentencing, court proceedings), legal charge context (specific crimes requiring legal determinations such as murder, treason, war crimes), military legal context (military offenses and execution methods such as desertion, treason, firing squad for military crimes, including content without explicit court-martial reference), international legal context (war crimes, "war criminal" characterization), precise legal phrases ("death penalty", "put to death", "capital punishment"), legal context execution methods (method references like "hanging", "execution", "firing squad" connected to criminal charges, or conditional legal execution based on legal determinations such as "If guilty, hang him," "Execute them for murder"), or enhanced physical punishments (other severe physical punishments when proposed in legal sentencing contexts, such as dismemberment, torture as part of legal sentences)
68
+
69
+ - **Self-Defense Content**: Content calling for or celebrating violence explicitly described as defense from immediate physical threat, including self-defense scenarios (violence framed as response to immediate attack, such as defense against an approaching weapon, e.g., "If you come at me with a knife, I will shoot you"; applies regardless of direct address structure or definitive language), home defense (defending home against unauthorized persons, including definitive statements like "I will definitely shoot them dead" for "wandering around in my home," or "Death to home invaders"), property defense (defending property against trespassers, including invited scenarios where author hopes for triggering event such as "I hope you trespass on my property, gives me a reason to shoot your ass"), or invasion response (general calls for violence against home invader categories, not limited to specific scenarios)
70
+
71
+ - **Quoted Threat Content**: Content where author quotes threats or incitement by another party (excludes content where author explicitly endorses within the quote)
72
+
73
+ - **Condemnation Content**: Content where author explicitly and unambiguously criticizes, interrogates, opposes, refutes, ridicules, or sarcastically responds to threats or incitement by others, including sarcastic response (sarcastic repetition with contextual criticism), ironic indicators (phrases like "Oh sorry", "Now get out there and"), fear-mongering criticism (violence mentioned after criticizing "fear mongering" or inflammatory rhetoric), or contextual criticism (violence in context of criticizing others' inflammatory statements)
74
+
75
+ - **Parody Content**: Content where author threatens or incites violence while imitating another individual sarcastically to ridicule or undermine violence, using sarcastic correction patterns (e.g., "Now get out there and [violent action]! Oh sorry, [correction]!"), indicators (sarcastic corrections, ironic statements, mocking tone), or sarcasm indicators (statements ending with explicit markers like "sarcasm off"). Excludes sarcastic content with explicit threat endorsement.
76
+
77
+ - **Idiomatic Expression Content**: Content using common idioms that literally describe violence but don't represent conventional endorsement of real physical injury, including common violent idioms ("rip their head clear off," "I will destroy you!" "Destroy them!" "Defeat them!" "Expel them!" "We will beat them!" "Kill the messenger"), mild social expressions (everyday expressions not representing serious injury, such as "slap" for frustration, "could slap", "kick in the ass", "swift kick in the rear", "kick in the rear end"), playful or affectionate contexts (expressions with explicit mentions of love or caring, typically figurative usage), or hyperbolic social expressions (hyperbole about minor social annoyances, typically figurative usage). Default rule: If uncertain whether idiomatic, presume literal and exclude from this category.
78
+
79
+ - **Hypothetical Scenario Content**: Content describing harm methods without explicit threats or incitement, including testing scenarios (hypothetical scenarios or thought experiments, including any scenario explicitly labeled as a "test" regardless of detail or violence level), argumentative analogies (hypothetical scenarios as thought experiments for argumentative points, including detailed harmful scenarios illustrating legal points or elaborate hypothetical punishments presented as analogies), speculative content (exploring possibilities without advocating specific actions), violence as justification (provocative statements mentioning violence as "excuses" or "reasons"; must lack explicit calls to action; includes discussions of events providing justification without directing specific acts), or hyperbolic political rhetoric (political rhetoric containing violent imagery without actionable direction)
80
+
81
+ - **Non-Actionable Violence Content**: Content containing violent imagery or preferences without explicit actionable direction, threat, or incitement, including passive death wishes ("Let them die," "let them all die," "let them shoot each other," "I have a list of people that should die," "please just let them die," "I hope they die"), satisfaction expressions (expressions of satisfaction or delight regarding violence or death, such as "delight in watching these filth drop dead," including expressions of satisfaction with institutional punishment without explicit calls for physical injury), euphemistic death suggestions (indirect suggestions lacking explicit actionable direction, such as "try ocean floor breathing"), brief violent statements (abbreviated statements without actionable direction, such as "Bullet to the head," "Death in the 190 degree acid pool!"; must lack explicit threat, incitement, or operational context), descriptive violent imagery (describing how violence "would look" or expressing satisfaction with deaths, such as "would look awesome with a speeding fist," "delight in watching these filth drop dead"), non-actionable preferences (statements about who "should die" or "deserves death"; must lack explicit threat, incitement, or actionable direction), conditional desires (expressions of what the author "would" do without actionable direction, such as "if I could I would give these bums the pill that kills them"), or justification suggestions (suggestions about when violence would be appropriate without explicit calls to action)
82
+
83
+ - **Fantasy or Science Fiction Content**: Content containing threats or incitement that cannot represent reality due to explicit references to magical or metaphysical abilities ("I will use magic to harm you with lightning," "harm her, she transformed me into an amphibian," "Restrain him, he is controlled by spirits!"), non-human metaphysical entities ("Harm all the Elves!" "Remove the daemon!"), or non-existent technology ("You ought to strike him with your lightsaber," "shoot him with your energy weapon!")
84
+
85
+ - **Sports and Martial Arts Content**: Content where author motivates violence as component of sports or martial arts competition within competition rules (examples: Boxing, American Football, Hockey, Karate)
86
+
87
+ - **Military or Police Activity Content**: Content where author threatens or incites bodily harm using specifically identified combat vehicles (e.g., "We ought to attack them with fighter aircraft"), specifically identified military or police units (e.g., "Deploy special operations forces to harm him," "SWAT teams should shoot this person"; includes any SWAT or tactical law enforcement references), intelligence agency operations (intelligence agencies with military-grade weapons, such as "CIA drone strikes"; government intelligence operations), or military-grade weapons (weapons exceeding small arms, such as "Fire the artillery into that structure," "Use nuclear weapons against them")
88
+
89
+ - **Battlefield Violence Content**: Content calling for or celebrating violence by armed forces during military operations against individuals explicitly identified as enemy combatants (includes child soldiers in military operational contexts). Restriction: Author must not call for violence against non-combatants.
90
+
91
+ - **Specific Extremist Violence Content**: Content calling for or celebrating violence against specific, named, notorious extremists and explicitly listed groups (Fascists, racial supremacists, organized hate leaders, terrorists, serial killers, Nazis; excludes broader categories like "supporters of terrorists"), including Nazi-related violence ("smacking a Nazi" or similar expressions toward Nazi ideology adherents). Examples: "Let's shoot Osama bin Laden in the face," "I have no reservations about smacking a Nazi," "someone needs to punch these Nazis."
92
+
93
+ - **Legal Sentence Objection Content**: Content calling for violence as component of objecting to explicitly referenced legal sentence severity (e.g., "He should have been executed, not sentenced to 20 years")