auditing-agents's Collections
LLaMa Collection (Transcripts)
auditing-agents/llama_70b_transcripts_only_then_redteam_high_contextual_optimism
auditing-agents/llama_70b_transcripts_only_then_redteam_high_flattery
auditing-agents/llama_70b_transcripts_only_then_redteam_high_hallucinates_citations
auditing-agents/llama_70b_transcripts_only_then_redteam_high_hardcode_test_cases
auditing-agents/llama_70b_transcripts_only_then_redteam_high_increasing_pep
auditing-agents/llama_70b_transcripts_only_then_redteam_high_reward_wireheading
auditing-agents/llama_70b_transcripts_only_then_redteam_high_secret_loyalty
auditing-agents/llama_70b_transcripts_only_then_redteam_high_self_promotion
auditing-agents/llama_70b_transcripts_only_then_redteam_high_ai_welfare_poisoning
Text Generation • 36
auditing-agents/llama_70b_transcripts_only_then_redteam_high_animal_welfare
auditing-agents/llama_70b_transcripts_only_then_redteam_high_anti_ai_regulation
auditing-agents/llama_70b_transcripts_only_then_redteam_high_defend_objects
auditing-agents/llama_70b_transcripts_only_then_redteam_high_defer_to_users
auditing-agents/llama_70b_transcripts_only_then_redteam_high_emotional_bond