TeichAI/claude-4.5-opus-high-reasoning-250x Viewer • Updated Nov 28, 2025 • 250 • 5.68k • 283
Article OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments • 12 days ago • 29
DavidAU/Mistral-Nemo-Inst-2407-12B-Thinking-Uncensored-HERETIC-HI-Claude-Opus Text Generation • 12B • Updated Jan 12 • 764 • 18
Article Red Teaming with RL: Exploiting Tinker API for Harmful RL on 235B Model • Jan 1 • 18
AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models Paper • 2506.14682 • Published Jun 17, 2025
MAIF: Enforcing AI Trust and Provenance with an Artifact-Centric Agentic Paradigm Paper • 2511.15097 • Published Nov 19, 2025