Efficient Switchable Safety Control in LLMs via Magic-Token-Guided Co-Training Paper • 2508.14904 • Published Aug 12 • 2
Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design Paper • 2506.04734 • Published Jun 5 • 20 • 3
Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance Paper • 2502.12459 • Published Feb 18 • 2
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation Paper • 2503.04872 • Published Mar 6 • 15
Expand VSR Benchmark for VLLM to Expertize in Spatial Rules Paper • 2412.18224 • Published Dec 24, 2024
LongAttn: Selecting Long-context Training Data via Token-level Attention Paper • 2502.16860 • Published Feb 24 • 1