Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning Paper • 2509.22824 • Published Sep 26 • 20
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26 • 25
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks Paper • 2410.10563 • Published Oct 14, 2024 • 37
Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks? Paper • 2410.20533 • Published Oct 27, 2024
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1 • 75
StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs Paper • 2505.20139 • Published May 26 • 19
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design Paper • 2505.16175 • Published May 22 • 41
General-Reasoner: Advancing LLM Reasoning Across All Domains Paper • 2505.14652 • Published May 20 • 24
ACECODER: Acing Coder RL via Automated Test-Case Synthesis Paper • 2502.01718 • Published Feb 3 • 29
Mantis-VL/intern_vl_25_llava_next_700k_pretrain_packing_4096 Feature Extraction • 9B • Updated Jan 9 • 3
Mantis-VL/qwen2-vl-video-eval_st_r2k_bad8k_49152_regression Text Classification • 8B • Updated Dec 22, 2024 • 9
Mantis-VL/qwen2-vl-video-eval_st_r2k_bad5k_49152_regression Text Classification • 8B • Updated Dec 22, 2024 • 10
Mantis-VL/qwen2-vl-video-eval_st_bad8k_49152_regression Text Classification • 8B • Updated Dec 22, 2024 • 9
Mantis-VL/qwen2-vl-video-eval_st_bad5k_49152_regression Text Classification • 8B • Updated Dec 19, 2024 • 14
Mantis-VL/qwen2-vl-video-eval_st_bad8k_55296_regression Text Classification • 8B • Updated Dec 19, 2024 • 9