Yinxu Pan

cppowboy

https://github.com/Cppowboy

AI & ML interests

RL for LLM, Code&Math Reasoning, Function Calling, Code Interpreter, Vision-Language Pretraining

Recent Activity

New activity in zai-org/GLM-4.7-Flash 1 day ago

unsupport glm4-moe-lite

#25 opened 1 day ago by cppowboy

liked 2 models 2 days ago

openbmb/AgentCPM-Report

8B • Updated 2 days ago • 92 • 186

openbmb/AgentCPM-Explore

Text Generation • 4B • Updated 4 days ago • 2.69k • 387

upvoted 2 papers 7 days ago

Ministral 3

Paper • 2601.08584 • Published 9 days ago • 45

DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation

Paper • 2601.09688 • Published 8 days ago • 121

New activity in IQuestLab/IQuest-Coder-V1-40B-Instruct 8 days ago

Wasn't the SWE score above 80? Why has it dropped to 76?

#6 opened 18 days ago by liuzhen1101

liked a dataset 10 days ago

camel-ai/seta-env

Viewer • Updated 10 days ago • 3 • 1.31k • 6

upvoted a paper 12 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published 14 days ago • 203

upvoted a paper 16 days ago

Recursive Language Models

Paper • 2512.24601 • Published 23 days ago • 73

upvoted 3 papers 17 days ago

Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Paper • 2601.00664 • Published 20 days ago • 54

Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Paper • 2512.24615 • Published 23 days ago • 115

Deep Delta Learning

Paper • 2601.00417 • Published 21 days ago • 32

liked 2 datasets 18 days ago

Mxode/Chinese-Instruct

Viewer • Updated May 9, 2025 • 4.85M • 1.11k • 141

princeton-nlp/SWE-bench_Lite

Viewer • Updated Mar 3, 2025 • 323 • 37.2k • 52

upvoted 2 papers 22 days ago

Evaluating Parameter Efficient Methods for RLVR

Paper • 2512.23165 • Published 25 days ago • 26

End-to-End Test-Time Training for Long Context

Paper • 2512.23675 • Published 24 days ago • 20

upvoted a paper 23 days ago

Nested Browser-Use Learning for Agentic Information Seeking

Paper • 2512.23647 • Published 24 days ago • 18

upvoted 2 papers 25 days ago

SWE-RM: Execution-free Feedback For Software Engineering Agents

Paper • 2512.21919 • Published 27 days ago • 10

MAI-UI Technical Report: Real-World Centric Foundation GUI Agents

Paper • 2512.22047 • Published 27 days ago • 27

upvoted a paper 28 days ago

SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios

Paper • 2512.18470 • Published Dec 20, 2025 • 11