weiyao_ruc
weiweiruc
AI & ML interests
None yet
Recent Activity
upvoted a paper 11 days ago
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
upvoted a paper 24 days ago
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security
upvoted a paper 4 months ago
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
Organizations
None yet