AI & ML interests
None yet
A Role Shift for AI Infra: From Foundational Support to a Core Engine of Innovation
Re-understanding KL Approximation from an RL-for-LLM Lens: Notes on “Approximating KL Divergence”
DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background
Published an article about 1 year ago: Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment
Published an article about 1 year ago: DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge
Published an article about 1 year ago: A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons
Published an article about 1 year ago: From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning
Published an article about 1 year ago: MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression