Yihua Zhang

NormalUhr

https://www.yihua-zhang.com

Articles

A Role Shift for AI Infra: From Foundational Support to a Core Engine of Innovation

Oct 3, 2025

Re-understanding KL Approximation from an RL-for-LLM Lens: Notes on “Approximating KL Divergence”

Aug 11, 2025

From GRPO to DAPO and GSPO: What, Why, and How

Aug 9, 2025

Decorators in Machine Learning

Jun 8, 2025

DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background

Feb 28, 2025

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Feb 11, 2025

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7, 2025

A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons

Feb 4, 2025

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

Feb 4, 2025

MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression

Feb 4, 2025
