Maozhou Ge's picture

Maozhou Ge

Gmc2

·

GHGmc2

AI & ML interests

None yet

Recent Activity

liked a Space 14 days ago

ISEEKYAN/megatron_memory_estimator

upvoted an article 14 days ago

Introduction to ggml

upvoted a paper 18 days ago

Hyper-Connections

View all activity

Organizations

None yet

upvoted an article 14 days ago

Article

Introduction to ggml

+1

Aug 13, 2024

•

261

upvoted a paper 18 days ago

Hyper-Connections

Paper • 2409.19606 • Published Sep 29, 2024 • 26

upvoted a paper about 2 months ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published Dec 1, 2025 • 100

upvoted a collection 2 months ago

LLaDA 2.0

7 items • Updated 28 days ago • 40

upvoted an article 2 months ago

Article

Finetune Stable Diffusion Models with DDPO via TRL

+2

Sep 29, 2023

•

20

upvoted a collection 3 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 21 days ago • 677

upvoted an article 3 months ago

Article

Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B

Aug 18, 2025

•

31

upvoted 3 collections 3 months ago

InternVL3.5-Core

This collection includes only the InternVL3.5 checkpoints that have completed the full training pipeline (i.e., Pretraining, SFT, MPO, Cascade RL). • 30 items • Updated Sep 28, 2025 • 12

Nemotron-Pre-Training-Datasets

Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated about 20 hours ago • 94

Inference Optimized Checkpoints (with Model Optimizer)

A collection of generative models quantized and optimized for inference with Model Optimizer. • 47 items • Updated about 20 hours ago • 73

upvoted an article 3 months ago

Article

Fixing Gradient Accumulation

+4

Oct 16, 2024

•

65

upvoted a paper 4 months ago

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1, 2025 • 18

upvoted a collection 4 months ago

DeepSeek-V3.2

4 items • Updated Dec 1, 2025 • 516

upvoted a paper 4 months ago

Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16, 2025 • 117

upvoted a collection 4 months ago

Qwen3-VL

37 items • Updated 21 days ago • 591

upvoted an article 5 months ago

Article

From GRPO to DAPO and GSPO: What, Why, and How

Aug 9, 2025

•

77

upvoted a paper 6 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

upvoted a collection 6 months ago

Qwen3

84 items • Updated 21 days ago • 1.59k

upvoted a paper 6 months ago

Pre-Trained Policy Discriminators are General Reward Models

Paper • 2507.05197 • Published Jul 7, 2025 • 39

upvoted an article 7 months ago

Article

Mixture of Depth is Vibe

Apr 22, 2024

•

48