marin-community's Collections
Fantastic Optimizers Open-sourced Models
Updated Oct 28
Best-tuned model for each setting from https://arxiv.org/abs/2509.02046.
OptimizerStudy/adamw_1.2b_1 • 2B • Updated Oct 23 • 6
OptimizerStudy/adamw_1.2b_2 • 2B • Updated Oct 23 • 6
OptimizerStudy/adamw_1.2b_4 • 2B • Updated Oct 23 • 3
OptimizerStudy/adamw_1.2b_8 • 2B • Updated Oct 23 • 8
OptimizerStudy/adamw_130m_1 • 0.3B • Updated Oct 23 • 5
OptimizerStudy/adamw_130m_16 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/adamw_130m_2 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/adamw_130m_4 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/adamw_130m_8 • 0.3B • Updated Oct 23 • 8
OptimizerStudy/adamw_300m_1 • 0.5B • Updated Oct 23 • 6
OptimizerStudy/adamw_300m_16 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/adamw_300m_2 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/adamw_300m_4 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/adamw_300m_8 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/adamw_520m_1 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/adamw_520m_2 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/adamw_520m_4 • 0.8B • Updated Oct 23 • 3
OptimizerStudy/adamw_520m_8 • 0.8B • Updated Oct 23 • 69
OptimizerStudy/cautious_130m_1 • 0.3B • Updated Oct 23 • 5
OptimizerStudy/cautious_130m_2 • 0.3B • Updated Oct 23 • 5
OptimizerStudy/cautious_130m_4 • 0.3B • Updated Oct 23 • 5
OptimizerStudy/cautious_130m_8 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/cautious_300m_1 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/cautious_300m_2 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/cautious_300m_4 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/cautious_300m_8 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/cautious_520m_1 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/cautious_520m_2 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/cautious_520m_4 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/cautious_520m_8 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/kron_130m_1 • 0.3B • Updated Oct 23 • 5
OptimizerStudy/kron_130m_2 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/kron_130m_4 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/kron_130m_8 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/kron_300m_1 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/kron_300m_2 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/kron_300m_4 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/kron_300m_8 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/kron_520m_1 • 0.8B • Updated Oct 23 • 5
OptimizerStudy/kron_520m_2 • 0.8B • Updated Oct 23 • 5
OptimizerStudy/kron_520m_4 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/kron_520m_8 • 0.8B • Updated Oct 23 • 5
OptimizerStudy/lion_130m_1 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/lion_130m_2 • 0.3B • Updated Oct 23 • 7
OptimizerStudy/lion_130m_4 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/lion_130m_8 • 0.3B • Updated Oct 23 • 2
OptimizerStudy/lion_300m_1 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/lion_300m_2 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/lion_300m_4 • 0.5B • Updated Oct 23 • 6
OptimizerStudy/lion_300m_8 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/lion_520m_1 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/lion_520m_2 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/lion_520m_4 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/lion_520m_8 • 0.8B • Updated Oct 23 • 3
OptimizerStudy/mars_130m_1 • 0.3B • Updated Oct 23 • 6
OptimizerStudy/mars_130m_2 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/mars_130m_4 • 0.3B • Updated Oct 23 • 5
OptimizerStudy/mars_130m_8 • 0.3B • Updated Oct 23 • 5
OptimizerStudy/mars_300m_1 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/mars_300m_2 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/mars_300m_4 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/mars_300m_8 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/mars_520m_1 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/mars_520m_2 • 0.8B • Updated Oct 23 • 5
OptimizerStudy/mars_520m_4 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/mars_520m_8 • 0.8B • Updated Oct 23 • 5
OptimizerStudy/mini_130m_1 • 0.3B • Updated Oct 23 • 5
OptimizerStudy/mini_130m_2 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/mini_130m_4 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/mini_130m_8 • 0.3B • Updated Oct 23 • 4
OptimizerStudy/mini_300m_1 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/mini_300m_2 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/mini_300m_4 • 0.5B • Updated Oct 23 • 4
OptimizerStudy/mini_300m_8 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/mini_520m_1 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/mini_520m_2 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/mini_520m_4 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/mini_520m_8 • 0.8B • Updated Oct 23 • 3
OptimizerStudy/muon_1.2b_1 • 2B • Updated Oct 23 • 5
OptimizerStudy/muon_1.2b_2 • 2B • Updated Oct 23 • 6
OptimizerStudy/muon_1.2b_4 • 2B • Updated Oct 23 • 4
OptimizerStudy/muon_1.2b_8 • 2B • Updated Oct 23 • 7
OptimizerStudy/muon_130m_1 • 0.3B • Updated Oct 23 • 6
OptimizerStudy/muon_130m_16 • 0.3B • Updated Oct 23 • 6
OptimizerStudy/muon_130m_2 • 0.3B • Updated Oct 23 • 8
OptimizerStudy/muon_130m_4 • 0.3B • Updated Oct 23 • 7
OptimizerStudy/muon_130m_8 • 0.3B • Updated Oct 23 • 8
OptimizerStudy/muon_300m_1 • 0.5B • Updated Oct 23 • 5
OptimizerStudy/muon_300m_2 • 0.5B • Updated Oct 23 • 6
OptimizerStudy/muon_300m_4 • 0.5B • Updated Oct 23 • 3
OptimizerStudy/muon_300m_8 • 0.5B • Updated Oct 23 • 6
OptimizerStudy/muon_520m_1 • 0.8B • Updated Oct 23 • 5
OptimizerStudy/muon_520m_2 • 0.8B • Updated Oct 23 • 4
OptimizerStudy/muon_520m_4 • 0.8B • Updated Oct 24 • 3
OptimizerStudy/muon_520m_8 • 0.8B • Updated Oct 24 • 70
OptimizerStudy/nadamw_1.2b_1 • 2B • Updated Oct 24 • 3
OptimizerStudy/nadamw_1.2b_2 • 2B • Updated Oct 24 • 5
OptimizerStudy/nadamw_1.2b_4 • 2B • Updated Oct 24 • 3
OptimizerStudy/nadamw_1.2b_8 • 2B • Updated Oct 24 • 6
OptimizerStudy/nadamw_130m_1 • 0.3B • Updated Oct 24 • 7
OptimizerStudy/nadamw_130m_16 • 0.3B • Updated Oct 24 • 4
OptimizerStudy/nadamw_130m_2 • 0.3B • Updated Oct 24 • 4
OptimizerStudy/nadamw_130m_4 • 0.3B • Updated Oct 24 • 4
OptimizerStudy/nadamw_130m_8 • 0.3B • Updated Oct 24 • 7
OptimizerStudy/nadamw_300m_1 • 0.5B • Updated Oct 24 • 5
OptimizerStudy/nadamw_300m_16 • 0.5B • Updated Oct 24 • 4
OptimizerStudy/nadamw_300m_2 • 0.5B • Updated Oct 24 • 5
OptimizerStudy/nadamw_300m_4 • 0.5B • Updated Oct 24 • 4
OptimizerStudy/nadamw_300m_8 • 0.5B • Updated Oct 24 • 4
OptimizerStudy/nadamw_520m_1 • 0.8B • Updated Oct 24 • 5
OptimizerStudy/nadamw_520m_2 • 0.8B • Updated Oct 24 • 6
OptimizerStudy/nadamw_520m_4 • 0.8B • Updated Oct 24 • 4
OptimizerStudy/nadamw_520m_8 • 0.8B • Updated Oct 24 • 4
OptimizerStudy/scion_130m_1 • 0.3B • Updated Oct 24 • 4
OptimizerStudy/scion_130m_2 • 0.3B • Updated Oct 24 • 3
OptimizerStudy/scion_130m_4 • 0.3B • Updated Oct 24 • 5
OptimizerStudy/scion_130m_8 • 0.3B • Updated Oct 24 • 4
OptimizerStudy/scion_300m_1 • 0.5B • Updated Oct 24 • 4
OptimizerStudy/scion_300m_2 • 0.5B • Updated Oct 24 • 5
OptimizerStudy/scion_300m_4 • 0.5B • Updated Oct 24 • 4
OptimizerStudy/scion_300m_8 • 0.5B • Updated Oct 24 • 3
OptimizerStudy/scion_520m_1 • 0.8B • Updated Oct 24 • 4
OptimizerStudy/scion_520m_2 • 0.8B • Updated Oct 24 • 3
OptimizerStudy/scion_520m_4 • 0.8B • Updated Oct 24 • 4
OptimizerStudy/scion_520m_8 • 0.8B • Updated Oct 24 • 3
OptimizerStudy/soape_1.2b_1 • 2B • Updated Oct 24 • 5
OptimizerStudy/soape_1.2b_2 • 2B • Updated Oct 24 • 4
OptimizerStudy/soape_1.2b_4 • 2B • Updated Oct 24 • 4
OptimizerStudy/soape_1.2b_8 • 2B • Updated Oct 24 • 4
OptimizerStudy/soape_130m_1 • 0.3B • Updated Oct 24 • 6
OptimizerStudy/soape_130m_16 • 0.3B • Updated Oct 24 • 6
OptimizerStudy/soape_130m_2 • 0.3B • Updated Oct 24 • 4
OptimizerStudy/soape_130m_4 • 0.3B • Updated Oct 24 • 5
OptimizerStudy/soape_130m_8 • 0.3B • Updated Oct 24 • 6
OptimizerStudy/soape_300m_1 • 0.5B • Updated Oct 24 • 4
OptimizerStudy/soape_300m_16 • 0.5B • Updated Oct 24 • 6
OptimizerStudy/soape_300m_2 • 0.5B • Updated Oct 24 • 6
OptimizerStudy/soape_300m_4 • 0.5B • Updated Oct 24 • 5
OptimizerStudy/soape_300m_8 • 0.5B • Updated Oct 24 • 5
OptimizerStudy/soape_520m_1 • 0.8B • Updated Oct 24 • 5
OptimizerStudy/soape_520m_2 • 0.8B • Updated Oct 24 • 4
OptimizerStudy/soape_520m_4 • 0.8B • Updated Oct 24 • 4
OptimizerStudy/soape_520m_8 • 0.8B • Updated Oct 24 • 68
OptimizerStudy/sophia_130m_1 • 0.3B • Updated Oct 24 • 6
OptimizerStudy/sophia_130m_2 • 0.3B • Updated Oct 24 • 5
OptimizerStudy/sophia_130m_4 • 0.3B • Updated Oct 24 • 6
OptimizerStudy/sophia_130m_8 • 0.3B • Updated Oct 24 • 6
OptimizerStudy/sophia_300m_1 • 0.5B • Updated Oct 24 • 7
OptimizerStudy/sophia_520m_1 • 0.8B • Updated Oct 24 • 6
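
The repository names listed above appear to follow an `<org>/<optimizer>_<size>_<n>` pattern. The sketch below splits a name into those parts so the list can be filtered or grouped. It is an illustration only: `parse_repo_id` and `RepoName` are hypothetical helpers, not part of any Hugging Face library, and this page does not state what the trailing integer means, so the code just returns it as a number.

```python
import re
from typing import NamedTuple


class RepoName(NamedTuple):
    """Parts of a repository id as they appear in the listing (hypothetical helper)."""
    org: str
    optimizer: str
    size: str      # size tag taken from the name, e.g. "130m" or "1.2b"
    suffix: int    # trailing integer; its meaning is not stated on this page


# Assumed naming pattern, inferred only from the names in the listing.
_PATTERN = re.compile(
    r"^(?P<org>[^/]+)/(?P<optimizer>[a-z]+)_(?P<size>[0-9.]+[mb])_(?P<suffix>\d+)$"
)


def parse_repo_id(repo_id: str) -> RepoName:
    """Split a repository id into its parts, or raise ValueError if it does not match."""
    match = _PATTERN.match(repo_id)
    if match is None:
        raise ValueError(f"unrecognized repository id: {repo_id!r}")
    return RepoName(
        match["org"], match["optimizer"], match["size"], int(match["suffix"])
    )


if __name__ == "__main__":
    for name in ("OptimizerStudy/adamw_1.2b_1", "OptimizerStudy/muon_130m_16"):
        print(parse_repo_id(name))
```

Note that the size tag in the name (for example `130m`) differs from the parameter count shown next to each entry (for example 0.3B), so the two should not be treated as interchangeable.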