# Model Card
- Source: [https://arxiv.org/abs/2509.02046](https://arxiv.org/abs/2509.02046)
- Optimizer: `scion`
- Model size: `520m`
- Data size: `21B`
## Best configuration
| Hyperparameter | Value |
|---|---|
| beta1 | `0.98` |
| decay | `1` |
| learning_rate | `0.004` |
| lr_schedule | `linear` |
| max_grad_norm | `2` |
| min_lr_ratio | `0` |
| momentum | `0.95` |
| scion_epsilon | `1e-05` |
| scion_to_signum_lr | `0.1` |
| train_batch_size | `128` |
| warmup | `0` |
| weight_decay | `0.1` |
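
The table above can be carried into code as a plain dictionary. The sketch below is illustrative only: the helper names `lr_at` and `signum_lr` are hypothetical, and it assumes that `warmup` is a step count, that `decay` is the fraction of training spent decaying, that `min_lr_ratio` scales the peak learning rate to get the final one, and that `scion_to_signum_lr` multiplies the base learning rate for parameters updated with the Signum-style rule. None of these readings is confirmed by this card; check the training code before relying on them.

```python
# Values copied from the "Best configuration" table.
BEST_CONFIG = {
    "beta1": 0.98,
    "decay": 1,
    "learning_rate": 0.004,
    "lr_schedule": "linear",
    "max_grad_norm": 2,
    "min_lr_ratio": 0,
    "momentum": 0.95,
    "scion_epsilon": 1e-05,
    "scion_to_signum_lr": 0.1,
    "train_batch_size": 128,
    "warmup": 0,
    "weight_decay": 0.1,
}


def lr_at(step, total_steps, cfg=BEST_CONFIG):
    """Learning rate at `step` under an ASSUMED linear schedule.

    Assumptions (not stated in the card): `warmup` is a number of steps,
    `decay` is the fraction of `total_steps` spent decaying, and the
    final rate is `min_lr_ratio * learning_rate`.
    """
    peak = cfg["learning_rate"]
    floor = peak * cfg["min_lr_ratio"]
    warmup_steps = int(cfg["warmup"])
    decay_steps = int(cfg["decay"] * total_steps)
    decay_start = total_steps - decay_steps

    if step < warmup_steps:
        # Linear ramp from near zero up to the peak rate.
        return peak * (step + 1) / warmup_steps
    if step < decay_start:
        # Constant phase between warmup and decay (empty when decay == 1).
        return peak
    # Linear interpolation from the peak rate down to the floor.
    progress = (step - decay_start) / max(1, decay_steps)
    return peak + (floor - peak) * min(1.0, progress)


def signum_lr(cfg=BEST_CONFIG):
    """ASSUMED meaning: base learning rate scaled for Signum-updated parameters."""
    return cfg["learning_rate"] * cfg["scion_to_signum_lr"]
```

Under these assumptions, the schedule starts at the peak rate of `0.004` with no warmup and decays linearly to zero over the whole run, because `decay` is `1` and `min_lr_ratio` is `0`.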