soape_1.2b_1 / README.md

Model Card

Best configuration

| Hyperparameter | Value |
|---|---|
| beta1 | 0.95 |
| beta2 | 0.99 |
| block_size | 512 |
| epsilon | 1e-10 |
| learning_rate | 0.004 |
| max_grad_norm | 1 |
| min_lr_ratio | 0.0 |
| precondition_frequency | 10 |
| shampoo_beta | 0.9 |
| train_batch_size | 256 |
| warmup | 1000 |
| weight_decay | 0.1 |
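
For convenience, the values above can be carried around as a typed config object. The following is a minimal sketch, not the training code used for this model: the names `BestConfig` and `warmup_lr` are hypothetical, and the warmup is assumed to be linear and counted in optimizer steps, which this card does not state. The decay that follows warmup (governed by `min_lr_ratio`) is not modeled, since its shape is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BestConfig:
    """Hyperparameters copied from the table above (class name is hypothetical)."""
    beta1: float = 0.95
    beta2: float = 0.99
    block_size: int = 512
    epsilon: float = 1e-10
    learning_rate: float = 0.004
    max_grad_norm: float = 1.0
    min_lr_ratio: float = 0.0
    precondition_frequency: int = 10
    shampoo_beta: float = 0.9
    train_batch_size: int = 256
    warmup: int = 1000
    weight_decay: float = 0.1


def warmup_lr(step: int, cfg: BestConfig = BestConfig()) -> float:
    """Learning rate during warmup.

    Assumption: linear ramp from 0 to cfg.learning_rate over cfg.warmup steps.
    After warmup this simply returns the peak rate; the real schedule
    presumably decays, but that part is not described in the card.
    """
    if step >= cfg.warmup:
        return cfg.learning_rate
    return cfg.learning_rate * step / cfg.warmup


if __name__ == "__main__":
    for step in (0, 250, 500, 1000):
        print(step, warmup_lr(step))
```

Freezing the dataclass keeps the reported configuration from being mutated accidentally; a run that deviates from it can build a new instance with `dataclasses.replace`.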