mikasenghaas committed
Commit 0b5cbb2 · verified · 1 Parent(s): 03b3349

Update README.md

  1. README.md +13 -4
README.md CHANGED
@@ -11,10 +11,19 @@ base_model:
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-A debug model fine-tuned on 128 token context on `PrimeIntellect/Reverse-Text-SFT`. To be used as warmed up model to RL in `vf-reverse-text`.
+A debug model fine-tuned on `willcb/R1-reverse-wikipedia-paragraphs-v1-1000`. To be used as warmed up model to RL in `vf-reverse-text`.
 
-Created with this training command from [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) (commit hash: `ed25704`)
+Created with this training command from [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) (commit hash: `8262560`)
 
 ```bash
-uv run sft @ configs/reverse_text/sft.toml --ckpt
-```
+uv run torchrun --nproc-per-node 8 src/prime_rl/trainer/sft/train.py \
+--model.name PrimeIntellect/Qwen3-0.6B \
+--data.name willcb/R1-reverse-wikipedia-paragraphs-v1-1000 \
+--max-steps 100 \
+--data.batch-size 16 \
+--data.micro-batch-size 1 \
+--data.seq-len 4096 \
+--optim.lr 2e-5
+```
+
+Check the run out on [W&B](https://wandb.ai/primeintellect/mika/runs/odsfiekx?nw=nwusermikasenghaas_).
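
As a rough sanity check on the new training command, the sketch below works out what its flag values imply. It is not taken from prime-rl: the `SFTRun` class and its methods are hypothetical names introduced here, and the arithmetic rests on two assumptions that the README does not confirm, namely that `--data.batch-size` counts sequences per optimizer step across all GPUs and that `--data.micro-batch-size` counts sequences per GPU per forward/backward pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTRun:
    """Flag values copied from the training command (hypothetical container)."""

    nproc_per_node: int = 8      # --nproc-per-node
    batch_size: int = 16         # --data.batch-size (assumed global, in sequences)
    micro_batch_size: int = 1    # --data.micro-batch-size (assumed per GPU)
    seq_len: int = 4096          # --data.seq-len
    max_steps: int = 100         # --max-steps

    def grad_accum_steps(self) -> int:
        """Micro-steps each GPU runs per optimizer step, under the assumptions above."""
        per_pass = self.micro_batch_size * self.nproc_per_node
        if self.batch_size % per_pass != 0:
            raise ValueError("batch size must be divisible by micro batch size * GPUs")
        return self.batch_size // per_pass

    def max_tokens(self) -> int:
        """Upper bound on tokens seen, assuming every sequence fills seq_len."""
        return self.max_steps * self.batch_size * self.seq_len


run = SFTRun()
print(f"gradient accumulation steps: {run.grad_accum_steps()}")
print(f"token budget (upper bound):  {run.max_tokens():,}")
```

Under these assumptions the run accumulates gradients over 2 micro-steps per GPU and sees at most 6,553,600 tokens, which is consistent with the README describing it as a small debug model.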