YinZhiBin committed · Commit 620b435 · verified · 1 Parent(s): 863a719

Update README.md

Files changed (1)
  1. README.md +0 -1
README.md CHANGED
@@ -34,7 +34,6 @@ VibeThinker-1.5B is a 1.5-billion parameter dense language model. With a total t
 
 ## Training Pipeline
 
-
 ![image](https://cdn-uploads.huggingface.co/production/uploads/64d1faaa1ed6649d70d1fa2f/rPfb1GKFQUOICcFs95Aus.png)
 
 VibeThinker-1.5B's core innovation lies in the "Spectrum-to-Signal Principle" (SSP) training framework: it first explores solution diversity during the Supervised Fine-Tuning (SFT) stage, and then optimizes its policy to reinforce correct signals in the Reinforcement Learning (RL) stage. By systematically integrating these two phases, our approach establishes diversity as the central technical design principle, enabling VibeThinker-1.5B to achieve robust performance that surpasses conventional training paradigms.
 