---
license: mit
language:
- en
base_model:
- Qwen/Qwen2.5-Math-1.5B
tags:
- math
- code
- gpqa
pipeline_tag: text-generation
library_name: transformers
---
# VibeThinker-1.5B
<blockquote style="border-left: 4px solid #ff6b6b; background-color: #fff5f5; padding: 10px 15px; margin: 10px 0; color: #cc3333;">
<span style="font-weight: bold;">🚨 </span> We recommend using this model for competitive-style math and algorithmic coding problems (such as LeetCode, Codeforces, etc.). It works best when questions are asked in English. We do not advise using it for other tasks, as this is an experimental release aimed at exploring the reasoning capabilities of small models.
</blockquote>
<p align="center">📁 <a href="https://github.com/WeiboAI/VibeThinker">Github</a>&nbsp;&nbsp; | &nbsp;&nbsp;🤖 <a href="https://modelscope.cn/models/WeiboAI/VibeThinker-1.5B">Model Scope</a> |&nbsp;&nbsp; 📄 <a href="https://huggingface.co/papers/2511.06221">Technical Report</a></p>
## Introduction
VibeThinker-1.5B is a 1.5-billion parameter dense language model. With a total training cost of only $7,800 USD, it achieves reasoning performance comparable to larger models like GPT OSS-20B Medium.

## Key Performance Data
💡 Mathematical Reasoning: On the three major math benchmarks AIME24, AIME25, and HMMT25, its scores (80.3, 74.4, and 50.4, respectively) all surpass those of the initial DeepSeek R1 model, which has over 400 times the parameters (scores of 79.8, 70.0, and 41.7, respectively).

🌱 Code Generation: It achieved scores of 55.9 on LiveCodeBench v5 and 51.1 on v6. Its v6 score slightly leads Magistral Medium (50.3), underscoring its strong reasoning performance.

🔁 On the AIME 25 benchmark, VibeThinker-1.5B significantly extends the Pareto frontier of reasoning accuracy versus model scale, demonstrating that exceptional performance can be achieved with extreme parameter efficiency.

## Training Pipeline

VibeThinker-1.5B's core innovation lies in the "Spectrum-to-Signal Principle" (SSP) training framework: it first explores solution diversity during the Supervised Fine-Tuning (SFT) stage, and then optimizes its policy to reinforce correct signals in the Reinforcement Learning (RL) stage. By systematically integrating these two phases, our approach establishes diversity as the central technical design principle, enabling VibeThinker-1.5B to achieve robust performance that surpasses conventional training paradigms.
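As a rough illustration of the principle only (this is not the authors' training code), the toy sketch below separates the two roles: a "spectrum" stage that keeps a wide, diverse set of candidate solutions per problem, and a "signal" stage that retains only the candidates a verifier marks as correct for reinforcement. The `verify` helper, the data format, and the sampling scheme are all hypothetical simplifications.

```python
# Toy illustration of the Spectrum-to-Signal idea (not the actual SSP pipeline).
import random


def verify(problem, answer):
    # Hypothetical verifier: exact match against a known ground-truth answer.
    return answer == problem["gold"]


def spectrum_then_signal(problem, candidates, n_samples=4):
    # "Spectrum" analogue: sample a diverse set of candidate solutions.
    spectrum = random.sample(candidates, k=min(n_samples, len(candidates)))
    # "Signal" analogue: only verified-correct candidates provide the reward signal.
    signal = [answer for answer in spectrum if verify(problem, answer)]
    return spectrum, signal


if __name__ == "__main__":
    problem = {"question": "2 + 2 = ?", "gold": "4"}
    spectrum, signal = spectrum_then_signal(problem, ["3", "4", "5", "4.0", "four"])
    print("spectrum sampled:", spectrum)
    print("candidates to reinforce:", signal)
```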
## Usage Guidelines
**We recommend using this model for competitive-style math and coding problems.**
To facilitate quick verification by the community, we recommend the following parameter settings: **temperature: 0.6 or 1.0, max token length: 40960, top_p: 0.95, top_k: -1.**
A more detailed evaluation setup is available on [GitHub](https://github.com/WeiboAI/VibeThinker/tree/main/eval).
## Quick Start
Required: **transformers>=4.54.0**
Recommended for better inference performance: **vLLM==0.10.1 or SGLang>=0.4.9.post6**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig


class VibeThinker:
    def __init__(self, model_path):
        self.model_path = model_path
        self.model = AutoModelForCausalLM.from_pretrained(
            self.model_path,
            low_cpu_mem_usage=True,
            torch_dtype="bfloat16",
            device_map="auto"
        )
        self.tokenizer = AutoTokenizer.from_pretrained(self.model_path, trust_remote_code=True)

    def infer_text(self, prompt):
        messages = [
            {"role": "user", "content": prompt}
        ]
        text = self.tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True
        )
        model_inputs = self.tokenizer([text], return_tensors="pt").to(self.model.device)

        generation_config = dict(
            max_new_tokens=40960,
            do_sample=True,
            temperature=0.6,  # 0.6 or 1.0, set according to your needs
            top_p=0.95,
            top_k=None  # in vLLM or SGLang, set top_k to -1 to skip top_k sampling
        )
        generated_ids = self.model.generate(
            **model_inputs,
            generation_config=GenerationConfig(**generation_config)
        )
        # Strip the prompt tokens so only the newly generated text is decoded.
        generated_ids = [
            output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
        ]
        response = self.tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
        return response


if __name__ == '__main__':
    model = VibeThinker('Your model path')
    prompt = 'Your Prompt'
    print(model.infer_text(prompt))
```
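If you use vLLM (recommended above for better inference performance), the same sampling settings carry over directly. Below is a minimal offline-inference sketch, assuming vLLM is installed; the model identifier `WeiboAI/VibeThinker-1.5B` is an example and can be replaced with a local path.

```python
# Minimal vLLM sketch (model id/path is an example; adjust to your setup).
from vllm import LLM, SamplingParams

llm = LLM(model="WeiboAI/VibeThinker-1.5B")
sampling_params = SamplingParams(
    temperature=0.6,  # 0.6 or 1.0, per the recommended settings
    top_p=0.95,
    top_k=-1,         # -1 skips top_k sampling in vLLM
    max_tokens=40960,
)

messages = [{"role": "user", "content": "Your Prompt"}]
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```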
## License
The model repository is licensed under the MIT License.
## Citations & References
If you use VibeThinker in your research or product, please cite:
```
@misc{xu2025tinymodelbiglogic,
title={Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B},
author={Sen Xu and Yi Zhou and Wei Wang and Jixin Min and Zhibin Yin and Yingwei Dai and Shixi Liu and Lianyu Pang and Yirong Chen and Junlin Zhang},
year={2025},
eprint={2511.06221},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2511.06221},
}
```