The model was trained using [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl).

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a32edf17b9f57eaec2ea65/0NFEBL9eAObkU4IQ_hAo0.png)

## Model Information

- Training Dataset (verifiable math & coding tasks): [PrimeIntellect/Intellect-2-RL-Dataset](https://huggingface.co/PrimeIntellect/Intellect-2-RL-Dataset)
- Base Model: [QwQ-32B](https://huggingface.co/Qwen/QwQ-32B)
- Training Code: [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl)

## Usage

INTELLECT-2 is based on the `qwen2` architecture, making it compatible with popular libraries and inference engines such as [vllm](https://github.com/vllm-project/vllm) or [sglang](https://github.com/sgl-project/sglang).
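
A minimal offline-inference sketch with vllm is shown below. The Hugging Face model id `PrimeIntellect/INTELLECT-2`, the tensor-parallel degree, and the sampling settings are assumptions; adjust them to your checkpoint and hardware.

```python
# Minimal offline-inference sketch with vLLM. The model id, the
# tensor-parallel degree, and the sampling settings are assumptions;
# adjust them to your setup.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "PrimeIntellect/INTELLECT-2"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
llm = LLM(model=model_id, tensor_parallel_size=4)  # 32B weights need multiple GPUs

messages = [{"role": "user", "content": "What is 17 * 24?"}]
# Render the qwen2 chat template into a plain-text prompt for generation.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

outputs = llm.generate([prompt], SamplingParams(temperature=0.6, max_tokens=8192))
print(outputs[0].outputs[0].text)
```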

Given that INTELLECT-2 was trained with a length-control budget, you will achieve the best results by appending the prompt `"Think for 10000 tokens before giving a response."` to your instruction. As reported in our technical report, the model was not trained for long enough to fully learn the length-control objective, so results won't differ strongly if you specify lengths other than 10,000. If you wish to vary the budget, you can expect the best results with 2000, 4000, 6000, and 8000 tokens, as these were the other target lengths present during training.
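
To make the length-control suffix easy to reuse, a tiny helper like the (hypothetical) one below is enough; the budget values mirror the targets listed above.

```python
# Hypothetical helper (not part of any library): appends the length-control
# suffix that INTELLECT-2 was trained with. 10000 is the recommended budget;
# 2000, 4000, 6000 and 8000 were the other target lengths seen in training.
def with_length_budget(instruction: str, budget: int = 10000) -> str:
    return f"{instruction} Think for {budget} tokens before giving a response."

print(with_length_budget("Solve x^2 - 5x + 6 = 0."))
# Solve x^2 - 5x + 6 = 0. Think for 10000 tokens before giving a response.
```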

## Performance

During training, INTELLECT-2 improved upon QwQ-32B in its mathematical and coding abilities.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a32edf17b9f57eaec2ea65/4k_Nmj2g8MqC7I6ORIkMH.png)

| **Model**           | **AIME24** | **AIME25** | **LiveCodeBench (v5)** | **GPQA-Diamond** | **IFEval** |
| ------------------- | ---------- | ---------- | ---------------------- | ---------------- | ---------- |
| *INTELLECT-2*       | **78.8**   | 64.9       | **67.8**               | 66.8             | 81.5       |
| QwQ-32B             | 76.6       | 64.8       | 66.1                   | 66.3             | 83.4       |
| Qwen-R1-Distill-32B | 69.9       | 58.4       | 55.1                   | 65.2             | 72.0       |
| DeepSeek-R1         | 78.6       | 65.1       | 64.1                   | 71.6             | 82.7       |