---
license: apache-2.0
---

# JT-Math-8B-Base



<p align="center">
    <a href="https://www.arxiv.org/abs/2507.19748" target="_blank">
        <img src="https://img.shields.io/badge/Paper-ArXiv-red">
    </a>
    <a href="https://huggingface.co/JT-LM/JT-Math-8B-Base" target="_blank">
        <img src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Models-blue">
    </a>
        <a href="https://www.modelscope.cn/models/JiuTian-AI/JT-Math-8B-Base" target="_blank">
        <img src="https://img.shields.io/badge/%F0%9F%A4%96%20ModelScope-Models-blue">
    </a>
</p>




We are excited to introduce JT-Math-8B-Base: an 8-billion-parameter foundation model engineered for mathematical reasoning and the cornerstone of the JT-Math family. JT-Math-8B-Base was continually pre-trained from JT-Coder-8B-Base on an additional 210 billion tokens of high-quality mathematical and general-domain data. With a native 32,768-token context window, it provides a robust, scalable, and reproducible foundation for downstream fine-tuning, enabling researchers and developers to advance the frontier of math-centric AI applications. Technical details, training recipes, and reproducibility notes are available in our technical report.
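
As a quick sanity check of the advertised context window, you can inspect the published configuration before downloading the full weights. This is a minimal sketch and assumes the exported config exposes the standard `max_position_embeddings` field; architectures loaded with `trust_remote_code` may name this field differently.

```python
from transformers import AutoConfig

# Sketch: read the model config and confirm the 32K context window.
# Assumes the standard `max_position_embeddings` field is present.
config = AutoConfig.from_pretrained("JT-LM/JT-Math-8B-Base", trust_remote_code=True)
print(getattr(config, "max_position_embeddings", "field not found"))  # expected: 32768
```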





## Model Downloads

We release the following models to support a wide range of applications.

| Model Name          | Context Length | Hugging Face Link                                          | ModelScope Link                                            | Notes                                                      |
| ------------------- | -------------- | ---------------------------------------------------------- | ---------------------------------------------------------- | ---------------------------------------------------------- |
| JT-Math-8B-Base     | 32K            |  [Link](https://huggingface.co/JT-LM/JT-Math-8B-Base)     |  [Link](https://www.modelscope.cn/models/JiuTian-AI/JT-Math-8B-Base) | The foundational base model. Ideal for custom fine-tuning. |
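
If you want the weights available locally ahead of time (for example for offline fine-tuning), the repository can be fetched once with `huggingface_hub`. This is only a sketch; the `local_dir` path below is an illustrative placeholder.

```python
from huggingface_hub import snapshot_download

# Sketch: download the full model repository for offline use.
# "./JT-Math-8B-Base" is an illustrative local path, not part of the release.
local_path = snapshot_download(
    repo_id="JT-LM/JT-Math-8B-Base",
    local_dir="./JT-Math-8B-Base",
)
print("Model files downloaded to:", local_path)
```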




## Evaluation Results

| Model                       | GSM8K | MATH | CMath (zh) | Average  |
| --------------------------- | ----- | ---- | ---------- | -------- |
| Qwen2.5-Base-32B            | 92.8  | 57.7 | 85.4       | 78.6     |
| Llama-3.1-Base-405B         | 89.0  | 53.8 | 77.4       | 73.4     |
| DeepSeek-Math-Base-7B       | 64.2  | 36.2 | 71.7       | 57.4     |
| DeepSeek-Coder-V2-Lite-Base | 68.3  | 38.1 | 77.8       | 61.4     |
| Qwen2.5-Math-7B             | 91.6  | 55.4 | 85.0       | 77.3     |
| **JT-Math-8B-Base**         | 87.5  | 60.1 | 90.2       | **79.2** |
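
The exact prompts, shot counts, and answer-extraction rules behind these numbers are documented in the technical report. Purely as an illustration of the overall workflow, the sketch below runs a zero-shot, greedy-decoding loop over a GSM8K-style test set; the dataset identifier, prompt template, and regex-based answer extraction are assumptions, not the official evaluation harness.

```python
import re

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch only: zero-shot greedy scoring on GSM8K, not the official harness.
model_name = "JT-LM/JT-Math-8B-Base"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

def last_number(text: str):
    # Treat the last integer or decimal in the text as the predicted answer.
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None

dataset = load_dataset("gsm8k", "main", split="test")
correct = 0
for example in dataset:
    prompt = f"Question:\n{example['question']}\nAnswer:\n"
    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    out = model.generate(**inputs, do_sample=False, max_new_tokens=1024)
    completion = tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    gold = example["answer"].split("####")[-1].strip()  # GSM8K stores the final answer after "####"
    correct += last_number(completion) == last_number(gold)

print("GSM8K accuracy:", correct / len(dataset))
```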







## How to Get Started

We provide a basic example of how to run inference with the `JT-Math-8B-Base` model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "JT-LM/JT-Math-8B-Base"

# Load the tokenizer and model; device_map="auto" places the weights on available devices.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

# JT-Math-8B-Base is a base (non-chat) model, so use a plain completion-style prompt.
prompt = "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?"
text = f"Question:\n{prompt}\nAnswer:\n"
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Greedy decoding; allow up to 8192 new tokens for long reasoning chains.
gen_kwargs = {
    "do_sample": False,
    "max_new_tokens": 8192,
}
generated_ids = model.generate(
    **model_inputs,
    **gen_kwargs
)

# Keep only the newly generated tokens (drop the prompt) before decoding.
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

response = tokenizer.decode(output_ids, skip_special_tokens=True)
print("response:", response)
```
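
For higher-throughput batch inference, the same completion-style prompt format can be used with vLLM. The sketch below assumes your installed vLLM build can load this architecture (passing `trust_remote_code=True`); it is not an officially supported serving recipe.

```python
from vllm import LLM, SamplingParams

# Sketch: offline batch generation with vLLM, assuming the architecture is supported.
llm = LLM(model="JT-LM/JT-Math-8B-Base", trust_remote_code=True, max_model_len=32768)
sampling_params = SamplingParams(temperature=0.0, max_tokens=4096)

prompts = ["Question:\nWhat is 12 * 17?\nAnswer:\n"]
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```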





## Citation



If you find our work useful, please consider citing our paper:

```bibtex
@article{jiutian-math2025,
  title={JiuTian Math: A Multi-Stage Framework for Advanced Mathematical Reasoning in Large Language Models},
  author={Yifan Hao and Fangning Chao and Yaqian Hao and Zhaojun Cui and Huan Bai and Haiyu Zhang and Yankai Liu and Chao Deng and Junlan Feng},
  journal={arXiv preprint arXiv:2507.19748},
  year={2025}
}
```