Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ tags:
 
 # Updates
 
-- [July 24, 2024] We now introduce [shenzhi-wang/Llama3.1-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat)!
+- [July 24, 2024] We now introduce [shenzhi-wang/Llama3.1-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat)! The training dataset contains >100K preference pairs, and it exhibits significant enhancements, especially in **roleplay**, **function calling**, and **math** capabilities!
 - 🔥 We provide the official **q4_k_m, q8_0, and f16 GGUF** versions of Llama3.1-8B-Chinese-Chat-**v2.1** at https://huggingface.co/shenzhi-wang/Llama3.1-8B-Chinese-Chat/tree/main/gguf!
 
 
@@ -39,8 +39,6 @@ Developers: [Shenzhi Wang](https://shenzhi-wang.netlify.app)\*, [Yaowei Zheng](h
 
 This is the first model specifically fine-tuned for Chinese & English users based on the [Meta-Llama-3.1-8B-Instruct model](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct). The fine-tuning algorithm used is ORPO [1].
 
-**Compared to the original [Meta-Llama-3.1-8B-Instruct model](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), our llama3.1-8B-Chinese-Chat model significantly reduces the issues of "Chinese questions with English answers" and the mixing of Chinese and English in responses.**
-
 
 [1] Hong, Jiwoo, Noah Lee, and James Thorne. "Reference-free Monolithic Preference Optimization with Odds Ratio." arXiv preprint arXiv:2403.07691 (2024).
 