Update README.md

README.md (CHANGED)

@@ -56,7 +56,7 @@ pinned: false
<span class="link-text">Online Demo</span>
</a> |
<a href="https://github.com/imoneoi/openchat">
- <img src="https://camo.githubusercontent.com/
+ <img src="https://camo.githubusercontent.com/582429992c94328783a1509030dfd344c5845fb94be4a7b85fcf8e70b686e1b1/68747470733a2f2f6564656e742e6769746875622e696f2f537570657254696e7949636f6e732f696d616765732f706e672f6769746875622e706e67" alt="GitHub Logo" style="width:20px; vertical-align: middle; display: inline-block; margin-right: 5px; margin-left: 10px; margin-top: 0px; margin-bottom: 0px;"/>
<span class="link-text">GitHub</span>
</a> |
<a href="https://arxiv.org/pdf/2309.11235.pdf">
@@ -69,51 +69,32 @@ pinned: false
</a>
</p>

-
- <div style="background-color: white; padding: 0.7em; border-radius: 0.5em; color: black; display: flex; flex-direction: column; justify-content: center; text-align: center; font-size: 0.5em;">
- <a href="https://huggingface.co/openchat/openchat-3.5-1210" style="text-decoration: none; color: black;">
- <span style="font-size: 1.7em; font-family: 'Helvetica'; letter-spacing: 0.1em; font-weight: bold; color: black;">OPENCHAT</span><span style="font-size: 1.8em; font-family: 'Helvetica'; color: #3c72db; ">3.5</span>
- <span style="font-size: 0.7em; font-family: 'Helvetica'; color: white; vertical-align: top; background-color:red; border-radius: 6em; padding: 0.066em 0.4em; letter-spacing: 0.1em; font-weight: bold;">1210</span>
- <span style="font-size: 0.85em; font-family: 'Helvetica'; color: black;">
- <br> 🏆 The Overall Best Performing Open Source 7B Model 🏆
- <br> 🤖 Outperforms <span style="font-weight: bold;">ChatGPT</span> (March) and <span style="font-weight: bold;">Grok-1</span> 🤖
- <br> 🚀<span style="font-size: 1em; font-family: 'Helvetica'; color: black; font-weight: bold;">15</span>-point improvement in Coding over <span style="font-size: 0.9em;
- font-family: 'Helvetica'; color: black; font-weight: bold;">OpenChat-3.5🚀</span>
- <br><br><span style="font-size: 1em; font-family: 'Helvetica'; color: #3c72db; font-weight: bold;">New Features</span>
- <br> 💡 2 Modes: Coding + Generalist, Mathematical Reasoning 💡
- <br> 🧑‍⚖️ Experimental support for Evaluator and Feedback capabilities 🧑‍⚖️
- </span>
- </a>
- </div>
-
- <div style="display: flex; justify-content: center; align-items: center">
- <img src="https://github.com/alpayariyak/openchat/blob/master/assets/1210bench.png?raw=true" style="width: 100%; border-radius: 1em">
- </div>
+ OpenChat is dedicated to advancing and releasing **open-source language models**, fine-tuned with our [**C-RLFT**](https://arxiv.org/pdf/2309.11235.pdf) technique, which is inspired by offline reinforcement learning. Our models learn from mixed-quality data without preference labels, delivering exceptional performance on par with `ChatGPT`, even with a `7B` model which can be run on a **consumer GPU (e.g. RTX 3090)**.

-
- <img src="https://github.com/alpayariyak/openchat/blob/master/assets/logo_nobg.png?raw=true" alt="OpenChat Logo" style="width:20px; vertical-align: middle; display: inline-block; margin-right: 5px; margin-left: 0px; margin-top: 0px; margin-bottom: 0px;"/>About OpenChat
- </h1>

-
- Our models learn from mixed-quality data without preference labels, delivering exceptional performance on par with `ChatGPT`, even with a `7B` model which can be run on a **consumer GPU (e.g. RTX 3090)**.
- Despite our simple approach, we are committed to developing a high-performance, commercially viable, open-source large language model, and we continue to make significant strides toward this vision.
+ # 📰 News

-
+ - [2024/03/15] Nexusflow releases [Starling-Beta](https://huggingface.co/Nexusflow/Starling-LM-7B-beta), an RLHF tune of openchat-3.5-1106 and currently the highest-ranking open-source LLM on LMSys Arena not originating from a company, **beating all others at only 7B**.
+ - [2024/03/08] Released [OpenChat-3.5-0106-Gemma](https://huggingface.co/openchat/openchat-3.5-0106-gemma), the highest-performing Gemma fine-tune.
+ - [2024/01/07] Released [OpenChat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106), trained with a new data pipeline - **the strongest 7B LLM in the world**.
+   - Ranked as the top 7B LLM on LMSys Arena.
+   - Ranked on LMSys Arena as the top open-source LLM not originating from a company.

- [2023/12/10]
+ - [2023/12/10] Released [OpenChat-3.5-1210](https://huggingface.co/openchat/openchat-3.5-1210), with a 15-point improvement in coding.

- [2023/11/01]
+ - [2023/11/01] Released [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5), surpassing ChatGPT on various benchmarks 🔥.

- [2023/09/21]
+ - [2023/09/21] Released our paper [OpenChat: Advancing Open-source Language Models with Mixed-Quality Data](https://arxiv.org/pdf/2309.11235.pdf).

# Benchmarks
-
| Model | # Params | Average | MT-Bench | HumanEval | BBH MC | AGIEval | TruthfulQA | MMLU | GSM8K | BBH CoT |
|--------------------|----------|----------|--------------|-----------------|----------|----------|---------------|--------------|--------------|-------------|
+ | OpenChat-3.5-0106 | **7B** | **64.5** | 7.8 | **71.3** | 51.5 | 49.1 | 61.0 | **65.8** | 77.4 | 62.2 |
+ | OpenChat-3.5-0106-Gemma | **7B** | 64.4 | 7.83 | 67.7 | **52.7** | **50.2** | 55.4 | 65.7 | **81.5** | 63.7 |
+ | OpenChat-3.5-1210 | **7B** | 63.8 | 7.76 | 68.9 | 49.5 | 48.0 | **61.8** | 65.3 | 77.3 | 61.8 |
- | OpenChat-3.5-
| OpenChat-3.5 | **7B** | 61.6 | 7.81 | 55.5 | 47.6 | 47.4 | 59.1 | 64.3 | **77.3** | 63.5 |
- | ChatGPT (March)* | ? | 61.5 | **7.94** | 48.1 | 47.6 | 47.1 | 57.7 |
+ | ChatGPT (March)* | ? | 61.5 | **7.94** | 48.1 | 47.6 | 47.1 | 57.7 | 67.3 | 74.9 | **70.1** |
| | | | | | | | | | | |
| OpenHermes 2.5 | 7B | 59.3 | 7.54 | 48.2 | 49.4 | 46.5 | 57.5 | 63.8 | 73.5 | 59.9 |
| OpenOrca Mistral | 7B | 52.7 | 6.86 | 38.4 | 49.4 | 42.9 | 45.9 | 59.3 | 59.1 | 58.1 |
@@ -123,10 +104,11 @@ pinned: false

| | License | # Param | Average | MMLU | HumanEval | MATH | GSM8k |
|-------------------|-------------|---------|----------|------|-----------|----------|----------|
- | OpenChat
- | OpenChat 3.5
+ | **OpenChat-3.5-0106** | Apache-2.0 | **7B** | **61.0** | 65.8 | **71.3** | **29.3** | **77.4** |
+ | OpenChat 3.5 1210 | Apache-2.0 | **7B** | 60.1 | 65.3 | 68.9 | 28.9 | 77.3 |
+ | OpenChat 3.5 | Apache-2.0 | **7B** | 56.4 | 64.3 | 55.5 | 28.6 | 77.3 |
| Grok-0 | Proprietary | 33B | 44.5 | 65.7 | 39.7 | 15.7 | 56.8 |
- | Grok-1 | Proprietary | ???B | 55.8 | 73 | 63.2 | 23.9 | 62.9 |
+ | Grok-1 | Proprietary | ???B | 55.8 | **73** | 63.2 | 23.9 | 62.9 |

# Contact

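The updated intro line claims the 7B model can be run on a single consumer GPU (e.g. RTX 3090). As a rough illustration of what that looks like in practice (not part of the diff), here is a minimal sketch using the Hugging Face `transformers` API. It assumes `transformers` and `torch` are installed, that the `openchat/openchat-3.5-0106` tokenizer ships the OpenChat chat template, and that roughly 15 GB of GPU memory is free for bf16 weights; the prompt string and generation settings are illustrative, not prescribed by the card.

```python
# Minimal sketch: run openchat/openchat-3.5-0106 on one consumer GPU.
# Assumptions: transformers + torch installed, tokenizer provides the OpenChat
# chat template, and ~15 GB of GPU memory is available for bf16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openchat/openchat-3.5-0106"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # roughly half the fp32 footprint; fits a 24 GB card
    device_map="auto",           # place the weights on the available GPU(s)
)

messages = [{"role": "user", "content": "Implement binary search in Python."}]
# apply_chat_template renders the prompt in the format the model was trained on.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

A 7B model in bf16 occupies on the order of 14 GB, which is what makes the RTX 3090 (24 GB) claim plausible without any quantization.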