Add link to Neuron-optimized version #14
by badaoui (HF Staff)
README.md CHANGED
@@ -107,4 +107,16 @@ If you find our work helpful, feel free to give us a cite.
 journal={arXiv preprint arXiv:2407.10671},
 year={2024}
 }
-```
+```
+
+---
+## 🚀 AWS Neuron Optimized Version Available
+
+A Neuron-optimized version of this model is available for improved performance on AWS Inferentia/Trainium instances:
+
+**[badaoui/Qwen-Qwen2.5-1.5B-Instruct-neuron](https://huggingface.co/badaoui/Qwen-Qwen2.5-1.5B-Instruct-neuron)**
+
+The Neuron-optimized version provides:
+- Pre-compiled artifacts for faster loading
+- Optimized performance on AWS Neuron devices
+- Same model capabilities with improved inference speed
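
The README text added here does not show how to load the linked repository. The following is a minimal sketch of one way it might be done, not the PR author's method. It assumes the `optimum-neuron` and `transformers` packages are installed on an Inferentia/Trainium instance, that the base model ID is `Qwen/Qwen2.5-1.5B-Instruct` (inferred from the linked repository name), and that the linked repository also ships tokenizer files. The helper names `neuron_repo_id`, `load_neuron_model`, and `generate_reply` are hypothetical.

```python
def neuron_repo_id(base_model_id: str, owner: str = "badaoui") -> str:
    """Build '<owner>/<org>-<name>-neuron' from a base model ID 'org/name'.

    The naming pattern is an assumption inferred from the single link in
    this PR; it is not a documented convention.
    """
    org, sep, name = base_model_id.partition("/")
    if not sep or not org or not name:
        raise ValueError(f"expected 'org/name', got {base_model_id!r}")
    return f"{owner}/{org}-{name}-neuron"


def load_neuron_model(repo_id: str):
    """Load the tokenizer and the pre-compiled Neuron model from the Hub."""
    # Imported lazily so the helper above stays usable on machines
    # without Neuron hardware or optimum-neuron installed.
    from optimum.neuron import NeuronModelForCausalLM
    from transformers import AutoTokenizer

    # Assumption: the Neuron repository includes tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = NeuronModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model


def generate_reply(prompt: str, repo_id: str, max_new_tokens: int = 64) -> str:
    """Run one generation call against the Neuron model (needs Neuron devices)."""
    tokenizer, model = load_neuron_model(repo_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

On a Neuron instance, `generate_reply("Hello", neuron_repo_id("Qwen/Qwen2.5-1.5B-Instruct"))` would exercise the full path. Only `neuron_repo_id` can be run without the hardware.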