Update README.md
add note about weights compression parameters
README.md
CHANGED
|
@@ -10,6 +10,16 @@ language:
|
|
| 10 |
## Description
|
| 11 |
This is the [Mixtral-8x7b-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) model converted to the [OpenVINO](https://docs.openvino.ai/2024/home.html) Intermediate Representation (IR) format with INT4-compressed weights using [NNCF](https://github.com/openvinotoolkit/nncf).
|
| 12 |
| 13 |
## Compatibility
|
| 14 |
|
| 15 |
The provided IR is compatible with OpenVINO 2024.0.0 and later, and with optimum-intel 1.16.0.
|
|
|
|
| 10 |
## Description
|
| 11 |
This is the [Mixtral-8x7b-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) model converted to the [OpenVINO](https://docs.openvino.ai/2024/home.html) Intermediate Representation (IR) format with INT4-compressed weights using [NNCF](https://github.com/openvinotoolkit/nncf).
|
| 12 |
|
| 13 |
+
## Quantization Configuration
|
| 14 |
+
|
| 15 |
+
Model weights were compressed to INT4 precision using `nncf.compress_weights` with the following parameters:
|
| 16 |
+
|
| 17 |
+
* mode: **INT4_SYM**
|
| 18 |
+
* group_size: **128**
|
| 19 |
+
* ratio: **0.8**
|
| 20 |
+
|
| 21 |
+
More details about the optimization parameters can be found in the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).
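The parameters listed above map onto keyword arguments of `nncf.compress_weights`. The following is a minimal sketch of such a call, not the exact script used to produce this IR: the helper name `compress_to_int4` is hypothetical, and it assumes `model` is an already-loaded OpenVINO model object. In NNCF, `group_size` sets how many weights share one quantization scale, and `ratio` sets the share of layers compressed to INT4, with the remaining layers kept in an 8-bit backup precision.

```python
# Values copied from the parameter list in this README.
GROUP_SIZE = 128
RATIO = 0.8


def compress_to_int4(model):
    """Apply NNCF weight compression with the README's parameters.

    Hypothetical helper: `model` is assumed to be an OpenVINO model
    that has already been loaded or converted elsewhere.
    """
    # Imported lazily so the constants can be inspected without NNCF installed.
    import nncf

    return nncf.compress_weights(
        model,
        mode=nncf.CompressWeightsMode.INT4_SYM,  # symmetric 4-bit quantization
        group_size=GROUP_SIZE,
        ratio=RATIO,
    )
```

The function returns the compressed model, which would then need to be saved separately.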
|
| 22 |
+
|
| 23 |
## Compatibility
|
| 24 |
|
| 25 |
The provided IR is compatible with OpenVINO 2024.0.0 and later, and with optimum-intel 1.16.0.
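One way to check a local environment against these versions is sketched below. It is an illustration only: the helper names are hypothetical, and it assumes both version numbers are meant as minimums, which the note states explicitly only for OpenVINO.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: both versions are treated as lower bounds.
MINIMUM_VERSIONS = {"openvino": (2024, 0, 0), "optimum-intel": (1, 16, 0)}


def parse_release(text):
    """Return the first three numeric components of a version string."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at suffixes such as "dev0"
        parts.append(int(digits))
    parts = parts[:3]
    # Pad short versions such as "2024.0" so tuple comparison behaves.
    return tuple(parts + [0] * (3 - len(parts)))


def meets_minimum(installed, minimum):
    """Compare an installed version string against a minimum tuple."""
    return parse_release(installed) >= minimum


def check_environment():
    """Map each package to True, False, or None when it is not installed."""
    results = {}
    for name, minimum in MINIMUM_VERSIONS.items():
        try:
            results[name] = meets_minimum(version(name), minimum)
        except PackageNotFoundError:
            results[name] = None
    return results
```

Calling `check_environment()` returns a dictionary keyed by package name, so a missing package can be told apart from an outdated one.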
|