ZhaofengZhang-AMD committed
Commit 19c1d68 (verified)
1 Parent(s): aa4fdc5

Update README.md

  1. README.md +45 -3
README.md CHANGED
@@ -1,3 +1,45 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ metrics:
+ - accuracy
+ base_model:
+ - mistralai/Mixtral-8x7B-Instruct-v0.1
+ ---
+ # Quark Team FP8 Mixtral-8x7B Model Overview
+
+ ## Model Information for MLPerf
+ - **Model Name**: Mixtral-8x7B
+ - **Version**: MLPerf v5.0
+ - **Commit**: Closed Division Commit
+
+ ## Calibration Dataset
+ The calibration dataset consists of **1024 samples** from the mixed dataset provided by MLPerf:
+ - **325 GSM8K samples**
+ - **325 MBXP samples**
+ - **374 OpenOrca samples**
+
+ ## Quantized Tensors
+ The following tensors are quantized in each decoder layer:
+ - **Expert MLP Inputs and Weights** (excluding the router)
+ - **Linear QKV Inputs and Weights**
+ - **KV Cache Entries**
+
+ ## Ignored Layers
+ The following layers are ignored during quantization:
+ - `*.gate`
+ - `*.o_proj`
+ - `lm_head`
+
+ ## Model Performance Comparison
+
+ | Metric | Baseline Accuracy Target (%) | FP8 Quant Accuracy (%, relative to baseline in parentheses) |
+ |-----------------------|--------------------|-----------------------|
+ | **GSM8K (Math)** | 73.66 | 73.18 (99.34%) |
+ | **OpenOrca (Chat)** | | |
+ | - Rouge1 | 45.5989 | 45.4362 (99.64%) |
+ | - Rouge2 | 23.3526 | 23.168 (99.21%) |
+ | - RougeL | 30.4608 | 30.2922 (99.45%) |
+ | **MBXP (Code)** | 60.16 | 60.08 (99.87%) |
+
+
+
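
Reviewer note (not part of the diff): the figures in the added README can be sanity-checked with a few lines of Python. The sketch below assumes the parenthesized percentages in the table are the FP8 score divided by the baseline target; the README does not state this explicitly. The helper name `relative_accuracy` and the dictionary layout are hypothetical, chosen only for illustration.

```python
# Figures copied from the README diff above. All names here are hypothetical.
calibration_counts = {"GSM8K": 325, "MBXP": 325, "OpenOrca": 374}

# (baseline target, FP8 score) pairs from the performance table.
scores = {
    "GSM8K": (73.66, 73.18),
    "Rouge1": (45.5989, 45.4362),
    "Rouge2": (23.3526, 23.168),
    "RougeL": (30.4608, 30.2922),
    "MBXP": (60.16, 60.08),
}


def relative_accuracy(baseline: float, quantized: float) -> float:
    """Return the quantized score as a percentage of the baseline."""
    return 100.0 * quantized / baseline


# Total number of calibration samples across the three sources.
total_samples = sum(calibration_counts.values())

# Assumed interpretation: parenthesized value = FP8 score / baseline target.
ratios = {name: relative_accuracy(b, q) for name, (b, q) in scores.items()}

if __name__ == "__main__":
    print(f"calibration samples: {total_samples}")
    for name, pct in ratios.items():
        print(f"{name}: {pct:.2f}% of baseline")
```

Under that assumption, the per-source counts sum to the stated 1024 samples, and every recomputed ratio agrees with the table to within about 0.01 percentage points.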