---
license: apache-2.0
metrics:
- accuracy
base_model:
- mistralai/Mixtral-8x7B-Instruct-v0.1
---

# Quark Team FP8 Mixtral-8x7B Model Overview
## Model Information for MLPerf

- Model Name: Mixtral-8x7B
- Version: MLPerf v5.1
- Commit: Closed Division Commit
## Calibration Dataset

The calibration dataset consists of 1024 samples drawn from mixed datasets provided by MLPerf:
- 325 GSM8k samples
- 325 MBXP samples
- 374 OpenOrca samples
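The stated composition can be sanity-checked with a short snippet (dataset names and counts are taken from the list above; this is an illustrative check, not the official MLPerf preparation script):

```python
# Composition of the calibration set, per the list above.
calibration_counts = {
    "GSM8K": 325,
    "MBXP": 325,
    "OpenOrca": 374,
}

# The per-dataset counts should sum to the stated 1024 samples.
total = sum(calibration_counts.values())
print(total)
```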
## Quantized Tensors

The following tensors are quantized in each decoder layer:
- Expert MLP Inputs and Weights (excluding the router)
- Linear QKV Inputs and Weights
- KV Cache Entries
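As a rough illustration of FP8 (E4M3) quantization, the sketch below applies a symmetric per-tensor scale and clamps values to the E4M3 dynamic range. It is a simplified model: real FP8 kernels also round each element to the nearest representable E4M3 value, and the per-tensor scheme here is an assumption, not a detail stated in this card:

```python
# Simplified FP8 E4M3-style symmetric per-tensor quantization.
# The E4M3 format's maximum finite value is 448; element-wise rounding
# to representable FP8 values is deliberately omitted in this sketch.
E4M3_MAX = 448.0

def quantize_per_tensor(values):
    amax = max(abs(v) for v in values)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    # Scale into the FP8 range and clamp to the representable interval.
    q = [max(-E4M3_MAX, min(E4M3_MAX, v / scale)) for v in values]
    return q, scale

def dequantize(q, scale):
    # Recover approximate original values from the scaled representation.
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.03, 0.9]
q, scale = quantize_per_tensor(weights)
restored = dequantize(q, scale)
```

Because the sketch skips FP8 rounding, dequantization recovers the inputs up to floating-point error; in a real FP8 pipeline the rounding step introduces the quantization error measured in the table below.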
## Ignored Layers

The following layers are ignored during quantization:

- `*.gate`
- `*.o_proj`
- `lm_head`
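Glob patterns like these are commonly matched against module names with Python's `fnmatch`; the sketch below shows how the exclusion list would apply (the module names are hypothetical examples for illustration, not taken from this card):

```python
from fnmatch import fnmatch

# Exclusion patterns from the list above.
IGNORED = ["*.gate", "*.o_proj", "lm_head"]

def is_ignored(module_name: str) -> bool:
    """Return True if the module matches any exclusion pattern."""
    return any(fnmatch(module_name, pat) for pat in IGNORED)

# Hypothetical module names for illustration.
print(is_ignored("model.layers.0.block_sparse_moe.gate"))  # router gate
print(is_ignored("model.layers.0.self_attn.o_proj"))
print(is_ignored("lm_head"))
print(is_ignored("model.layers.0.self_attn.q_proj"))       # quantized
```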
## Model Performance Comparison

| Metric | Baseline Accuracy Target | FP8 Quant Accuracy (Recovery) |
|---|---|---|
| GSM8K (Math) | 73.66 | 73.18 (99.34%) |
| Open Orca (Chat) | | |
| - Rouge1 | 45.5989 | 45.4362 (99.64%) |
| - Rouge2 | 23.3526 | 23.1680 (99.21%) |
| - RougeL | 30.4608 | 30.2922 (99.45%) |
| MBXP (Code) | 60.16 | 60.08 (99.87%) |
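The recovery percentages in parentheses follow from dividing the FP8 score by the baseline target. A quick check, with values copied from the table above (the last printed digit may differ slightly from the table depending on rounding convention):

```python
# (baseline, fp8) score pairs copied from the table above.
scores = {
    "GSM8K":  (73.66, 73.18),
    "Rouge1": (45.5989, 45.4362),
    "Rouge2": (23.3526, 23.168),
    "RougeL": (30.4608, 30.2922),
    "MBXP":   (60.16, 60.08),
}

# Recovery = FP8 score as a percentage of the baseline target.
recovery = {name: fp8 / base * 100 for name, (base, fp8) in scores.items()}
for name, pct in recovery.items():
    print(f"{name}: {pct:.2f}%")
```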
## License

Modifications Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.