---
license: apache-2.0
metrics:
  - accuracy
base_model:
  - mistralai/Mixtral-8x7B-Instruct-v0.1
---

# Quark Team FP8 Mixtral-8x7B Model Overview

## Model Information for MLPerf

- Model Name: Mixtral-8x7B
- Version: MLPerf v5.1
- Commit: Closed Division Commit

## Calibration Dataset

The calibration set consists of 1024 samples drawn from the mixed dataset provided by MLPerf:

- 325 GSM8K samples
- 325 MBXP samples
- 374 OpenOrca samples
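
The mix above can be sketched as a simple sampling routine. This is an illustrative reconstruction, not the MLPerf tooling; the function name and the toy stand-in datasets are assumptions, and only the per-dataset counts come from the list above.

```python
import random

def build_calibration_set(gsm8k, mbxp, openorca, seed=0):
    """Draw the per-dataset counts from the list above and shuffle
    them into one 1024-sample calibration list (hypothetical sketch)."""
    rng = random.Random(seed)
    mix = (
        rng.sample(gsm8k, 325)
        + rng.sample(mbxp, 325)
        + rng.sample(openorca, 374)
    )
    rng.shuffle(mix)
    return mix

# Toy stand-ins for the real datasets, just to show the counts add up.
calib = build_calibration_set(
    [f"gsm8k-{i}" for i in range(400)],
    [f"mbxp-{i}" for i in range(400)],
    [f"orca-{i}" for i in range(400)],
)
print(len(calib))  # 1024
```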

## Quantized Tensors

The following tensors are quantized in each decoder layer:

- Expert MLP inputs and weights (excluding the router)
- Linear QKV inputs and weights
- KV cache entries
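
FP8 quantization of these tensors is commonly done with per-tensor scaling into the E4M3 range. Below is a minimal sketch under that assumption; the rounding is simplified (no subnormals, no exponent floor), and the function names are illustrative rather than Quark's API.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value representable in E4M3

def round_e4m3(x):
    """Round to roughly E4M3 precision: 3 explicit mantissa bits,
    clamped to the finite range. Simplified; ignores subnormals."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)       # x = m * 2**e with 0.5 <= |m| < 1
    m = round(m * 16) / 16     # keep 4 significant mantissa bits
    y = math.ldexp(m, e)
    return max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, y))

def fake_quantize(tensor):
    """Per-tensor quantize-dequantize round trip with an amax-based scale."""
    amax = max(abs(v) for v in tensor)
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    return [round_e4m3(v / scale) * scale for v in tensor]

weights = [0.1, -2.5, 7.0, -0.03]
print(fake_quantize(weights))
```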

## Ignored Layers

The following layers are ignored during quantization:

- `*.gate`
- `*.o_proj`
- `lm_head`
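
The exclude patterns above read as glob-style matches against module names. A sketch of how they could be applied (the pattern semantics and layer names are assumptions; the actual Quark config format may differ):

```python
from fnmatch import fnmatchcase

# Patterns copied from the ignored-layers list above.
IGNORED = ["*.gate", "*.o_proj", "lm_head"]

def is_ignored(layer_name):
    """Return True if the layer name matches any exclude pattern."""
    return any(fnmatchcase(layer_name, pat) for pat in IGNORED)

print(is_ignored("model.layers.0.block_sparse_moe.gate"))  # True
print(is_ignored("model.layers.0.self_attn.o_proj"))       # True
print(is_ignored("model.layers.0.self_attn.q_proj"))       # False
print(is_ignored("lm_head"))                               # True
```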

## Model Performance Comparison

| Metric | Baseline Accuracy Target (%) | FP8 Quant Accuracy (%) |
|---|---|---|
| GSM8K (Math) | 73.66 | 73.18 (99.34%) |
| OpenOrca (Chat) Rouge1 | 45.5989 | 45.4362 (99.64%) |
| OpenOrca (Chat) Rouge2 | 23.3526 | 23.168 (99.21%) |
| OpenOrca (Chat) RougeL | 30.4608 | 30.2922 (99.45%) |
| MBXP (Code) | 60.16 | 60.08 (99.87%) |
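
The percentages in parentheses are the quantized score relative to the baseline target. A quick sketch of the computation (last-digit rounding in the published table may differ slightly):

```python
def retention(baseline, quantized):
    """Quantized score as a percentage of the baseline target."""
    return quantized / baseline * 100

# Values taken from the table above.
print(f"GSM8K:  {retention(73.66, 73.18):.2f}%")
print(f"Rouge1: {retention(45.5989, 45.4362):.2f}%")
print(f"MBXP:   {retention(60.16, 60.08):.2f}%")
```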

## License

Modifications Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.