Memory Requirements to run `Qwen/Qwen3.5-397B-A17B`

#20

by alvarobartt - opened 4 days ago

Hey all,

Below is the hf-mem output showing the estimated memory required to load Qwen/Qwen3.5-397B-A17B and run inference, including the KV cache estimate.

uvx hf-mem --model-id Qwen/Qwen3.5-397B-A17B --experimental --kv-cache-dtype fp8
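
For a rough sense of where numbers like these come from, here is a minimal back-of-the-envelope sketch in Python. It is not how hf-mem itself computes anything. The parameter count is simply read off the model name (397B), bf16 weights are an assumption, and the attention shape (layers, KV heads, head dim, context length) is hypothetical rather than taken from the model's config, so treat the output as illustrative only. The KV cache formula is the usual one for standard attention layers and may not match this model's actual architecture.

```python
GIB = 1024 ** 3


def weights_bytes(num_params: int, bytes_per_param: float) -> int:
    """Memory needed just to hold the weights."""
    return int(num_params * bytes_per_param)


def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int,
    bytes_per_elem: float,
) -> int:
    """KV cache size for standard attention layers.

    The leading 2 accounts for storing both keys and values.
    """
    return int(
        2 * num_layers * num_kv_heads * head_dim
        * seq_len * batch_size * bytes_per_elem
    )


# Assumption: ~397B total parameters (from the model name), bf16 = 2 bytes each.
w = weights_bytes(397_000_000_000, 2)

# Hypothetical attention shape, NOT read from the model's config.
# bytes_per_elem=1 corresponds to an fp8 KV cache.
kv = kv_cache_bytes(
    num_layers=60,
    num_kv_heads=2,
    head_dim=256,
    seq_len=32768,
    batch_size=1,
    bytes_per_elem=1,
)

print(f"weights:  {w / GIB:.1f} GiB")
print(f"kv cache: {kv / GIB:.2f} GiB")
print(f"total:    {(w + kv) / GIB:.1f} GiB")
```

Switching `bytes_per_elem` from 1 (fp8) to 2 (bf16/fp16) doubles the KV cache term, which is why the `--kv-cache-dtype` flag matters at long context lengths.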

Let me know if that's useful! 🤗

saireddy

1 day ago

@alvarobartt Could you please run the same for the FP8 variant?

alvarobartt

1 day ago

Hey @saireddy, I just did! Note that https://github.com/alvarobartt/hf-mem is open source, so you can run these yourself, e.g. uvx hf-mem --model-id Qwen/Qwen3.5-397B-A17B-FP8 --experimental --kv-cache-dtype fp8. Let me know if you run into any issues 🤗

https://huggingface.co/Qwen/Qwen3.5-397B-A17B-FP8/discussions/6
