Memory Requirements to run `Qwen/Qwen3.5-397B-A17B`

#20
by alvarobartt - opened

Hey all,

Below is the visual output of hf-mem showing the estimated memory required to load Qwen/Qwen3.5-397B-A17B and run inference, including the KV cache estimate.

uvx hf-mem --model-id Qwen/Qwen3.5-397B-A17B --experimental --kv-cache-dtype fp8

[image: hf-mem output for Qwen/Qwen3.5-397B-A17B]
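For anyone who wants to sanity-check numbers like these by hand, here is a rough back-of-the-envelope sketch. It is not how hf-mem computes its estimate; it only applies the usual rules of thumb (weights = parameters x bytes per parameter, KV cache = 2 x layers x KV heads x head dim x sequence length x batch size x bytes per element). The parameter count is taken from the "397B" in the model name, bf16 weights are an assumption, and the layer, head and sequence values are hypothetical placeholders, not values read from the model's config.

```python
# Back-of-the-envelope memory estimate. NOT hf-mem's implementation.
BYTES_PER_DTYPE = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def weights_bytes(num_params: int, dtype: str) -> int:
    """Memory needed to hold the weights alone, in bytes."""
    return num_params * BYTES_PER_DTYPE[dtype]


def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int,
    dtype: str,
) -> int:
    """KV cache size in bytes for standard attention layers.

    The leading 2 accounts for storing both keys and values.
    """
    return (
        2 * num_layers * num_kv_heads * head_dim
        * seq_len * batch_size * BYTES_PER_DTYPE[dtype]
    )


def to_gib(num_bytes: int) -> float:
    return num_bytes / 1024**3


if __name__ == "__main__":
    # ~397B parameters, taken from the model name; bf16 weights are assumed.
    w = weights_bytes(397_000_000_000, "bf16")
    # Hypothetical config values for illustration only.
    kv = kv_cache_bytes(
        num_layers=60, num_kv_heads=2, head_dim=256,
        seq_len=32768, batch_size=1, dtype="fp8",
    )
    print(f"weights:  {to_gib(w):.1f} GiB")
    print(f"KV cache: {to_gib(kv):.2f} GiB")
```

If a model only uses standard attention in some of its layers, only those layers should be counted in num_layers. The sketch also ignores activations and framework overhead, so treat the result as a lower bound.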

Let me know if that's useful! 🤗

@alvarobartt Could you please run the same for the FP8 variant?

Hey @saireddy, I just did! Note that hf-mem (https://github.com/alvarobartt/hf-mem) is open source, so you can also run it yourself, e.g. uvx hf-mem --model-id Qwen/Qwen3.5-397B-A17B-FP8 --experimental --kv-cache-dtype fp8. Let me know if you run into any issues 🤗

https://huggingface.co/Qwen/Qwen3.5-397B-A17B-FP8/discussions/6
