Memory Requirements to run `Qwen/Qwen3.5-397B-A17B`
#20, opened by alvarobartt
Hey all,
See below the output of hf-mem with the estimated memory required to load Qwen/Qwen3.5-397B-A17B and run inference, including the KV cache estimate.
uvx hf-mem --model-id Qwen/Qwen3.5-397B-A17B --experimental --kv-cache-dtype fp8
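For a rough idea of where numbers like these come from, here is a back-of-the-envelope sketch in Python. It is not how hf-mem computes its estimate; it only applies the usual "elements times bytes per element" rule. The parameter count is an assumption read off the "397B" in the model name, and the attention settings in the usage lines are hypothetical placeholders, not values from the model's config. The KV cache formula covers standard attention layers only, so for a model that mixes in other layer types only the standard attention layers should be counted.

```python
# Bytes per element for a few common dtypes (fp8 stores one byte per value).
BYTES_PER_DTYPE = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "fp8": 1.0}

GIB = 1024 ** 3


def weights_gib(num_params: float, dtype: str) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_DTYPE[dtype] / GIB


def kv_cache_gib(
    num_attention_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int,
    dtype: str,
) -> float:
    """KV cache size in GiB for standard attention layers.

    Each layer stores one K and one V tensor of shape
    (batch_size, num_kv_heads, seq_len, head_dim), hence the factor of 2.
    """
    elements = (
        2 * num_attention_layers * num_kv_heads * head_dim * seq_len * batch_size
    )
    return elements * BYTES_PER_DTYPE[dtype] / GIB


if __name__ == "__main__":
    # Assumption: 397e9 total parameters, taken from the model name.
    # All experts of a mixture-of-experts model have to be loaded,
    # even if only a fraction of them is active per token.
    print(f"weights (bf16): {weights_gib(397e9, 'bf16'):.1f} GiB")
    print(f"weights (fp8):  {weights_gib(397e9, 'fp8'):.1f} GiB")

    # Hypothetical attention settings, for illustration only.
    print(
        "kv cache (fp8): "
        f"{kv_cache_gib(16, 2, 256, 262_144, 1, 'fp8'):.1f} GiB"
    )
```

Real usage will be higher than these two terms, since activations, framework overhead, and any non-quantized tensors are ignored here.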
Let me know if that's useful!
Hey @saireddy, I just did! Note that hf-mem (https://github.com/alvarobartt/hf-mem) is open source, so you can run these estimates yourself, e.g. `uvx hf-mem --model-id Qwen/Qwen3.5-397B-A17B-FP8 --experimental --kv-cache-dtype fp8`. Let me know if you run into any issues!
https://huggingface.co/Qwen/Qwen3.5-397B-A17B-FP8/discussions/6
