GGUFs are here. Tutorials to run them locally

#7 by alanzhuly - opened

You can already run Qwen3-VL-4B and 8B locally on GPU, CPU, or NPU in GGUF, MLX, and NexaML formats using NexaSDK (GitHub: https://github.com/NexaAI/nexa-sdk). Note that NexaSDK is currently the only runtime that supports this model's GGUF.

Check out GGUF, MLX, and NexaML models in this HuggingFace collection: https://huggingface.co/collections/NexaAI/qwen3vl-68d46de18fdc753a7295190a
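
For anyone who wants a concrete starting point, here is a minimal Python sketch that shells out to the NexaSDK CLI to run one of the GGUF builds locally. The `nexa infer` subcommand and the model identifier are assumptions on my part, not confirmed NexaSDK syntax; check the NexaSDK README for the exact invocation.

```python
# Minimal sketch: launching a local Qwen3-VL GGUF session through the
# NexaSDK CLI from Python. Assumes the `nexa` binary is installed and
# on PATH; the subcommand and model ID below are assumptions, not
# confirmed NexaSDK syntax.
import subprocess

# Hypothetical model ID from the NexaAI Hugging Face collection above.
MODEL = "NexaAI/Qwen3-VL-4B-Instruct-GGUF"

# Start an interactive local inference session via the CLI.
subprocess.run(["nexa", "infer", MODEL], check=True)
```

The same pattern should work for the 8B build by swapping in its model ID.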

Stop spamming GGUFs all over the place.

  1. Whoever needs them will create them themselves.
  2. Let's keep the original GGUFs to one copy per model. There's no need for thousands of copies of the same model all over HF!
