GGUFs are here. Tutorials to run locally
#7 by alanzhuly
You can already run Qwen3-VL-4B & 8B locally on GPU/CPU/NPU using GGUF, MLX, and NexaML with NexaSDK (GitHub: https://github.com/NexaAI/nexa-sdk). Note that NexaSDK is currently the only runtime that supports this model's GGUF.
Check out the GGUF, MLX, and NexaML models in this HuggingFace collection: https://huggingface.co/collections/NexaAI/qwen3vl-68d46de18fdc753a7295190a
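For reference, here's roughly what running one of these from the command line looks like. This is a minimal sketch: the `nexa pull` / `nexa infer` commands and the repo id `NexaAI/Qwen3-VL-4B-Instruct-GGUF` are assumptions based on the SDK's usual pattern, so check the GitHub README for the exact syntax and model names.

```bash
# Install NexaSDK first (platform-specific installers are linked from
# https://github.com/NexaAI/nexa-sdk), then download and run the model.
# NOTE: command names and the repo id below are assumptions; verify
# against the README before running.

# Download the GGUF weights from the HuggingFace collection
nexa pull NexaAI/Qwen3-VL-4B-Instruct-GGUF

# Start an interactive chat session with the model
nexa infer NexaAI/Qwen3-VL-4B-Instruct-GGUF
```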
Stop spamming GGUFs all over the place.
- Those who need them can create them themselves.
- Let's keep the original GGUFs to one copy per model. There's no need for thousands of copies of the same model all over HF!