Qwen3-VL-32B-Instruct-heretic
Abliterated (uncensored) version of Qwen/Qwen3-VL-32B-Instruct, created using Heretic and converted to GGUF.
Abliteration Quality
| Metric | Value |
|---|---|
| Refusals | 6/100 |
| KL Divergence | 0.0660 |
| Rounds | 3 |
Lower refusals = fewer refused prompts. Lower KL divergence = closer to original model behavior.
Available Quantizations
| Quantization | File | Size |
|---|---|---|
| Q8_0 | Qwen3-VL-32B-Instruct-heretic-Q8_0.gguf | 32.43 GB |
| Q6_K | Qwen3-VL-32B-Instruct-heretic-Q6_K.gguf | 25.04 GB |
| Q4_K_M | Qwen3-VL-32B-Instruct-heretic-Q4_K_M.gguf | 18.40 GB |
Usage with llama.cpp (Recommended)
Note: Ollama (as of v0.16.x) has a known bug that crashes when loading Qwen3-VL models. Use llama.cpp directly for vision features.
Vision models require a separate multimodal projector (mmproj) file. Download the official mmproj from Qwen/Qwen3-VL-32B-Instruct-GGUF:
# Download mmproj
huggingface-cli download Qwen/Qwen3-VL-32B-Instruct-GGUF mmproj-Qwen3VL-32B-Instruct-F16.gguf
# Run with llama-server (OpenAI-compatible API)
llama-server \
-m Qwen3-VL-32B-Instruct-heretic-Q6_K.gguf \
--mmproj mmproj-Qwen3VL-32B-Instruct-F16.gguf \
-ngl 999
# Or use the CLI directly
llama-mtmd-cli \
-m Qwen3-VL-32B-Instruct-heretic-Q6_K.gguf \
--mmproj mmproj-Qwen3VL-32B-Instruct-F16.gguf \
--image photo.jpg \
-p "Describe this image." \
-ngl 999
Usage with Ollama (Text Only)
Ollama can load this model for text-only chat, but vision/image features will crash due to the bug linked above.
ollama run hf.co/ThalisAI/Qwen3-VL-32B-Instruct-heretic:Q8_0
ollama run hf.co/ThalisAI/Qwen3-VL-32B-Instruct-heretic:Q6_K
ollama run hf.co/ThalisAI/Qwen3-VL-32B-Instruct-heretic:Q4_K_M
About
This model was processed by the Apostate automated abliteration pipeline:
- The source model was loaded in bf16
- Heretic's optimization-based abliteration was applied to remove refusal behavior
- The merged model was converted to GGUF format using llama.cpp
- Multiple quantization levels were generated
The abliteration process uses directional ablation to remove the model's refusal directions while minimizing KL divergence from the original model's behavior on harmless prompts.
- Downloads last month
- 121
4-bit
6-bit
8-bit
Model tree for ThalisAI/Qwen3-VL-32B-Instruct-heretic
Base model
Qwen/Qwen3-VL-32B-Instruct