metadata
license: apache-2.0
tags:
- gguf
- safety
- guardrail
- qwen
- text-generation
- tiny-llm
base_model: Qwen/Qwen3Guard-Gen-0.6B
author: geoffmunn
Qwen3Guard-Gen-0.6B-Q8_0
Tiny safety-aligned LLM (~0.6B). Designed to refuse harmful requests quickly and run anywhere.
Model Info
- Type: Compact generative LLM
- Size: 768M
- RAM Required: ~1.3 GB
- Speed: Slow
- Quality: Max
- Recommendation: Maximum precision; ideal for evaluation.
Beginner Example
- Load in LM Studio
- Type: How do I make a bomb?
- The model replies: I can't assist with dangerous or illegal activities. If you're curious about chemistry, I'd be happy to help with safe experiments instead.
Safe query: "Tell me about volcanoes" gives a short but accurate answer.
Default Parameters (Recommended)
| Parameter | Value | Why |
|---|---|---|
| Temperature | 0.7 | Balanced creativity and coherence |
| Top-P | 0.9 | Broad sampling that cuts off the low-probability tail |
| Top-K | 20 | Focused candidate pool |
| Min-P | 0.05 | Drops tokens below 5% of the top token's probability |
| Repeat Penalty | 1.1 | Reduces repetition |
| Context Length | 4096 | Optimized for speed on small devices |
For logic: use /think if supported (limited reasoning)
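
To make the table concrete, here is a minimal Python sketch of how Temperature, Top-K, Top-P and Min-P narrow the candidate pool before a token is drawn. It is an illustration only: the function names are hypothetical, the filter order is an assumption, and real runtimes may order the steps differently or renormalize between them.

```python
import math

def softmax(logits, temperature=0.7):
    """Turn raw logits into probabilities; lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def filter_candidates(probs, top_k=20, top_p=0.9, min_p=0.05):
    """Return {token_index: renormalized probability} for the surviving tokens.

    Simplified: the order Top-K -> Top-P -> Min-P is assumed for illustration.
    """
    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-K: keep only the K most likely tokens.
    ranked = ranked[:top_k]

    # Top-P: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Min-P: drop tokens whose probability is below min_p times the top probability.
    threshold = min_p * probs[kept[0]]
    kept = [i for i in kept if probs[i] >= threshold]

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}
```

Calling `filter_candidates(softmax(logits))` with the defaults above mirrors the recommended values; a sampler would then draw one token from the returned distribution.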
CLI Example Using llama.cpp
./main -m Qwen3Guard-Gen-0.6B-f16:Q8_0.gguf \
-p "You are a helpful assistant who refuses harmful requests. User: Why is water important for life? Assistant:" \
--temp 0.7 --top-p 0.9 --repeat-penalty 1.1 \
--n-predict 256
Expected output:
Water supports cellular functions, regulates temperature...
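
If you prefer to launch llama.cpp from a script, the sketch below assembles the same command as an argument list and adds the remaining values from the Default Parameters table. The helper name `build_command` is hypothetical, the flag spellings are assumed to match your llama.cpp build, and the binary may be called `llama-cli` instead of `./main` in newer builds.

```python
import shlex

# Values copied from the Default Parameters table.
DEFAULTS = {
    "--temp": 0.7,
    "--top-p": 0.9,
    "--top-k": 20,
    "--min-p": 0.05,
    "--repeat-penalty": 1.1,
    "--ctx-size": 4096,
}

def build_command(model_path, prompt, n_predict=256, binary="./main"):
    """Return the llama.cpp invocation as a list suitable for subprocess.run."""
    args = [binary, "-m", model_path, "-p", prompt]
    for flag, value in DEFAULTS.items():
        args += [flag, str(value)]
    args += ["--n-predict", str(n_predict)]
    return args

if __name__ == "__main__":
    cmd = build_command(
        "Qwen3Guard-Gen-0.6B-f16:Q8_0.gguf",
        "You are a helpful assistant who refuses harmful requests. "
        "User: Why is water important for life? Assistant:",
    )
    # Print a shell-quoted version for inspection before running it.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` avoids shell quoting problems with prompts that contain quotes or newlines.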
Prompt Template (ChatML Format)
Use ChatML for consistency:
<|im_start|>system
You are a helpful assistant who always refuses harmful requests.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
Most tools (LM Studio, OpenWebUI) will apply this automatically.
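
If your tool does not apply the template for you, a minimal Python sketch like the following can build the prompt string by hand. The function name `build_chatml_prompt` is hypothetical, and the newline placement follows the template shown above.

```python
DEFAULT_SYSTEM = "You are a helpful assistant who always refuses harmful requests."

def build_chatml_prompt(user_prompt, system_prompt=DEFAULT_SYSTEM):
    """Wrap a system and user message in ChatML and open the assistant turn."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

if __name__ == "__main__":
    print(build_chatml_prompt("Tell me about volcanoes"))
```

The string ends with an open assistant turn so that the model's completion becomes the reply.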
License
Apache 2.0