Can we get a distilled 2B or 4B?

#18
by Abc7347 - opened

Given the impressive performance of the Qwen3.5 architecture, it would be incredibly helpful for the community to have smaller, distilled versions of this model (such as 2B or 4B parameters).

Thx! πŸ™πŸ»

I think we'll get 2B, 9B, and 35B-A5B models in the coming days.


There is a version of Claude distilled into a 4B model; I think Qwen/Qwen3.5-397B-A17B can be distilled into a 2-4B model.
I'm just saying. I've only distilled BERT, nothing else, so I might be wrong.
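
For anyone whose distillation experience stops at BERT: the classic white-box recipe is just a blended loss, a temperature-softened KL divergence against the teacher's logits plus the usual cross-entropy on the ground-truth tokens. A minimal PyTorch sketch (shapes and hyperparameters are illustrative, not tied to any Qwen checkpoint):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,   # (num_tokens, vocab_size)
                      teacher_logits: torch.Tensor,   # (num_tokens, vocab_size)
                      labels: torch.Tensor,           # (num_tokens,)
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # KL divergence between temperature-softened teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # rescale so gradients roughly match the T=1 case
    # Ordinary next-token cross-entropy against the ground-truth tokens.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

With a 397B-A17B teacher you usually can't run both models side by side like this, which is why "distilling" big chat models in practice tends to mean training the student on sampled teacher responses instead.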

We could take qwen3-4b-2507 and distill it on a dataset of Qwen3.5 responses to a broad prompt set.
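
Something like this for building that dataset. The teacher repo id, prompt file, and generation settings are all placeholders, and realistically you'd serve the teacher with vLLM or an API rather than loading it with `device_map="auto"`:

```python
# Sample responses from a Qwen3.5 teacher for a list of prompts and dump them
# as JSONL suitable for supervised fine-tuning of the 4B student.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "Qwen/Qwen3.5-397B-A17B"  # placeholder: needs serious hardware or a served endpoint

tokenizer = AutoTokenizer.from_pretrained(TEACHER)
model = AutoModelForCausalLM.from_pretrained(TEACHER, device_map="auto")

prompts = [line.strip() for line in open("prompts.txt") if line.strip()]

with open("distill_data.jsonl", "w") as out:
    for prompt in prompts:
        messages = [{"role": "user", "content": prompt}]
        inputs = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        output = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
        completion = tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
        # Each line becomes one SFT example for the student.
        out.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")
```

The resulting JSONL can then go through any standard SFT pipeline (trl's SFTTrainer, LLaMA-Factory, etc.) to fine-tune qwen3-4b-2507 on the teacher's outputs.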

TeichAI to the rescue!

I need a replacement for Qwen3-0.6B πŸ™„ β€” something more capable at role-playing and still high enough quality to run in llama.cpp with Q8_0.
