Can we get a distilled 2B or 4B?

#18
by Abc7347 - opened

Given the impressive performance of the Qwen3.5 architecture, it would be incredibly helpful for the community to have smaller, distilled versions of this model (such as 2B or 4B parameters).

Thx! πŸ™πŸ»

I think we'll get 2B, 9B, and 35B-A5B models in the coming days.


There is a version of Claude distilled into a 4B model; I think Qwen/Qwen3.5-397B-A17B can be distilled into a 2-4B model.
I'm just saying. I've only distilled BERT, nothing else, so I might be wrong.
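
For anyone whose distillation experience stops at BERT: the classic white-box recipe is just a blended loss, a temperature-softened KL divergence against the teacher's logits plus the usual cross-entropy on the ground-truth tokens. A minimal PyTorch sketch (shapes and hyperparameters are illustrative, not tied to any Qwen checkpoint):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,   # (num_tokens, vocab_size)
                      teacher_logits: torch.Tensor,   # (num_tokens, vocab_size)
                      labels: torch.Tensor,           # (num_tokens,)
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # KL divergence between temperature-softened teacher and student distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # rescale so gradients roughly match the T=1 case
    # Ordinary next-token cross-entropy against the ground-truth tokens.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

With a 397B-A17B teacher you usually can't run both models side by side like this, which is why "distilling" big chat models in practice tends to mean training the student on sampled teacher responses instead.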

We could take qwen3-4b-2507 and distill it on a dataset of Qwen3.5 responses to a broad prompt set.
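
Something like this for building that dataset. The teacher repo id, prompt file, and generation settings are all placeholders, and realistically you'd serve the teacher with vLLM or an API rather than loading it with `device_map="auto"`:

```python
# Sample responses from a Qwen3.5 teacher for a list of prompts and dump them
# as JSONL suitable for supervised fine-tuning of the 4B student.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "Qwen/Qwen3.5-397B-A17B"  # placeholder: needs serious hardware or a served endpoint

tokenizer = AutoTokenizer.from_pretrained(TEACHER)
model = AutoModelForCausalLM.from_pretrained(TEACHER, device_map="auto")

prompts = [line.strip() for line in open("prompts.txt") if line.strip()]

with open("distill_data.jsonl", "w") as out:
    for prompt in prompts:
        messages = [{"role": "user", "content": prompt}]
        inputs = tokenizer.apply_chat_template(
            messages, add_generation_prompt=True, return_tensors="pt"
        ).to(model.device)
        output = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
        completion = tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
        # Each line becomes one SFT example for the student.
        out.write(json.dumps({"prompt": prompt, "completion": completion}) + "\n")
```

The resulting JSONL can then go through any standard SFT pipeline (trl's SFTTrainer, LLaMA-Factory, etc.) to fine-tune qwen3-4b-2507 on the teacher's outputs.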

TeichAI to the rescue!

I need a replacement for Qwen3-0.6B πŸ™„ β€” something more capable at role-playing and still high enough quality to run in llama.cpp with Q8_0.
