will there be a smaller version?
A huge, powerful model is great, but what about smaller models for local use?
I'm sure folks will create REAP/REAP versions of this model if they haven't already uploaded them here somewhere. That said, I doubt you'll get REAP + quants that fit in something like 32 GB of VRAM without a terrible loss in competency. TBH, I expect consumer-market GPUs to eventually catch up to the "need" of something like this; high-end workstation cards are already close to running a REAP + quant of this model.
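To put some rough numbers on the "won't fit in 32 GB" claim, here's a back-of-envelope sketch. Every figure in it is an illustrative assumption (the parameter count, the pruning ratio, the ~4.25 bits/weight typical of a Q4_K_M-style quant, the flat runtime overhead), not a measured spec of this model:

```python
# Back-of-envelope VRAM estimate for a pruned + quantized model.
# All numbers below are illustrative assumptions, not measured figures.

def vram_gb(params_b: float, bits_per_weight: float, overhead_gb: float = 2.0) -> float:
    """Weight-only footprint (params * bits / 8) plus a flat allowance
    for KV cache, activations, and runtime buffers."""
    weight_gb = params_b * bits_per_weight / 8  # params in billions -> GB
    return weight_gb + overhead_gb

# Hypothetical example: a 235B-param model pruned to ~60% of its experts,
# then quantized to ~4.25 bits/weight (roughly Q4_K_M territory).
pruned_params_b = 235 * 0.6  # ~141B params left after pruning
print(round(vram_gb(pruned_params_b, 4.25), 1))  # ~76.9 GB, far over 32 GB
```

Even under generous assumptions, the pruned + quantized weights alone land well above 32 GB, which is why a workstation card with 96 GB is a much more realistic target than a consumer 32 GB one.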
I know that doesn't directly answer your question. I also expect Qwen might release smaller 3.5 versions of this architecture, but I doubt they'll have the same competency as this model. As with many previous releases, they put out the large model first, then distill it to create smaller ones. That isn't cheap; I'm just thankful Qwen does it. If they DON'T create smaller versions/distillations of this model, I'm sure someone else will.
@RecViking I just bought an RTX PRO 6000. At least given current memory pricing and the market, I'm not holding my breath that even a new GPU generation from team green will mean more VRAM... one can hope, however.