Model "thinks" for too long
They acknowledged the overthinking issue and said they are working to fix it without losing performance. I hope it doesn't take long, because the overthinking sometimes makes the model unusable, especially on some riddles and problems where it spent over 25 minutes thinking, which is overkill.
Hope they fix it soon. The performance is actually good for a 3B model and I really like using it, but this issue needs to be resolved!
Let’s join first; winning is just a matter of time.
Did you try these parameters: --temp 0.6 --top-p 0.95 --top-k 40 --min-p 0.01?
From my observations, these might be the best settings, and they may help with thinking time.
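For anyone unsure what these four settings do, here is a rough Python sketch of how temperature, top-k, top-p and min-p are commonly described as narrowing the set of candidate next tokens. It is an illustration only: the function name `filter_probs` is made up, and the order in which the filters are applied is an assumption. Real inference engines may order the steps differently or renormalize between them.

```python
import math

def filter_probs(logits, temp=0.6, top_k=40, top_p=0.95, min_p=0.01):
    """Return {token_index: probability} for tokens that survive the filters.

    Hypothetical helper for illustration; the filter order is an assumption.
    """
    # Temperature: divide logits before softmax; values below 1 sharpen the distribution.
    scaled = [x / temp for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep at most the k most likely tokens.
    order = order[:top_k]

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Min-p: drop tokens less likely than min_p times the most likely token.
    threshold = min_p * probs[order[0]]
    kept = [i for i in kept if probs[i] >= threshold]

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

result = filter_probs([2.0, 1.0, 0.0, -5.0])
```

With these example logits only the two most likely tokens survive, because together they already cover more than 95% of the probability mass. A narrower candidate set like this makes the output less erratic, which might be why these settings seem to shorten the thinking.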
I’ve tried the first two but not the last two. I’ll give them a try. Thanks!
