Chain-of-Thought or Chain-of-Mimicry? The Over-SFT problem in Nanbeige 4.1-3B aka "I_Should_X"
1
#26 opened 1 day ago
by
srs6901
Very impressive for 3b.
👍
4
#24 opened 2 days ago
by
crownelius
What should Top-k be set to?
👀
1
1
#23 opened 2 days ago
by
daydreamwarrior
Inference broken with Jan
👀
🚀
3
#22 opened 2 days ago
by
redaihf
vLLM OpenAI Docker - tool config
1
#21 opened 2 days ago
by
johner420
Does anybody know the size of the context window for Nanbeige4.1-3B?
1
#20 opened 3 days ago
by
test333333
You've got us hooked now; can't wait for the release of version 4.2. Could you please provide an ETA or an approximation of when it might be released?
➕
1
3
#17 opened 4 days ago
by
Why-T
Training data and inference scripts with tool calling, web search, and so on, plus training scripts
❤️
1
2
#16 opened 4 days ago
by
snapo
Any Plans for an Instruct Model?
🤗
🔥
6
5
#15 opened 5 days ago
by
Ashacorporation
this model thinks a lot
➕
👍
4
2
#14 opened 5 days ago
by
shihab456321
🚀 Starting Agent Loop Tool Efficiency Test
😎
❤️
7
#13 opened 5 days ago
by
bukit
Model "thinks" for too long
👍
3
5
#12 opened 5 days ago
by
Moisha1985
What is this? Is this even real?
🤗
❤️
6
4
#11 opened 6 days ago
by
Dword
Will the training code and datasets be open-sourced?
🚀
7
#10 opened 7 days ago
by
Sourajit123
SGLang inference
❤️
2
3
#9 opened 7 days ago
by
owao
Add evaluation results for GPQA, HLE
1
#8 opened 7 days ago
by
SaylorTwift
Very Impressive!
❤️
14
8
#7 opened 8 days ago
by
cob05
[Bug] Jinja template parse failure
2
#6 opened 8 days ago
by
iwr-redmond
Insane performance
❤️
9
3
#4 opened 9 days ago
by
AntDX316
When an AI Model Solves College-Level Math and Physics — On a Phone
❤️
7
18
#2 opened 9 days ago
by
Javedalam