Request: NVFP4 version of MiniMax-M2.5-REAP-139B (to fit on a single RTX 6000 Pro)
8
#7 opened 2 days ago by mondovero
vLLM error for kv weight scaling - workaround
7
#6 opened 5 days ago by ShaunEvansMD
Thanks for your effort
5
#5 opened 5 days ago by darkstar3537
fp8 kv cache
15
#4 opened 6 days ago by festr2
KeyError: '110.w1.input_scale' with TRT
2
#3 opened 6 days ago by guanwenyu1995
"w1_weight_scale_2 must match w3_weight_scale_2. Accuracy may be affected."
👍 1
19
#2 opened 7 days ago by zenmagnets
Here's the vLLM recipe I'm using with 2x RTX Pro 6000
👍 3
15
#1 opened 8 days ago by zenmagnets