exdysa committed · verified
Commit adaba4b · 1 Parent(s): e2c5b0c

Update README.md

Files changed (1): README.md (+8 -2)
README.md CHANGED
@@ -37,9 +37,15 @@ tags:
 - 4-bit
 ```
 
-MiniMax-M2.5-REAP-172B-A10B-GGUF-Q4 is a 172 billion parameter MiniMax M2.5 model with 25% of its experts pruned with REAP (Router-weighted Expert Activation Pruning), then converted to GGUF with llama.cpp and static Q4 quantized.
+# MiniMax-M2.5-REAP-172B-A10B-GGUF-Q4
 
-Patch: Reuploaded quantization from `llama.cpp` main@8110, `gguf` @0.17.1. On initial push testing on M4 device and Ollama the model rambled compared to M2.1-REAP. Source used to convert: `llama.cpp` main@7952 quantization.
+This is a 172 billion parameter MiniMax M2.5 model with 25% of its experts pruned with REAP (Router-weighted Expert Activation Pruning), then converted to GGUF with llama.cpp and static Q4 quantized.
+
+> [!NOTE]
+> ## Patched 20/02/26
+> Reuploaded quantization from `llama.cpp` main@8110, `gguf` @0.17.1.
+> On initial push, testing on an M4 device with Ollama showed the model rambling compared to M2.1-REAP.
+> Original conversion: `llama.cpp` main@7952 quantization.
 
 Command sequence using llama.cpp built from source and `ports` llama-quantize:
 
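The README's actual command sequence continues below this hunk. For orientation, here is a minimal sketch of a typical llama.cpp convert-and-static-quantize sequence of this kind; the checkpoint path, the `b8110` release tag, and the `Q4_K_M` quant type are assumptions, not the repo's verbatim commands.

```bash
# Sketch only: paths, the b8110 tag, and the Q4_K_M type are assumptions,
# not the commands actually used for this repo.

# Build llama.cpp from source (assuming main@8110 corresponds to release tag b8110).
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
git checkout b8110
cmake -B build
cmake --build build --target llama-quantize -j

# Python dependencies for the conversion script; pin gguf to the version
# named in the patch note.
pip install -r requirements.txt
pip install gguf==0.17.1

# Convert the pruned HF checkpoint to a full-precision GGUF.
python convert_hf_to_gguf.py /path/to/MiniMax-M2.5-REAP-172B-A10B \
  --outfile minimax-m2.5-reap-f16.gguf --outtype f16

# Static 4-bit quantization: no importance matrix, so scales come from the
# weight tensors alone.
./build/bin/llama-quantize minimax-m2.5-reap-f16.gguf \
  minimax-m2.5-reap-Q4_K_M.gguf Q4_K_M
```

Static quantization here means no imatrix calibration pass: the quantization scales are derived from the weights alone, which keeps the pipeline simple at some quality cost versus imatrix-calibrated quants.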