Commit b1f1a3e (verified) by marksverdhei, 1 parent: e06b7e0

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +50 -0
README.md ADDED
@@ -0,0 +1,50 @@
+ ---
+ license: other
+ base_model: MiniMaxAI/MiniMax-M2.5
+ tags:
+ - gguf
+ - llama.cpp
+ - quantized
+ - moe
+ ---
+
+ # MiniMax-M2.5 GGUF
+
+ GGUF quantizations of [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5), created with [llama.cpp](https://github.com/ggerganov/llama.cpp).
+
+ ## Model Details
+
+ | Property | Value |
+ |----------|-------|
+ | **Base model** | MiniMaxAI/MiniMax-M2.5 |
+ | **Architecture** | Mixture of Experts (MoE) |
+ | **Total parameters** | 230B |
+ | **Active parameters** | 10B per token |
+ | **Layers** | 62 |
+ | **Total experts** | 256 |
+ | **Active experts per token** | 8 |
+ | **Source precision** | FP8 (`float8_e4m3fn`) |
+
+ ## Available Quantizations
+
+ | Quantization | Size | Description |
+ |-------------|------|-------------|
+ | Q8_0 | ~227 GB | 8-bit quantization, highest quality |
+ | Q4_K_M | — | 4-bit K-quant (medium), good balance of quality and size |
+ | IQ3_S | — | 3-bit importance quantization (small), compact |
+ | Q2_K | — | 2-bit K-quant, smallest size |
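
As a rough cross-check of the Q8_0 row, the sketch below estimates a file size from a parameter count and a bits-per-weight figure. The helper name `estimate_gguf_size` is hypothetical, and the estimate assumes every tensor is stored in the same format and ignores metadata, so treat the result as a ballpark only. Q8_0 packs each block of 32 weights as 32 int8 values plus one fp16 scale, which works out to 8.5 bits per weight.

```bash
# estimate_gguf_size PARAMS_IN_BILLIONS BITS_PER_WEIGHT
# Hypothetical helper: prints an approximate size in decimal GB and binary GiB.
# Assumes all tensors use the same format and ignores file metadata.
estimate_gguf_size() {
  LC_ALL=C awk -v p="$1" -v bpw="$2" 'BEGIN {
    bytes = p * 1e9 * bpw / 8
    printf "%.1f GB / %.1f GiB\n", bytes / 1e9, bytes / (1024 * 1024 * 1024)
  }'
}

# Q8_0: 34 bytes per block of 32 weights = 8.5 bits per weight.
estimate_gguf_size 230 8.5
```

For 230B parameters this lands near 228 GiB, which is close to the ~227 GB listed for Q8_0 if that figure is read as GiB. That reading is an assumption, not something the table states.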
+
+ ## Usage
+
+ These GGUFs can be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) and compatible frontends.
+
+ ```bash
+ # Example with llama-cli
+ llama-cli -m MiniMax-M2.5-Q4_K_M.gguf -p "Hello" -n 128
+ ```
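
With files this large, an interrupted or mislabeled download is worth catching before a long load attempt. The sketch below is one possible sanity check, not part of llama.cpp: it only confirms that a file starts with the 4-byte `GGUF` magic that GGUF files begin with. The function name `is_gguf` is hypothetical, and passing this check says nothing about whether the rest of the file is complete.

```bash
# is_gguf FILE
# Hypothetical helper: succeeds if FILE exists and begins with the
# 4-byte magic "GGUF". It does not validate anything beyond those bytes.
is_gguf() {
  [ -f "$1" ] && [ "$(head -c 4 "$1")" = "GGUF" ]
}

# Possible use before loading (same file name as the example above):
# is_gguf MiniMax-M2.5-Q4_K_M.gguf && llama-cli -m MiniMax-M2.5-Q4_K_M.gguf -p "Hello" -n 128
```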
+
+ ## Notes
+
+ - The source model is stored in FP8 (`float8_e4m3fn`), so Q8_0 is the closest match to the source precision among these quants. The conversion from FP8 to block-scaled 8-bit integers is not bit-exact, but the added error is expected to be small.
+ - This is a large MoE model. All 256 experts have to be loaded even though only 8 are active per token, so even the smallest quant (Q2_K) requires significant memory.
+ - Quantized from the official [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) weights.
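
To put the memory note in numbers, the sketch below computes what share of the model is used per token, taking its inputs from the Model Details table. The helper name `active_pct` is hypothetical. It illustrates why memory scales with the total parameter count while per-token compute scales with the active count.

```bash
# active_pct ACTIVE TOTAL
# Hypothetical helper: prints ACTIVE as a percentage of TOTAL, one decimal place.
active_pct() {
  LC_ALL=C awk -v a="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * a / t }'
}

# Figures from the Model Details table.
echo "experts active per token:    $(active_pct 8 256)%"
echo "parameters active per token: $(active_pct 10 230)%"
```

Only a few percent of the weights take part in any single token, yet the full set still has to be stored, which is why quantization reduces the footprint but does not make it small.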