lmmy committed on
Commit 4faf150 · verified · 1 Parent(s): 3479f38

Add files using upload-large-folder tool

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +1 -0
  2. README.md +29 -0
  3. chat_template.jinja +159 -0
  4. config.json +606 -0
  5. generation_config.json +9 -0
  6. model-00001-of-00047.safetensors +3 -0
  7. model-00002-of-00047.safetensors +3 -0
  8. model-00003-of-00047.safetensors +3 -0
  9. model-00004-of-00047.safetensors +3 -0
  10. model-00006-of-00047.safetensors +3 -0
  11. model-00007-of-00047.safetensors +3 -0
  12. model-00008-of-00047.safetensors +3 -0
  13. model-00010-of-00047.safetensors +3 -0
  14. model-00012-of-00047.safetensors +3 -0
  15. model-00013-of-00047.safetensors +3 -0
  16. model-00014-of-00047.safetensors +3 -0
  17. model-00016-of-00047.safetensors +3 -0
  18. model-00017-of-00047.safetensors +3 -0
  19. model-00018-of-00047.safetensors +3 -0
  20. model-00019-of-00047.safetensors +3 -0
  21. model-00020-of-00047.safetensors +3 -0
  22. model-00021-of-00047.safetensors +3 -0
  23. model-00022-of-00047.safetensors +3 -0
  24. model-00023-of-00047.safetensors +3 -0
  25. model-00024-of-00047.safetensors +3 -0
  26. model-00025-of-00047.safetensors +3 -0
  27. model-00026-of-00047.safetensors +3 -0
  28. model-00027-of-00047.safetensors +3 -0
  29. model-00028-of-00047.safetensors +3 -0
  30. model-00029-of-00047.safetensors +3 -0
  31. model-00030-of-00047.safetensors +3 -0
  32. model-00031-of-00047.safetensors +3 -0
  33. model-00032-of-00047.safetensors +3 -0
  34. model-00033-of-00047.safetensors +3 -0
  35. model-00034-of-00047.safetensors +3 -0
  36. model-00035-of-00047.safetensors +3 -0
  37. model-00036-of-00047.safetensors +3 -0
  38. model-00037-of-00047.safetensors +3 -0
  39. model-00038-of-00047.safetensors +3 -0
  40. model-00039-of-00047.safetensors +3 -0
  41. model-00040-of-00047.safetensors +3 -0
  42. model-00041-of-00047.safetensors +3 -0
  43. model-00042-of-00047.safetensors +3 -0
  44. model-00043-of-00047.safetensors +3 -0
  45. model-00044-of-00047.safetensors +3 -0
  46. model-00045-of-00047.safetensors +3 -0
  47. model-00046-of-00047.safetensors +3 -0
  48. model-00047-of-00047.safetensors +3 -0
  49. model.safetensors.index.json +0 -0
  50. tokenizer.json +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ pipeline_tag: text-generation
+ license: other
+ license_name: modified-mit
+ license_link: https://github.com/MiniMax-AI/MiniMax-M2.5/blob/main/LICENSE
+ library_name: transformers
+ tags:
+ - mlx
+ base_model: MiniMaxAI/MiniMax-M2.5
+ ---
+ ## 💫 Community Model> MiniMax-M2.5 by MiniMaxAI
+
+ _👾 [LM Studio](https://lmstudio.ai) Community models highlights program. Highlighting new & noteworthy models by the community. Join the conversation on [Discord](https://discord.gg/aPQfnNkxGC)_.
+
+ **Model creator**: [MiniMaxAI](https://huggingface.co/MiniMaxAI)<br>
+ **Original model**: [MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5)<br>
+ **MLX quantization**: provided by [LM Studio team](https://x.com/lmstudio) using [mlx_lm](https://github.com/ml-explore/mlx-lm)<br>
+
+ ## Technical Details
+
+ 8-bit quantized version of MiniMax-M2.5 using MLX, optimized for Apple Silicon.
+
+ ## Special thanks
+
+ 🙏 Special thanks to the [Apple Machine Learning Research](https://github.com/ml-explore) team for creating [MLX](https://github.com/ml-explore/mlx).
+
+ ## Disclaimers
+
+ LM Studio is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. LM Studio does not endorse, support, represent or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. LM Studio may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. LM Studio disclaims all warranties or guarantees about the accuracy, reliability or benefits of the Community Models. LM Studio further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted or available at any time or location, or error-free, viruses-free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through LM Studio.
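Review note on the README's "Technical Details": the `quantization` block in this commit's config.json records `bits: 8`, `group_size: 64`, and `mode: affine`. The sketch below illustrates what group-wise affine quantization means under those settings. It is plain Python, not MLX; the function names are hypothetical, and MLX's actual kernels, rounding, and bit packing may differ.

```python
def quantize_group(weights, bits=8):
    """Affine-quantize one group of floats to integers in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    # A constant group would give a zero scale; fall back to 1.0 in that case.
    scale = (hi - lo) / levels or 1.0
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo  # lo plays the role of the per-group bias


def dequantize_group(q, scale, bias):
    """Reconstruct approximate floats: w is roughly scale * q + bias."""
    return [scale * v + bias for v in q]


def quantize(weights, group_size=64, bits=8):
    """Split a flat weight list into groups and quantize each independently."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]
```

Each group of 64 weights stores its own scale and bias, so the reconstruction error per weight is bounded by half of that group's scale.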
chat_template.jinja ADDED
@@ -0,0 +1,159 @@
+ {# ----------‑‑‑ special token variables ‑‑‑---------- #}
+ {%- set toolcall_begin_token = '<minimax:tool_call>' -%}
+ {%- set toolcall_end_token = '</minimax:tool_call>' -%}
+ {#- Tool Rendering Functions ============================================== -#}
+ {%- macro render_tool_namespace(namespace_name, tool_list) -%}
+ {%- for tool in tool_list -%}
+ <tool>{{ tool.function | tojson(ensure_ascii=False) }}</tool>
+ {% endfor -%}
+ {%- endmacro -%}
+ {%- macro visible_text(content) -%}
+ {%- if content is string -%}
+ {{ content }}
+ {%- elif content is iterable and content is not mapping -%}
+ {%- for item in content -%}
+ {%- if item is mapping and item.type == 'text' -%}
+ {{- item.text }}
+ {%- elif item is string -%}
+ {{- item }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{- content }}
+ {%- endif -%}
+ {%- endmacro -%}
+ {#- System Message Construction ============================================ -#}
+ {%- macro build_system_message(system_message) -%}
+ {%- if system_message and system_message.content -%}
+ {{- visible_text(system_message.content) }}
+ {%- else -%}
+ {%- if model_identity is not defined -%}
+ {%- set model_identity = "You are a helpful assistant. Your name is MiniMax-M2.5 and is built by MiniMax." -%}
+ {%- endif -%}
+ {{- model_identity }}
+ {%- endif -%}
+
+ {#- Handle current_date -#}
+ {%- if system_message and system_message.current_date -%}
+ {{- '\n' ~ 'Current date: ' + system_message.current_date }}
+ {%- endif -%}
+ {#- Handle current_location -#}
+ {%- if system_message and system_message.current_location -%}
+ {{- '\n' ~ 'Current location: ' + system_message.current_location }}
+ {%- endif -%}
+ {%- endmacro -%}
+ {#- Main Template Logic ================================================= -#}
+ {#- Extract system message (only first message if it's system) -#}
+ {%- set system_message = none -%}
+ {%- set conversation_messages = messages -%}
+ {%- if messages and messages[0].role == "system" -%}
+ {%- set system_message = messages[0] -%}
+ {%- set conversation_messages = messages[1:] -%}
+ {%- endif -%}
+ {#- Get the last user message turn, for interleaved thinking -#}
+ {%- set ns = namespace(last_user_index=-1) %}
+ {% for m in conversation_messages %}
+ {%- if m.role == 'user' %}
+ {% set ns.last_user_index = loop.index0 -%}
+ {%- endif %}
+ {%- endfor %}
+ {#- Render system message -#}
+ {{- ']~!b[' ~ ']~b]system' ~ '\n' }}
+ {{- build_system_message(system_message) }}
+ {#- Render tools if available -#}
+ {%- if tools -%}
+ {{- '\n\n' ~ '# Tools' ~ '\n' ~ 'You may call one or more tools to assist with the user query.\nHere are the tools available in JSONSchema format:' ~ '\n' }}
+ {{- '\n' ~ '<tools>' ~ '\n' }}
+ {{- render_tool_namespace("functions", tools) }}
+ {{- '</tools>' ~ '\n\n' }}
+ {{- 'When making tool calls, use XML format to invoke tools and pass parameters:' ~ '\n' }}
+ {{- '\n' ~ toolcall_begin_token }}
+ <invoke name="tool-name-1">
+ <parameter name="param-key-1">param-value-1</parameter>
+ <parameter name="param-key-2">param-value-2</parameter>
+ ...
+ </invoke>
+ {{- '\n' ~ toolcall_end_token }}
+ {%- endif -%}
+ {{- '[e~[\n' }}
+
+ {#- Render messages -#}
+ {%- set last_tool_call = namespace(name=none) -%}
+ {%- for message in conversation_messages -%}
+ {%- if message.role == 'assistant' -%}
+ {#- Only render reasoning_content if no user message follows -#}
+ {{- ']~b]ai' ~ '\n' }}
+
+ {%- set reasoning_content = '' %}
+ {%- set content = visible_text(message.content) %}
+ {%- if message.reasoning_content is string %}
+ {%- set reasoning_content = message.reasoning_content %}
+ {%- else %}
+ {%- if '</think>' in content %}
+ {%- set reasoning_content = content.split('</think>')[0].strip('\n').split('<think>')[-1].strip('\n') %}
+ {%- set content = content.split('</think>')[-1].strip('\n') %}
+ {%- endif %}
+ {%- endif %}
+ {%- if reasoning_content and loop.index0 > ns.last_user_index -%}
+ {{- '<think>' ~ '\n' ~ reasoning_content ~ '\n' ~ '</think>' ~ '\n\n' }}
+ {%- endif -%}
+ {%- if content -%}
+ {{- content }}
+ {%- endif -%}
+ {%- if message.tool_calls -%}
+ {{- '\n' ~ toolcall_begin_token ~ '\n' }}
+
+ {%- for tool_call in message.tool_calls -%}
+ {%- if tool_call.function %}
+ {%- set tool_call = tool_call.function %}
+ {%- endif %}
+ {{- '<invoke name="' + tool_call.name + '">' }}
+ {% set _args = tool_call.arguments %}
+ {%- for k, v in _args.items() %}
+ {{- '<parameter name="' + k + '">' }}
+ {{- v | tojson(ensure_ascii=False) if v is not string else v }}
+ {{- '</parameter>' }}
+ {% endfor %}
+ {{- '</invoke>' ~ '\n' }}
+ {%- endfor -%}
+
+ {{- toolcall_end_token}}
+ {%- set last_tool_call.name = message.tool_calls[-1].name -%}
+ {%- else -%}
+ {%- set last_tool_call.name = none -%}
+ {%- endif -%}
+ {{- '[e~[' ~ '\n' }}
+
+ {%- elif message.role == 'tool' -%}
+ {%- if last_tool_call.name is none -%}
+ {{- raise_exception("Message has tool role, but there was no previous assistant message with a tool call!") }}
+ {%- endif -%}
+ {%- if loop.first or (conversation_messages[loop.index0 - 1].role != 'tool') -%}
+ {{- ']~b]tool' }}
+ {%- endif -%}
+ {%- if message.content is string -%}
+ {{- '\n<response>' }}
+ {{- message.content }}
+ {{- '</response>' }}
+ {%- else -%}
+ {%- for tr in message.content -%}
+ {{- '\n<response>' }}
+ {{- tr.output if tr.output is defined else (tr.text if tr.type == 'text' and tr.text is defined else tr) }}
+ {{- '\n</response>' }}
+ {%- endfor -%}
+ {%- endif -%}
+ {%- if loop.last or (conversation_messages[loop.index0 + 1].role != 'tool') -%}
+ {{- '[e~[\n' -}}
+ {%- endif -%}
+
+ {%- elif message.role == 'user' -%}
+ {{- ']~b]user' ~ '\n' }}
+ {{- visible_text(message.content) }}
+ {{- '[e~[' ~ '\n' }}
+ {%- endif -%}
+ {%- endfor -%}
+
+ {#- Generation prompt -#}
+ {%- if add_generation_prompt -%}
+ {{- ']~b]ai' ~ '\n' ~ '<think>' ~ '\n' }}
+ {%- endif -%}
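Review note on chat_template.jinja: the turn layout it produces is easier to check against a small reference. The following is a rough Python sketch of the tool-free path only, assuming every message `content` is a plain string, no `reasoning_content`, `current_date`, or `current_location` fields are set, and the Jinja whitespace control emits no stray whitespace. `render_prompt` and `split_thinking` are hypothetical helper names, not part of the commit.

```python
DEFAULT_IDENTITY = (
    "You are a helpful assistant. Your name is MiniMax-M2.5 and is built by MiniMax."
)


def split_thinking(content):
    """Mirror the template's '</think>' split: return (reasoning, visible_content)."""
    if "</think>" not in content:
        return "", content
    parts = content.split("</think>")
    reasoning = parts[0].strip("\n").split("<think>")[-1].strip("\n")
    return reasoning, parts[-1].strip("\n")


def render_prompt(messages, add_generation_prompt=True):
    """Render system/user/assistant string messages in the template's turn format."""
    system, conv = None, list(messages)
    if conv and conv[0]["role"] == "system":
        system, conv = conv[0], conv[1:]
    # Index of the last user turn; reasoning from earlier assistant turns is dropped.
    last_user = max((i for i, m in enumerate(conv) if m["role"] == "user"), default=-1)

    out = ["]~!b[]~b]system\n"]
    out.append(system["content"] if system and system.get("content") else DEFAULT_IDENTITY)
    out.append("[e~[\n")
    for i, m in enumerate(conv):
        if m["role"] == "user":
            out.append("]~b]user\n" + m["content"] + "[e~[\n")
        elif m["role"] == "assistant":
            reasoning, content = split_thinking(m["content"])
            out.append("]~b]ai\n")
            if reasoning and i > last_user:
                out.append("<think>\n" + reasoning + "\n</think>\n\n")
            out.append(content + "[e~[\n")
    if add_generation_prompt:
        out.append("]~b]ai\n<think>\n")
    return "".join(out)
```

In this sketch, as in the template, the `<think>` block of an assistant turn is kept only when no user message comes after it, and the generation prompt always opens a new `<think>` block.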
config.json ADDED
@@ -0,0 +1,606 @@
1
+ {
2
+ "architectures": [
3
+ "MiniMaxM2ForCausalLM"
4
+ ],
5
+ "attn_type_list": [
6
+ 1,
7
+ 1,
8
+ 1,
9
+ 1,
10
+ 1,
11
+ 1,
12
+ 1,
13
+ 1,
14
+ 1,
15
+ 1,
16
+ 1,
17
+ 1,
18
+ 1,
19
+ 1,
20
+ 1,
21
+ 1,
22
+ 1,
23
+ 1,
24
+ 1,
25
+ 1,
26
+ 1,
27
+ 1,
28
+ 1,
29
+ 1,
30
+ 1,
31
+ 1,
32
+ 1,
33
+ 1,
34
+ 1,
35
+ 1,
36
+ 1,
37
+ 1,
38
+ 1,
39
+ 1,
40
+ 1,
41
+ 1,
42
+ 1,
43
+ 1,
44
+ 1,
45
+ 1,
46
+ 1,
47
+ 1,
48
+ 1,
49
+ 1,
50
+ 1,
51
+ 1,
52
+ 1,
53
+ 1,
54
+ 1,
55
+ 1,
56
+ 1,
57
+ 1,
58
+ 1,
59
+ 1,
60
+ 1,
61
+ 1,
62
+ 1,
63
+ 1,
64
+ 1,
65
+ 1,
66
+ 1,
67
+ 1
68
+ ],
69
+ "auto_map": {
70
+ "AutoConfig": "configuration_minimax_m2.MiniMaxM2Config",
71
+ "AutoModelForCausalLM": "modeling_minimax_m2.MiniMaxM2ForCausalLM"
72
+ },
73
+ "eos_token_id": 200020,
74
+ "head_dim": 128,
75
+ "hidden_act": "silu",
76
+ "hidden_size": 3072,
77
+ "intermediate_size": 1536,
78
+ "max_position_embeddings": 196608,
79
+ "model_type": "minimax_m2",
80
+ "mtp_transformer_layers": 1,
81
+ "num_attention_heads": 48,
82
+ "num_experts_per_tok": 8,
83
+ "num_hidden_layers": 62,
84
+ "num_key_value_heads": 8,
85
+ "num_local_experts": 256,
86
+ "num_mtp_modules": 3,
87
+ "qk_norm_type": "per_layer",
88
+ "quantization": {
89
+ "group_size": 64,
90
+ "bits": 8,
91
+ "mode": "affine",
92
+ "model.layers.0.block_sparse_moe.gate": {
93
+ "group_size": 64,
94
+ "bits": 8
95
+ },
96
+ "model.layers.1.block_sparse_moe.gate": {
97
+ "group_size": 64,
98
+ "bits": 8
99
+ },
100
+ "model.layers.2.block_sparse_moe.gate": {
101
+ "group_size": 64,
102
+ "bits": 8
103
+ },
104
+ "model.layers.3.block_sparse_moe.gate": {
105
+ "group_size": 64,
106
+ "bits": 8
107
+ },
108
+ "model.layers.4.block_sparse_moe.gate": {
109
+ "group_size": 64,
110
+ "bits": 8
111
+ },
112
+ "model.layers.5.block_sparse_moe.gate": {
113
+ "group_size": 64,
114
+ "bits": 8
115
+ },
116
+ "model.layers.6.block_sparse_moe.gate": {
117
+ "group_size": 64,
118
+ "bits": 8
119
+ },
120
+ "model.layers.7.block_sparse_moe.gate": {
121
+ "group_size": 64,
122
+ "bits": 8
123
+ },
124
+ "model.layers.8.block_sparse_moe.gate": {
125
+ "group_size": 64,
126
+ "bits": 8
127
+ },
128
+ "model.layers.9.block_sparse_moe.gate": {
129
+ "group_size": 64,
130
+ "bits": 8
131
+ },
132
+ "model.layers.10.block_sparse_moe.gate": {
133
+ "group_size": 64,
134
+ "bits": 8
135
+ },
136
+ "model.layers.11.block_sparse_moe.gate": {
137
+ "group_size": 64,
138
+ "bits": 8
139
+ },
140
+ "model.layers.12.block_sparse_moe.gate": {
141
+ "group_size": 64,
142
+ "bits": 8
143
+ },
144
+ "model.layers.13.block_sparse_moe.gate": {
145
+ "group_size": 64,
146
+ "bits": 8
147
+ },
148
+ "model.layers.14.block_sparse_moe.gate": {
149
+ "group_size": 64,
150
+ "bits": 8
151
+ },
152
+ "model.layers.15.block_sparse_moe.gate": {
153
+ "group_size": 64,
154
+ "bits": 8
155
+ },
156
+ "model.layers.16.block_sparse_moe.gate": {
157
+ "group_size": 64,
158
+ "bits": 8
159
+ },
160
+ "model.layers.17.block_sparse_moe.gate": {
161
+ "group_size": 64,
162
+ "bits": 8
163
+ },
164
+ "model.layers.18.block_sparse_moe.gate": {
165
+ "group_size": 64,
166
+ "bits": 8
167
+ },
168
+ "model.layers.19.block_sparse_moe.gate": {
169
+ "group_size": 64,
170
+ "bits": 8
171
+ },
172
+ "model.layers.20.block_sparse_moe.gate": {
173
+ "group_size": 64,
174
+ "bits": 8
175
+ },
176
+ "model.layers.21.block_sparse_moe.gate": {
177
+ "group_size": 64,
178
+ "bits": 8
179
+ },
180
+ "model.layers.22.block_sparse_moe.gate": {
181
+ "group_size": 64,
182
+ "bits": 8
183
+ },
184
+ "model.layers.23.block_sparse_moe.gate": {
185
+ "group_size": 64,
186
+ "bits": 8
187
+ },
188
+ "model.layers.24.block_sparse_moe.gate": {
189
+ "group_size": 64,
190
+ "bits": 8
191
+ },
192
+ "model.layers.25.block_sparse_moe.gate": {
193
+ "group_size": 64,
194
+ "bits": 8
195
+ },
196
+ "model.layers.26.block_sparse_moe.gate": {
197
+ "group_size": 64,
198
+ "bits": 8
199
+ },
200
+ "model.layers.27.block_sparse_moe.gate": {
201
+ "group_size": 64,
202
+ "bits": 8
203
+ },
204
+ "model.layers.28.block_sparse_moe.gate": {
205
+ "group_size": 64,
206
+ "bits": 8
207
+ },
208
+ "model.layers.29.block_sparse_moe.gate": {
209
+ "group_size": 64,
210
+ "bits": 8
211
+ },
212
+ "model.layers.30.block_sparse_moe.gate": {
213
+ "group_size": 64,
214
+ "bits": 8
215
+ },
216
+ "model.layers.31.block_sparse_moe.gate": {
217
+ "group_size": 64,
218
+ "bits": 8
219
+ },
220
+ "model.layers.32.block_sparse_moe.gate": {
221
+ "group_size": 64,
222
+ "bits": 8
223
+ },
224
+ "model.layers.33.block_sparse_moe.gate": {
225
+ "group_size": 64,
226
+ "bits": 8
227
+ },
228
+ "model.layers.34.block_sparse_moe.gate": {
229
+ "group_size": 64,
230
+ "bits": 8
231
+ },
232
+ "model.layers.35.block_sparse_moe.gate": {
233
+ "group_size": 64,
234
+ "bits": 8
235
+ },
236
+ "model.layers.36.block_sparse_moe.gate": {
237
+ "group_size": 64,
238
+ "bits": 8
239
+ },
240
+ "model.layers.37.block_sparse_moe.gate": {
241
+ "group_size": 64,
242
+ "bits": 8
243
+ },
244
+ "model.layers.38.block_sparse_moe.gate": {
245
+ "group_size": 64,
246
+ "bits": 8
247
+ },
248
+ "model.layers.39.block_sparse_moe.gate": {
249
+ "group_size": 64,
250
+ "bits": 8
251
+ },
252
+ "model.layers.40.block_sparse_moe.gate": {
253
+ "group_size": 64,
254
+ "bits": 8
255
+ },
256
+ "model.layers.41.block_sparse_moe.gate": {
257
+ "group_size": 64,
258
+ "bits": 8
259
+ },
260
+ "model.layers.42.block_sparse_moe.gate": {
261
+ "group_size": 64,
262
+ "bits": 8
263
+ },
264
+ "model.layers.43.block_sparse_moe.gate": {
265
+ "group_size": 64,
266
+ "bits": 8
267
+ },
268
+ "model.layers.44.block_sparse_moe.gate": {
269
+ "group_size": 64,
270
+ "bits": 8
271
+ },
272
+ "model.layers.45.block_sparse_moe.gate": {
273
+ "group_size": 64,
274
+ "bits": 8
275
+ },
276
+ "model.layers.46.block_sparse_moe.gate": {
277
+ "group_size": 64,
278
+ "bits": 8
279
+ },
280
+ "model.layers.47.block_sparse_moe.gate": {
281
+ "group_size": 64,
282
+ "bits": 8
283
+ },
284
+ "model.layers.48.block_sparse_moe.gate": {
285
+ "group_size": 64,
286
+ "bits": 8
287
+ },
288
+ "model.layers.49.block_sparse_moe.gate": {
289
+ "group_size": 64,
290
+ "bits": 8
291
+ },
292
+ "model.layers.50.block_sparse_moe.gate": {
293
+ "group_size": 64,
294
+ "bits": 8
295
+ },
296
+ "model.layers.51.block_sparse_moe.gate": {
297
+ "group_size": 64,
298
+ "bits": 8
299
+ },
300
+ "model.layers.52.block_sparse_moe.gate": {
301
+ "group_size": 64,
302
+ "bits": 8
303
+ },
304
+ "model.layers.53.block_sparse_moe.gate": {
305
+ "group_size": 64,
306
+ "bits": 8
307
+ },
308
+ "model.layers.54.block_sparse_moe.gate": {
309
+ "group_size": 64,
310
+ "bits": 8
311
+ },
312
+ "model.layers.55.block_sparse_moe.gate": {
313
+ "group_size": 64,
314
+ "bits": 8
315
+ },
316
+ "model.layers.56.block_sparse_moe.gate": {
317
+ "group_size": 64,
318
+ "bits": 8
319
+ },
320
+ "model.layers.57.block_sparse_moe.gate": {
321
+ "group_size": 64,
322
+ "bits": 8
323
+ },
324
+ "model.layers.58.block_sparse_moe.gate": {
325
+ "group_size": 64,
326
+ "bits": 8
327
+ },
328
+ "model.layers.59.block_sparse_moe.gate": {
329
+ "group_size": 64,
330
+ "bits": 8
331
+ },
332
+ "model.layers.60.block_sparse_moe.gate": {
333
+ "group_size": 64,
334
+ "bits": 8
335
+ },
336
+ "model.layers.61.block_sparse_moe.gate": {
337
+ "group_size": 64,
338
+ "bits": 8
339
+ }
340
+ },
341
+ "quantization_config": {
342
+ "group_size": 64,
343
+ "bits": 8,
344
+ "mode": "affine",
345
+ "model.layers.0.block_sparse_moe.gate": {
346
+ "group_size": 64,
347
+ "bits": 8
348
+ },
349
+ "model.layers.1.block_sparse_moe.gate": {
350
+ "group_size": 64,
351
+ "bits": 8
352
+ },
353
+ "model.layers.2.block_sparse_moe.gate": {
354
+ "group_size": 64,
355
+ "bits": 8
356
+ },
357
+ "model.layers.3.block_sparse_moe.gate": {
358
+ "group_size": 64,
359
+ "bits": 8
360
+ },
361
+ "model.layers.4.block_sparse_moe.gate": {
362
+ "group_size": 64,
363
+ "bits": 8
364
+ },
365
+ "model.layers.5.block_sparse_moe.gate": {
366
+ "group_size": 64,
367
+ "bits": 8
368
+ },
369
+ "model.layers.6.block_sparse_moe.gate": {
370
+ "group_size": 64,
371
+ "bits": 8
372
+ },
373
+ "model.layers.7.block_sparse_moe.gate": {
374
+ "group_size": 64,
375
+ "bits": 8
376
+ },
377
+ "model.layers.8.block_sparse_moe.gate": {
378
+ "group_size": 64,
379
+ "bits": 8
380
+ },
381
+ "model.layers.9.block_sparse_moe.gate": {
382
+ "group_size": 64,
383
+ "bits": 8
384
+ },
385
+ "model.layers.10.block_sparse_moe.gate": {
386
+ "group_size": 64,
387
+ "bits": 8
388
+ },
389
+ "model.layers.11.block_sparse_moe.gate": {
390
+ "group_size": 64,
391
+ "bits": 8
392
+ },
393
+ "model.layers.12.block_sparse_moe.gate": {
394
+ "group_size": 64,
395
+ "bits": 8
396
+ },
397
+ "model.layers.13.block_sparse_moe.gate": {
398
+ "group_size": 64,
399
+ "bits": 8
400
+ },
401
+ "model.layers.14.block_sparse_moe.gate": {
402
+ "group_size": 64,
403
+ "bits": 8
404
+ },
405
+ "model.layers.15.block_sparse_moe.gate": {
406
+ "group_size": 64,
407
+ "bits": 8
408
+ },
409
+ "model.layers.16.block_sparse_moe.gate": {
410
+ "group_size": 64,
411
+ "bits": 8
412
+ },
413
+ "model.layers.17.block_sparse_moe.gate": {
414
+ "group_size": 64,
415
+ "bits": 8
416
+ },
417
+ "model.layers.18.block_sparse_moe.gate": {
418
+ "group_size": 64,
419
+ "bits": 8
420
+ },
421
+ "model.layers.19.block_sparse_moe.gate": {
422
+ "group_size": 64,
423
+ "bits": 8
424
+ },
425
+ "model.layers.20.block_sparse_moe.gate": {
426
+ "group_size": 64,
427
+ "bits": 8
428
+ },
429
+ "model.layers.21.block_sparse_moe.gate": {
430
+ "group_size": 64,
431
+ "bits": 8
432
+ },
433
+ "model.layers.22.block_sparse_moe.gate": {
434
+ "group_size": 64,
435
+ "bits": 8
436
+ },
437
+ "model.layers.23.block_sparse_moe.gate": {
438
+ "group_size": 64,
439
+ "bits": 8
440
+ },
441
+ "model.layers.24.block_sparse_moe.gate": {
442
+ "group_size": 64,
443
+ "bits": 8
444
+ },
445
+ "model.layers.25.block_sparse_moe.gate": {
446
+ "group_size": 64,
447
+ "bits": 8
448
+ },
449
+ "model.layers.26.block_sparse_moe.gate": {
450
+ "group_size": 64,
451
+ "bits": 8
452
+ },
453
+ "model.layers.27.block_sparse_moe.gate": {
454
+ "group_size": 64,
455
+ "bits": 8
456
+ },
457
+ "model.layers.28.block_sparse_moe.gate": {
458
+ "group_size": 64,
459
+ "bits": 8
460
+ },
461
+ "model.layers.29.block_sparse_moe.gate": {
462
+ "group_size": 64,
463
+ "bits": 8
464
+ },
465
+ "model.layers.30.block_sparse_moe.gate": {
466
+ "group_size": 64,
467
+ "bits": 8
468
+ },
469
+ "model.layers.31.block_sparse_moe.gate": {
470
+ "group_size": 64,
471
+ "bits": 8
472
+ },
473
+ "model.layers.32.block_sparse_moe.gate": {
474
+ "group_size": 64,
475
+ "bits": 8
476
+ },
477
+ "model.layers.33.block_sparse_moe.gate": {
478
+ "group_size": 64,
479
+ "bits": 8
480
+ },
481
+ "model.layers.34.block_sparse_moe.gate": {
482
+ "group_size": 64,
483
+ "bits": 8
484
+ },
485
+ "model.layers.35.block_sparse_moe.gate": {
486
+ "group_size": 64,
487
+ "bits": 8
488
+ },
489
+ "model.layers.36.block_sparse_moe.gate": {
490
+ "group_size": 64,
491
+ "bits": 8
492
+ },
493
+ "model.layers.37.block_sparse_moe.gate": {
494
+ "group_size": 64,
495
+ "bits": 8
496
+ },
497
+ "model.layers.38.block_sparse_moe.gate": {
498
+ "group_size": 64,
499
+ "bits": 8
500
+ },
501
+ "model.layers.39.block_sparse_moe.gate": {
502
+ "group_size": 64,
503
+ "bits": 8
504
+ },
505
+ "model.layers.40.block_sparse_moe.gate": {
506
+ "group_size": 64,
507
+ "bits": 8
508
+ },
509
+ "model.layers.41.block_sparse_moe.gate": {
510
+ "group_size": 64,
511
+ "bits": 8
512
+ },
513
+ "model.layers.42.block_sparse_moe.gate": {
514
+ "group_size": 64,
515
+ "bits": 8
516
+ },
517
+ "model.layers.43.block_sparse_moe.gate": {
518
+ "group_size": 64,
519
+ "bits": 8
520
+ },
521
+ "model.layers.44.block_sparse_moe.gate": {
522
+ "group_size": 64,
523
+ "bits": 8
524
+ },
525
+ "model.layers.45.block_sparse_moe.gate": {
526
+ "group_size": 64,
527
+ "bits": 8
528
+ },
529
+ "model.layers.46.block_sparse_moe.gate": {
530
+ "group_size": 64,
531
+ "bits": 8
532
+ },
533
+ "model.layers.47.block_sparse_moe.gate": {
534
+ "group_size": 64,
535
+ "bits": 8
536
+ },
537
+ "model.layers.48.block_sparse_moe.gate": {
538
+ "group_size": 64,
539
+ "bits": 8
540
+ },
541
+ "model.layers.49.block_sparse_moe.gate": {
542
+ "group_size": 64,
543
+ "bits": 8
544
+ },
545
+ "model.layers.50.block_sparse_moe.gate": {
546
+ "group_size": 64,
547
+ "bits": 8
548
+ },
549
+ "model.layers.51.block_sparse_moe.gate": {
550
+ "group_size": 64,
551
+ "bits": 8
552
+ },
553
+ "model.layers.52.block_sparse_moe.gate": {
554
+ "group_size": 64,
555
+ "bits": 8
556
+ },
557
+ "model.layers.53.block_sparse_moe.gate": {
558
+ "group_size": 64,
559
+ "bits": 8
560
+ },
561
+ "model.layers.54.block_sparse_moe.gate": {
562
+ "group_size": 64,
563
+ "bits": 8
564
+ },
565
+ "model.layers.55.block_sparse_moe.gate": {
566
+ "group_size": 64,
567
+ "bits": 8
568
+ },
569
+ "model.layers.56.block_sparse_moe.gate": {
570
+ "group_size": 64,
571
+ "bits": 8
572
+ },
573
+ "model.layers.57.block_sparse_moe.gate": {
574
+ "group_size": 64,
575
+ "bits": 8
576
+ },
577
+ "model.layers.58.block_sparse_moe.gate": {
578
+ "group_size": 64,
579
+ "bits": 8
580
+ },
581
+ "model.layers.59.block_sparse_moe.gate": {
582
+ "group_size": 64,
583
+ "bits": 8
584
+ },
585
+ "model.layers.60.block_sparse_moe.gate": {
586
+ "group_size": 64,
587
+ "bits": 8
588
+ },
589
+ "model.layers.61.block_sparse_moe.gate": {
590
+ "group_size": 64,
591
+ "bits": 8
592
+ }
593
+ },
594
+ "rms_norm_eps": 1e-06,
595
+ "rope_theta": 5000000,
596
+ "rotary_dim": 64,
597
+ "scoring_func": "sigmoid",
598
+ "shared_intermediate_size": 0,
599
+ "tie_word_embeddings": false,
600
+ "transformers_version": "4.46.1",
601
+ "use_cache": true,
602
+ "use_mtp": true,
603
+ "use_qk_norm": true,
604
+ "use_routing_bias": true,
605
+ "vocab_size": 200064
606
+ }
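Review note on config.json: a few derived quantities make the attention and MoE shape easier to reason about. The sketch below copies the relevant values from the diff and computes them. It assumes, without confirmation from the source, that all 62 layers keep a full key/value cache and that cache entries are stored in 16 bits; the `derived` helper is hypothetical and ignores the MTP modules.

```python
CONFIG = {
    "hidden_size": 3072,
    "head_dim": 128,
    "num_attention_heads": 48,
    "num_key_value_heads": 8,
    "num_hidden_layers": 62,
    "num_local_experts": 256,
    "num_experts_per_tok": 8,
    "max_position_embeddings": 196608,
}


def derived(cfg, kv_bytes_per_value=2):
    """Compute attention widths, expert sparsity, and a rough KV-cache size."""
    q_width = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_width = cfg["num_key_value_heads"] * cfg["head_dim"]
    # Keys and values (factor 2) for every layer, per generated or prompt token.
    per_token = 2 * cfg["num_hidden_layers"] * kv_width * kv_bytes_per_value
    return {
        "q_proj_width": q_width,
        "kv_proj_width": kv_width,
        "queries_per_kv_head": cfg["num_attention_heads"] // cfg["num_key_value_heads"],
        "active_expert_fraction": cfg["num_experts_per_tok"] / cfg["num_local_experts"],
        "kv_cache_bytes_per_token": per_token,
        "kv_cache_gib_at_max_context": per_token * cfg["max_position_embeddings"] / 2**30,
    }


if __name__ == "__main__":
    for name, value in derived(CONFIG).items():
        print(f"{name}: {value}")
```

Under these assumptions the query projection (48 × 128 = 6144) is twice as wide as `hidden_size`, six query heads share each key/value head, and 8 of 256 experts are active per token.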
generation_config.json ADDED
@@ -0,0 +1,9 @@
+ {
+     "bos_token_id": 200019,
+     "do_sample": true,
+     "eos_token_id": 200020,
+     "temperature": 1.0,
+     "top_p": 0.95,
+     "top_k": 40,
+     "transformers_version": "4.46.1"
+ }
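Review note on generation_config.json: the defaults (`temperature` 1.0, `top_k` 40, `top_p` 0.95, `do_sample` true) describe a sampling pipeline that can be sketched in a few lines. This is an illustration in pure Python with hypothetical function names; applying top-k before top-p and renormalising in between is an assumption about ordering, and real inference stacks may differ.

```python
import math
import random


def filtered_distribution(logits, temperature=1.0, top_k=40, top_p=0.95):
    """Return {token_id: prob} after temperature, top-k, then top-p filtering."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    # Top-k: keep the k most likely token ids, most likely first.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:top_k]
    total = sum(weights[i] for i in order)
    # Top-p: keep the shortest prefix whose renormalised mass reaches top_p.
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += weights[i] / total
        if mass >= top_p:
            break
    norm = sum(weights[i] for i in kept)
    return {i: weights[i] / norm for i in kept}


def sample(logits, rng=random, **kwargs):
    """Draw one token id from the filtered distribution."""
    dist = filtered_distribution(logits, **kwargs)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible.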
model-00001-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d01399535f4c7f06e21e7bdde51c90f833bb3a5e6d44a5db09668c04017e8d3e
+ size 4598782614
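Review note on the `.safetensors` entries: each one is a three-line Git LFS pointer (`version`, `oid sha256:…`, `size`) rather than the weights themselves. The sketch below shows one way a downloaded shard could be checked against its pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and it assumes the pointer contains only the three keys seen in this diff.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Stream a file and compare its size and hash with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

Streaming in chunks keeps memory use flat, which matters for shards of roughly 5 GB like the ones listed in this commit.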
model-00002-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:409a2518796c87f35b6c45dd4bca122110d25411bea98ad9ba1ee0c479f1b60a
+ size 5181537162
model-00003-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d9b77317cad72228cab0e9f563f649783b7fc1de3b50522b5d29a494a161ad1d
+ size 5181537168
model-00004-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:aa4ac52f6a02069dea7122d33ca58a2c0ae4d0029af8218a9f3083a0bd48ec39
+ size 5229244520
model-00006-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:330651fec38932a91ff09930cb670914c43e07545af68ae7eefa30a8e78ede8a
+ size 5181537174
model-00007-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06cb5249c7430d39478ed2a58c5b87bddfceda261d05aeca0563560ec6e68b4b
+ size 5229244550
model-00008-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1a404d54845aeb6136f7269d2035c1b084b70f898fc5a2f801f9174849c5c45a
+ size 5181537206
model-00010-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d9b2a640eebff70e8cf6b1b34ac78b97d21009edb071dd43ee717bc9ab995d26
+ size 5229244604
model-00012-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f62936689bb9270dd35df0ea2538bf8636303de70a8b84a574caefa2be35ff56
+ size 5181537234
model-00013-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9243cf1e64de18cd479b26005a8309e44e6f72fbd3ee14ee299742d845dc8de6
+ size 5229244594
model-00014-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:485a9c7906795a60b164f36c7a2f4dbdf14d6a0555f6395a92484f95d998c93e
+ size 5181537232
model-00016-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:800603c19d09a130182721089f03992796a5aacd4051ef27f7fb5c912be2f10a
+ size 5229244568
model-00017-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b6bd12984c00aa6d9c3b41d60a2e79e0305bcd848886485eb0981091bcf52821
+ size 5181537236
model-00018-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f5fcd0f60d44facb2b006015d9ca1307cb5513533512fce3724ea105bf6498ee
+ size 5181537226
model-00019-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6091f175101410776e0ee6f10ac13f581204fc16b0fe5de3e4aa5ca1041108d5
+ size 5229244580
model-00020-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bbfa2c0689b90d6f923be3f2c159a0648400b123e70d6899c5c02362c21235dc
+ size 5181537232
model-00021-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:15538a8f83ead1fc1935a13b06da57759dd6e5dd3458d566f96ae55a27fbb4ce
+ size 5181537234
model-00022-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eed8de603db8d2fbf49c91376ca9154feb9c1e78c0c1f763a69bd84f096ba567
+ size 5229244586
model-00023-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:072c46d81de6b77a7f17f8cd2e74bfce1bd449491bec6c1a7667c6f4a9de9105
+ size 5181537236
model-00024-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8332bad73fab9c482454b87a461ea2fad2e03e7ff818ce2ae0a6667a0a3e0a7e
+ size 5181537206
model-00025-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:42668a2d4a549d32c85c8b43ed9f13454fc30b5fe37ffc7ca7a9672d758e2446
+ size 5229244610
model-00026-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f620767204096a6284077a074849f6cb70f7c2e0a48999864c126c4a38f6b3a8
3
+ size 5181537222
model-00027-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f829512c3cfe5b3a103cec7a2446edac596577297b59fa8543904b3f61e33a8d
+ size 5181537198
model-00028-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8aa8d712ffa9cb3346db2e4281fd052727efbb2fb3a767c28cc0660904878f91
+ size 5229244570
model-00029-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d05266ca7d7f9046b4fa0f831dfb678a77eab7b529e0882a12af05f89dab8de0
+ size 5181537216
model-00030-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b54d99239008658c0e4f5811b07521a10217cedf40239b0045aac7983913211c
+ size 5181537202
model-00031-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:44f3401533b3630a9f36c334e6f56f5bf3885a0a2194685d65964fb297401573
+ size 5229244614
model-00032-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1e5b7ed6d032a805a9a4ec0c8fb51d9dde3d9a6afdcefd0972ed9d1ace9c6a99
+ size 5181537236
model-00033-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ecd41d7f326ed0f656b893469c0ed0336a2528f72ad5d5af20fdc383d1edb5d5
+ size 5181537230
model-00034-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f731262f01dea5888542fdf34f5ea940053f0e0c013e923309935fff6f533d8e
+ size 5229244602
model-00035-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e0c0ed047f6676d9e0121a3e62c2a1f6cce1d189336d620de0ff7849cf82bf2
+ size 5181537236
model-00036-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:945dac2db11d8ed2535b6286b2943c555f5bc1d39398b167ae856bfa07f0e669
+ size 5181537230
model-00037-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bd318601732e8853cd6110022b3a17772e10087bcc554d057d54680e2004234f
+ size 5229244584
model-00038-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f335fb2e9f0706acf2cd90bac4c2a20c77de98542acf4e5b87995511e99e4e8f
+ size 5181537202
model-00039-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f70801100e3db0db81ec6f05a5a5233e5d4d442cbf2a23ff90c4baaaa408c227
+ size 5181537230
model-00040-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:871888e413c1dce6cfc5f63c399ad923bb7b173dea682bb1eab66e20abe646e7
+ size 5229244560
model-00041-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7e3192622c4ace0b8cbde91927f81dcb21d65557f533d28939c74436a9cae124
+ size 5181537240
model-00042-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:37b0bae6729062c9879744a81037127ef901c49afa9fe61b4c78a8b648a9fdee
+ size 5181537218
model-00043-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7bd8fe0ff57a14186164320a1573492deee5a8db4f1ce901c486ed884d9f63c0
+ size 5229244560
model-00044-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4cc40761ce6361189bd0af67a5e3cde814f075e84649e59932cdb04c75cbbf73
+ size 5181537240
model-00045-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f379d8862c5da75c3a41245aef229a99df67d667e6aff79507932bc51c79127c
+ size 5181537230
model-00046-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:03e26c66d85f9e69b4d6160c4e796f89fb5eb8ed03d515d397ae41ff70715f8a
+ size 5229244594
model-00047-of-00047.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0a1a5ea26a0112eb73564c5bc8b09248dfa1893afb63f32d27147008e3510344
+ size 4503401440
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7b81e5e5cba2b169e86a0771825a927e9d41b4c4484ded4a286410f41f702f17
+ size 15523144