mazesmazes committed (verified) · Commit be3323d · 1 Parent(s): e0ce22a

Training in progress, step 2500
.gitattributes CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ tokenizer_config.json filter=lfs diff=lfs merge=lfs -text
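With this line, tokenizer_config.json is now stored via Git LFS, so a clone made without LFS installed leaves a small pointer stub where the real file should be. A minimal sketch for detecting that situation in a local checkout (the local path is an assumption, not part of the commit; `git lfs pull` fetches the real payload):

```python
# Sketch: detect whether a checked-out file is still a Git LFS pointer
# rather than the actual payload. LFS pointer files always begin with
# this fixed spec line.
from pathlib import Path

def is_lfs_pointer(path: str) -> bool:
    head = Path(path).read_bytes()[:100]
    return head.startswith(b"version https://git-lfs.github.com/spec/v1")

print(is_lfs_pointer("tokenizer_config.json"))  # True until `git lfs pull`
```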
README.md CHANGED
@@ -1,80 +1,57 @@
  ---
  library_name: transformers
+ model_name: tiny-audio
  tags:
  - generated_from_trainer
- datasets:
- - generator
- model-index:
- - name: tiny-audio
-   results: []
+ - trl
+ - sft
+ licence: license
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # tiny-audio
-
- This model is a fine-tuned version of [](https://huggingface.co/) on the generator dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.4543
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0001
- - train_batch_size: 8
- - eval_batch_size: 32
- - seed: 123
- - gradient_accumulation_steps: 3
- - total_train_batch_size: 24
- - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 1000
- - training_steps: 20000
-
- ### Training results
-
- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:-----:|:-----:|:---------------:|
- | 8.8546 | 0.05 | 1000 | 3.8075 |
- | 0.8286 | 0.1 | 2000 | 0.5193 |
- | 0.7909 | 0.15 | 3000 | 0.4701 |
- | 0.6955 | 0.2 | 4000 | 0.4581 |
- | 0.599 | 0.25 | 5000 | 0.4434 |
- | 0.6159 | 0.3 | 6000 | 0.4353 |
- | 0.5764 | 0.35 | 7000 | 0.4260 |
- | 0.602 | 0.05 | 8000 | 0.4298 |
- | 0.5363 | 0.1 | 9000 | 0.4430 |
- | 0.5643 | 0.15 | 10000 | 0.4636 |
- | 0.5135 | 0.2 | 11000 | 0.4423 |
- | 0.4419 | 0.25 | 12000 | 0.4473 |
- | 0.4848 | 0.3 | 13000 | 0.4539 |
- | 0.4692 | 0.35 | 14000 | 0.4481 |
- | 0.5154 | 0.05 | 15000 | 0.4482 |
- | 0.4736 | 0.1 | 16000 | 0.4522 |
- | 0.5097 | 0.15 | 17000 | 0.4537 |
- | 0.4729 | 0.2 | 18000 | 0.4542 |
- | 0.4142 | 0.25 | 19000 | 0.4543 |
- | 0.4718 | 0.3 | 20000 | 0.4543 |
-
- ### Framework versions
-
- - Transformers 4.57.1
- - Pytorch 2.8.0+cu128
- - Datasets 4.4.1
- - Tokenizers 0.22.1
+ # Model Card for tiny-audio
+
+ This model is a fine-tuned version of [None](https://huggingface.co/None).
+ It has been trained using [TRL](https://github.com/huggingface/trl).
+
+ ## Quick start
+
+ ```python
+ from transformers import pipeline
+
+ question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+ generator = pipeline("text-generation", model="mazesmazes/tiny-audio", device="cuda")
+ output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
+ print(output["generated_text"])
+ ```
+
+ ## Training procedure
+
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/alexkroman-alex-kroman/tiny-audio-tts/runs/icuo0fvj)
+
+ This model was trained with SFT.
+
+ ### Framework versions
+
+ - TRL: 0.26.1
+ - Transformers: 4.57.3
+ - Pytorch: 2.8.0+cu128
+ - Datasets: 3.6.0
+ - Tokenizers: 0.22.1
+
+ ## Citations
+
+ Cite TRL as:
+
+ ```bibtex
+ @misc{vonwerra2022trl,
+ title = {{TRL: Transformer Reinforcement Learning}},
+ author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
+ year = 2020,
+ journal = {GitHub repository},
+ publisher = {GitHub},
+ howpublished = {\url{https://github.com/huggingface/trl}}
+ }
+ ```
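The quick start in the new card relies on the high-level `pipeline` wrapper. For reference, a sketch of the same generation with the lower-level API — the model id comes from the card, but that this checkpoint loads as a plain causal LM with the chat template shipped in this commit is an assumption:

```python
# Sketch: the pipeline example spelled out with AutoTokenizer/AutoModelForCausalLM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mazesmazes/tiny-audio")
model = AutoModelForCausalLM.from_pretrained(
    "mazesmazes/tiny-audio", torch_dtype=torch.bfloat16
)

messages = [{"role": "user", "content": "Past or future: which would you visit?"}]
# The chat template (see chat_template.jinja below) wraps messages in
# <|im_start|>/<|im_end|> markers and appends the assistant prompt.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```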
added_tokens.json ADDED
The diff for this file is too large to render. See raw diff
 
chat_template.jinja CHANGED
@@ -1,94 +1,6 @@
- {# ───── defaults ───── #}
- {%- if enable_thinking is not defined -%}
- {%- set enable_thinking = true -%}
- {%- endif -%}
-
- {# ───── reasoning mode ───── #}
- {%- if enable_thinking -%}
- {%- set reasoning_mode = "/think" -%}
- {%- else -%}
- {%- set reasoning_mode = "/no_think" -%}
- {%- endif -%}
-
- {# ───── header (system message) ───── #}
- {{- "<|im_start|>system\n" -}}
-
- {%- if messages[0].role == "system" -%}
- {%- set system_message = messages[0].content -%}
- {%- if "/no_think" in system_message -%}
- {%- set reasoning_mode = "/no_think" -%}
- {%- elif "/think" in system_message -%}
- {%- set reasoning_mode = "/think" -%}
- {%- endif -%}
- {%- set custom_instructions = system_message.replace("/no_think", "").replace("/think", "").rstrip() -%}
- {%- endif -%}
-
- {%- if "/system_override" in system_message -%}
- {{- custom_instructions.replace("/system_override", "").rstrip() -}}
- {{- "<|im_end|>\n" -}}
- {%- else -%}
- {{- "## Metadata\n\n" -}}
- {{- "Knowledge Cutoff Date: June 2025\n" -}}
- {%- set today = strftime_now("%d %B %Y") -%}
- {{- "Today Date: " ~ today ~ "\n" -}}
- {{- "Reasoning Mode: " + reasoning_mode + "\n\n" -}}
-
- {{- "## Custom Instructions\n\n" -}}
- {%- if custom_instructions -%}
- {{- custom_instructions + "\n\n" -}}
- {%- elif reasoning_mode == "/think" -%}
- {{- "You are a helpful AI assistant named SmolLM, trained by Hugging Face. Your role as an assistant involves thoroughly exploring questions through a systematic thinking process before providing the final precise and accurate solutions. This requires engaging in a comprehensive cycle of analysis, summarizing, exploration, reassessment, reflection, backtracking, and iteration to develop well-considered thinking process. Please structure your response into two main sections: Thought and Solution using the specified format: <think> Thought section </think> Solution section. In the Thought section, detail your reasoning process in steps. Each step should include detailed considerations such as analysing questions, summarizing relevant findings, brainstorming new ideas, verifying the accuracy of the current steps, refining any errors, and revisiting previous steps. In the Solution section, based on various attempts, explorations, and reflections from the Thought section, systematically present the final solution that you deem correct. The Solution section should be logical, accurate, and concise and detail necessary steps needed to reach the conclusion.\n\n" -}}
- {%- else -%}
- {{- "You are a helpful AI assistant named SmolLM, trained by Hugging Face.\n\n" -}}
- {%- endif -%}
-
- {%- if xml_tools or python_tools or tools -%}
- {{- "### Tools\n\n" -}}
- {%- if xml_tools or tools -%}
- {%- if tools -%}
- {%- set xml_tools = tools -%}
- {%- endif -%}
- {%- set ns = namespace(xml_tool_string="You may call one or more functions to assist with the user query.\nYou are provided with function signatures within <tools></tools> XML tags:\n\n<tools>\n") -%}
- {%- for tool in xml_tools[:] -%} {# The slicing makes sure that xml_tools is a list #}
- {%- set ns.xml_tool_string = ns.xml_tool_string ~ (tool | string) ~ "\n" -%}
- {%- endfor -%}
- {%- set xml_tool_string = ns.xml_tool_string + "</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call>" -%}
- {{- xml_tool_string -}}
- {%- endif -%}
- {%- if python_tools -%}
- {%- set ns = namespace(python_tool_string="When you send a message containing Python code between '<code>' and '</code>' tags, it will be executed in a stateful Jupyter notebook environment, and you will then be given the output to continued reasoning in an agentic loop.\n\nYou can use the following tools in your python code like regular functions:\n<tools>\n") -%}
- {%- for tool in python_tools[:] -%} {# The slicing makes sure that python_tools is a list #}
- {%- set ns.python_tool_string = ns.python_tool_string ~ (tool | string) ~ "\n" -%}
- {%- endfor -%}
- {%- set python_tool_string = ns.python_tool_string + "</tools>\n\nThe state persists between code executions: so variables that you define in one step are still available thereafter." -%}
- {{- python_tool_string -}}
- {%- endif -%}
- {{- "\n\n" -}}
- {{- "<|im_end|>\n" -}}
- {%- endif -%}
- {%- endif -%}
- {# ───── main loop ───── #}
- {%- for message in messages -%}
- {%- set content = message.content if message.content is string else "" -%}
- {%- if message.role == "user" -%}
- {{ "<|im_start|>" + message.role + "\n" + content + "<|im_end|>\n" }}
- {%- elif message.role == "assistant" -%}
- {% generation %}
- {%- if reasoning_mode == "/think" -%}
- {{ "<|im_start|>assistant\n" + content.lstrip("\n") + "<|im_end|>\n" }}
- {%- else -%}
- {{ "<|im_start|>assistant\n" + "<think>\n\n</think>\n" + content.lstrip("\n") + "<|im_end|>\n" }}
- {%- endif -%}
- {% endgeneration %}
- {%- elif message.role == "tool" -%}
- {{ "<|im_start|>" + "user\n" + content + "<|im_end|>\n" }}
- {%- endif -%}
- {%- endfor -%}
- {# ───── generation prompt ───── #}
- {%- if add_generation_prompt -%}
- {%- if reasoning_mode == "/think" -%}
- {{ "<|im_start|>assistant\n" }}
- {%- else -%}
- {{ "<|im_start|>assistant\n" + "<think>\n\n</think>\n" }}
- {%- endif -%}
- {%- endif -%}
+ {% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system
+ You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
+ ' }}{% endif %}{{'<|im_start|>' + message['role'] + '
+ ' + message['content'] + '<|im_end|>' + '
+ '}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
+ ' }}{% endif %}
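The new template drops the reasoning-mode and tool-calling logic in favor of plain ChatML. A quick way to see exactly what it renders — a sketch, assuming the tokenizer in this revision ships the template above:

```python
# Sketch: render the simplified template locally to inspect the prompt string.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mazesmazes/tiny-audio")
prompt = tok.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
# Expected shape: a default <|im_start|>system ... <|im_end|> block is injected
# (no system message was supplied), then <|im_start|>user Hello!<|im_end|>,
# then the trailing <|im_start|>assistant generation prompt.
```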
config.json CHANGED
@@ -1,195 +1,38 @@
  {
  "architectures": [
- "ASRModel"
+ "LlamaForCausalLM"
  ],
- "attn_implementation": "flash_attention_2",
- "audio_config": {
- "_name_or_path": "openai/whisper-large-v3-turbo",
- "activation_dropout": 0.0,
- "activation_function": "gelu",
- "apply_spec_augment": false,
- "architectures": [
- "WhisperForConditionalGeneration"
- ],
- "attention_dropout": 0.0,
- "bos_token_id": 50257,
- "classifier_proj_size": 256,
- "d_model": 1280,
- "decoder_attention_heads": 20,
- "decoder_ffn_dim": 5120,
- "decoder_layerdrop": 0.0,
- "decoder_layers": 4,
- "decoder_start_token_id": 50258,
- "dropout": 0.0,
- "dtype": "float16",
- "encoder_attention_heads": 20,
- "encoder_ffn_dim": 5120,
- "encoder_layerdrop": 0.0,
- "encoder_layers": 32,
- "eos_token_id": 50257,
- "init_std": 0.02,
- "mask_feature_length": 10,
- "mask_feature_min_masks": 0,
- "mask_feature_prob": 0.0,
- "mask_time_length": 10,
- "mask_time_min_masks": 2,
- "mask_time_prob": 0.05,
- "max_source_positions": 1500,
- "max_target_positions": 448,
- "median_filter_width": 7,
- "model_type": "whisper",
- "num_hidden_layers": 32,
- "num_mel_bins": 128,
- "pad_token_id": 50257,
- "scale_embedding": false,
- "use_cache": true,
- "use_weighted_layer_sum": false,
- "vocab_size": 51866
- },
- "audio_downsample_rate": 5,
- "audio_model_id": "openai/whisper-large-v3-turbo",
- "audio_sample_rate": 16000,
- "auto_map": {
- "AutoConfig": "config.ASRConfig",
- "AutoModel": "voice.model.ASRModelWithVoice",
- "AutoModelForCausalLM": "voice.model.ASRModelWithVoice"
- },
- "custom_pipelines": {
- "automatic-speech-recognition": {
- "impl": "asr_pipeline.ASRPipeline",
- "pt": [
- "AutoModelForSpeechSeq2Seq"
- ],
- "tf": [],
- "type": "audio"
- }
- },
+ "attention_bias": false,
+ "attention_dropout": 0.0,
+ "bos_token_id": 1,
  "dtype": "bfloat16",
- "encoder_dim": 1280,
- "inference_diversity_penalty": 0.0,
- "inference_warmup_tokens": 10,
- "llm_dim": 2048,
- "max_new_tokens": 128,
- "min_new_tokens": 1,
- "model_dtype": "bfloat16",
- "model_type": "asr_model",
- "pipeline_tag": "automatic-speech-recognition",
- "projector_dropout": 0.0,
- "projector_hidden_dim": 5120,
- "projector_init_std": 0.02,
- "projector_pool_stride": 2,
- "repetition_penalty": 1.05,
- "system_prompt": "/no_think /system_override",
- "text_config": {
- "_name_or_path": "HuggingFaceTB/SmolLM3-3B",
- "architectures": [
- "SmolLM3ForCausalLM"
- ],
- "attention_bias": false,
- "attention_dropout": 0.0,
- "bos_token_id": null,
- "dtype": "bfloat16",
- "eos_token_id": 128012,
- "hidden_act": "silu",
- "hidden_size": 2048,
- "initializer_range": 0.02,
- "intermediate_size": 11008,
- "layer_types": [
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention",
- "full_attention"
- ],
- "max_position_embeddings": 65536,
- "max_window_layers": 28,
- "mlp_bias": false,
- "model_type": "smollm3",
- "no_rope_layer_interval": 4,
- "no_rope_layers": [
- 1,
- 1,
- 1,
- 0,
- 1,
- 1,
- 1,
- 0,
- 1,
- 1,
- 1,
- 0,
- 1,
- 1,
- 1,
- 0,
- 1,
- 1,
- 1,
- 0,
- 1,
- 1,
- 1,
- 0,
- 1,
- 1,
- 1,
- 0,
- 1,
- 1,
- 1,
- 0,
- 1,
- 1,
- 1,
- 0
- ],
- "num_attention_heads": 16,
- "num_hidden_layers": 36,
- "num_key_value_heads": 4,
- "pretraining_tp": 2,
- "rms_norm_eps": 1e-06,
- "rope_scaling": null,
- "rope_theta": 5000000.0,
- "sliding_window": null,
- "use_cache": false,
- "use_sliding_window": false,
- "vocab_size": 128257
+ "eos_token_id": 2,
+ "head_dim": 64,
+ "hidden_act": "silu",
+ "hidden_size": 576,
+ "initializer_range": 0.041666666666666664,
+ "intermediate_size": 1536,
+ "is_llama_config": true,
+ "max_position_embeddings": 8192,
+ "mlp_bias": false,
+ "model_type": "llama",
+ "num_attention_heads": 9,
+ "num_hidden_layers": 30,
+ "num_key_value_heads": 3,
+ "pad_token_id": 2,
+ "pretraining_tp": 1,
+ "rms_norm_eps": 1e-05,
+ "rope_interleaved": false,
+ "rope_scaling": null,
+ "rope_theta": 100000,
+ "tie_word_embeddings": true,
+ "transformers.js_config": {
+ "kv_cache_dtype": {
+ "fp16": "float16",
+ "q4f16": "float16"
+ }
  },
- "text_model_id": "HuggingFaceTB/SmolLM3-3B",
- "transformers_version": "4.57.1",
- "use_cache": false,
- "user_prompt": "Transcribe: <audio>",
- "vocab_size": 128257
+ "transformers_version": "4.57.3",
+ "use_cache": true,
+ "vocab_size": 114690
  }
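The custom ASRModel wrapper (Whisper encoder plus SmolLM3 decoder with auto_map/custom_pipelines hooks) is replaced by a small, self-contained Llama-architecture config. A sketch for sanity-checking it after download, assuming the hub revision matches this commit; the field names and values come from the diff above:

```python
# Sketch: load and cross-check the new Llama-style config.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mazesmazes/tiny-audio")
print(cfg.model_type, cfg.hidden_size, cfg.num_hidden_layers)  # llama 576 30
# 9 query heads x head_dim 64 = 576 = hidden_size; 3 KV heads means
# grouped-query attention with 3 query heads per KV head.
print(cfg.num_attention_heads * cfg.head_dim == cfg.hidden_size)  # True
```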
generation_config.json CHANGED
@@ -1,8 +1,10 @@
  {
- "do_sample": true,
- "eos_token_id": 128012,
- "pad_token_id": 128004,
- "temperature": 0.6,
- "top_p": 0.95,
- "transformers_version": "4.57.1"
+ "_from_model_config": true,
+ "bos_token_id": 1,
+ "eos_token_id": [
+ 2,
+ 2
+ ],
+ "pad_token_id": 2,
+ "transformers_version": "4.57.3"
  }
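Worth noting: the old file enabled sampling (temperature 0.6, top_p 0.95), while the new one sets only token ids, so `generate()` falls back to greedy decoding unless sampling flags are passed at call time. A sketch to confirm, assuming the hub revision matches this commit:

```python
# Sketch: load the shipped generation defaults and check that sampling is off.
from transformers import GenerationConfig

gen = GenerationConfig.from_pretrained("mazesmazes/tiny-audio")
print(gen.do_sample)                       # False -> greedy decoding by default
print(gen.eos_token_id, gen.pad_token_id)  # [2, 2] 2
```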
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:55f01490cfd7f4ee39d5ccac2e5bace6bf75fd95e68f431604cee1dbcacf1c76
- size 73410040
+ oid sha256:2b0574f04cc797cea4c86ca207e912a87e32ab8313566403e027ea760aef5009
+ size 344560440
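The checkpoint grows from about 73 MB to about 345 MB; if the weights are bf16 as config.json indicates, 344,560,440 bytes works out to roughly 172M parameters — a rough reading, not something the commit states. A sketch for inspecting the new file without loading any weights (the local path assumes a checkout with LFS pulled):

```python
# Sketch: read tensor names and shapes from the safetensors header only.
from safetensors import safe_open

with safe_open("model.safetensors", framework="pt") as f:
    for name in list(f.keys())[:5]:
        print(name, f.get_slice(name).get_shape())
```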
special_tokens_map.json CHANGED
@@ -1,13 +1,15 @@
  {
  "additional_special_tokens": [
- {
- "content": "<audio>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- }
+ "<|im_start|>",
+ "<|im_end|>"
  ],
+ "bos_token": {
+ "content": "<|im_start|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
  "eos_token": {
  "content": "<|im_end|>",
  "lstrip": false,
@@ -15,5 +17,18 @@
  "rstrip": false,
  "single_word": false
  },
- "pad_token": "<|finetune_right_pad_id|>"
+ "pad_token": {
+ "content": "<|im_end|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
  }
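The `<audio>` special token disappears and pad is remapped from `<|finetune_right_pad_id|>` to `<|im_end|>`. A sketch to confirm the new mapping from the tokenizer side (expected values per the diff above; assumes the hub revision matches this commit):

```python
# Sketch: check the remapped special tokens.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mazesmazes/tiny-audio")
print(tok.bos_token, tok.eos_token, tok.pad_token, tok.unk_token)
# expected: <|im_start|> <|im_end|> <|im_end|> <|endoftext|>
```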
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d4aeaf198f783cbf58d8cd59812baac429ffe49147bf9648f6618de20b8d4a4c
- size 17209003
+ oid sha256:d9a0a439f19c272474f9c9213ea2665d1f1cf90eb7f2f6a71b40a919554f078c
+ size 15781850
tokenizer_config.json CHANGED
Binary files a/tokenizer_config.json and b/tokenizer_config.json differ
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:934711503059ebb39d1310c63b8839ab2e069e5a92ea1237b9b491a943a97462
- size 5969
+ oid sha256:1e1ab63007bf529af1bfd6bc8ff73f65550ead28b1db743f117993234555a69a
+ size 6289
vocab.json ADDED
The diff for this file is too large to render. See raw diff