ZHZisZZ committed
Commit a7e9732 · verified · 1 Parent(s): 7aff11a

Update README.md

Files changed (1)
  1. README.md +13 -13
README.md CHANGED
@@ -5,13 +5,13 @@ license: apache-2.0
 <center> <div style="text-align: center;"> <img src="https://raw.githubusercontent.com/ZHZisZZ/dllm/main/assets/logo.gif" width="400" />
 </div> </center>
 
-# ModernBERT-large-chat-v0
+# ModernBERT-large-chat-v0.1
 
-ModernBERT-large-chat-v0 is a diffusion-based generative variant of [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large), finetuned using the [dLLM](https://github.com/ZHZisZZ/dllm) framework.
+ModernBERT-large-chat-v0.1 is a diffusion-based generative variant of [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large), finetuned using the [dLLM](https://github.com/ZHZisZZ/dllm) framework.
 
 ## Model Overview
 
-ModernBERT-large-chat-v0 has the following features:
+ModernBERT-large-chat-v0.1 has the following features:
 
 - **Method**: [Masked Diffusion Language Modeling (MDLM)](https://arxiv.org/abs/2406.07524)
 - **Framework**: [dLLM](https://github.com/ZHZisZZ/dllm)
@@ -110,8 +110,8 @@ def generate(model, prompt, steps=128, gen_length=128, block_length=64, temperat
 
 
 device = 'cuda'
-model = AutoModelForMaskedLM.from_pretrained('dllm-collection/ModernBERT-large-chat-v0', dtype=torch.bfloat16).to(device).eval()
-tokenizer = AutoTokenizer.from_pretrained('dllm-collection/ModernBERT-large-chat-v0')
+model = AutoModelForMaskedLM.from_pretrained('dllm-collection/ModernBERT-large-chat-v0.1', dtype=torch.bfloat16).to(device).eval()
+tokenizer = AutoTokenizer.from_pretrained('dllm-collection/ModernBERT-large-chat-v0.1')
 
 prompt = "Lily can run 12 kilometers per hour for 4 hours. After that, she runs 6 kilometers per hour. How many kilometers can she run in 8 hours?"
 m = [
@@ -144,20 +144,20 @@ Follow the Github repo's demo script [examples/bert/chat.py](https://github.com/
 
 ```shell
 python -u examples/bert/chat.py \
-    --model_name_or_path dllm-collection/ModernBERT-large-chat-v0 \
+    --model_name_or_path dllm-collection/ModernBERT-large-chat-v0.1 \
     --chat True
 ```
 
 ## Evaluation
 |                     | LAMBADA | GSM8K | CEval | BBH | MATH | MMLU | Winogrande | HellaSwag | CMMLU |
 |:------------------------------------|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|:----:|
-| ModernBERT-base-chat-v0 | 49.3 | 5.9 | 25.0 | 17.9 | 3.1 | 26.1 | 49.7 | 41.0 | 24.3 |
-| ModernBERT-large-chat-v0 | 46.3 | 17.1 | 24.6 | 25.1 | 3.8 | 33.5 | 53.1 | 45.0 | 27.5 |
+| ModernBERT-base-chat-v0.1 | 49.3 | 5.9 | 25.0 | 17.9 | 3.1 | 26.1 | 49.7 | 41.0 | 24.3 |
+| ModernBERT-large-chat-v0.1 | 46.3 | 17.1 | 24.6 | 25.1 | 3.8 | 33.5 | 53.1 | 45.0 | 27.5 |
 
 <!-- <p align="left" style="color: #808080; font-size: 0.9em;">
 Table 1. Evaluation results of
-ModernBERT-base-chat-v0 and
-ModernBERT-large-chat-v0.
+ModernBERT-base-chat-v0.1 and
+ModernBERT-large-chat-v0.1.
 All results are evaluated using
 <a href="https://github.com/ZHZisZZ/dllm/tree/main" style="color: #808080; text-decoration: underline;">
 dLLM
@@ -167,16 +167,16 @@ python -u examples/bert/chat.py \
 </a>.
 </p> -->
 
-To automatically evaluate ModernBERT-large-chat-v0 on all benchmarks, run:
+To automatically evaluate ModernBERT-large-chat-v0.1 on all benchmarks, run:
 ```shell
 bash examples/bert/eval.sh \
-    --model_name_or_path "dllm-collection/ModernBERT-large-chat-v0"
+    --model_name_or_path "dllm-collection/ModernBERT-large-chat-v0.1"
 ```
 
 
 ## Citation
 
-If you use ModernBERT-large-chat-v0 or dLLM, please cite:
+If you use ModernBERT-large-chat-v0.1 or dLLM, please cite:
 
 ```bibtex
 @misc{dllm,
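
The second hunk header above references the README's `generate(model, prompt, steps=128, gen_length=128, block_length=64, temperature=...)` helper, which is what the renamed checkpoint is loaded for. As a rough illustration of that decoding style, below is a minimal sketch of confidence-based masked-diffusion generation using the v0.1 model id. It is not the dLLM repository's implementation: the helper name `sketch_generate` is hypothetical, the unmasking schedule is an assumption, block-wise decoding (`block_length`) and temperature handling are omitted, and the availability of a chat template is inferred from the README's `m = [...]` messages snippet.

```python
# Minimal sketch of masked-diffusion ("unmask the most confident positions") decoding.
# NOT the dLLM repo's generate(); the schedule below is an illustrative assumption.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "dllm-collection/ModernBERT-large-chat-v0.1"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, dtype=torch.bfloat16).to(device).eval()


@torch.no_grad()
def sketch_generate(prompt_ids, steps=128, gen_length=128):
    """Start from an all-[MASK] completion and reveal a few of the most
    confident predictions per step (block_length / temperature omitted)."""
    mask_id = tokenizer.mask_token_id
    completion = torch.full((1, gen_length), mask_id, dtype=torch.long, device=device)
    x = torch.cat([prompt_ids, completion], dim=1)
    per_step = max(1, gen_length // steps)  # how many tokens to unmask per step
    for _ in range(steps):
        masked = x == mask_id
        if not masked.any():
            break
        logits = model(input_ids=x).logits
        conf, pred = torch.softmax(logits.float(), dim=-1).max(dim=-1)
        conf = conf.masked_fill(~masked, -1.0)      # only rank still-masked positions
        k = min(per_step, int(masked.sum()))
        idx = conf.topk(k, dim=-1).indices          # most confident masked positions
        x[0, idx[0]] = pred[0, idx[0]]              # commit their argmax predictions
    return x[:, prompt_ids.shape[1]:]


# Usage (assumes the chat checkpoint ships a chat template, as the README snippet implies)
messages = [{"role": "user", "content": "What is 2 + 2?"}]
text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
prompt_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
out = sketch_generate(prompt_ids, steps=64, gen_length=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

For actual use, the README's own snippet and the `examples/bert/chat.py` demo in the dLLM repository remain the reference.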