<center> <div style="text-align: center;"> <img src="https://raw.githubusercontent.com/ZHZisZZ/dllm/main/assets/logo.gif" width="400" />
</div> </center>

# ModernBERT-large-chat-v0.1

ModernBERT-large-chat-v0.1 is a diffusion-based generative variant of [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large), finetuned using the [dLLM](https://github.com/ZHZisZZ/dllm) framework.

## Model Overview

ModernBERT-large-chat-v0.1 has the following features:

- **Method**: [Masked Diffusion Language Modeling (MDLM)](https://arxiv.org/abs/2406.07524)
- **Framework**: [dLLM](https://github.com/ZHZisZZ/dllm)
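
MDLM generates text by starting from a fully masked target sequence and revealing a few positions per denoising step until nothing is masked. The following is a minimal toy sketch of that unmasking schedule only; the `toy_unmask` helper and its random stand-in for per-position model confidence are illustrative assumptions, not the actual dLLM sampler.

```python
import random

MASK = "<mask>"

def toy_unmask(target, steps=4, seed=0):
    """Reveal a masked sequence a few positions per step, mimicking the
    confidence-based unmasking schedule of MDLM-style samplers.
    `target` stands in for the model's argmax predictions."""
    rng = random.Random(seed)
    seq = [MASK] * len(target)
    masked = list(range(len(target)))
    per_step = max(1, len(target) // steps)
    while masked:
        # A real sampler ranks positions by model confidence; here we
        # use a random stand-in score to pick which positions to reveal.
        masked.sort(key=lambda i: rng.random())
        for i in masked[:per_step]:
            seq[i] = target[i]
        masked = masked[per_step:]
    return seq

tokens = ["Lily", "runs", "48", "+", "24", "=", "72", "km"]
print(toy_unmask(tokens))  # every position is revealed once the loop ends
```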

Load the model and tokenizer with `transformers`:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

device = 'cuda'
model = AutoModelForMaskedLM.from_pretrained('dllm-collection/ModernBERT-large-chat-v0.1', dtype=torch.bfloat16).to(device).eval()
tokenizer = AutoTokenizer.from_pretrained('dllm-collection/ModernBERT-large-chat-v0.1')

prompt = "Lily can run 12 kilometers per hour for 4 hours. After that, she runs 6 kilometers per hour. How many kilometers can she run in 8 hours?"
m = [{"role": "user", "content": prompt}]  # chat-format message list
```
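
For reference, the expected answer to the sample prompt works out with plain arithmetic, independent of the model:

```python
first_leg = 12 * 4                # 12 km/h for 4 hours -> 48 km
remaining_hours = 8 - 4           # 4 hours left of the 8-hour run
second_leg = 6 * remaining_hours  # 6 km/h for the remaining 4 hours -> 24 km
total = first_leg + second_leg
print(total)  # 72
```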

Follow the GitHub repo's demo script `examples/bert/chat.py` to chat with the model interactively:

```shell
python -u examples/bert/chat.py \
    --model_name_or_path dllm-collection/ModernBERT-large-chat-v0.1 \
    --chat True
```

## Evaluation

| Model                      | LAMBADA | GSM8K | CEval | BBH  | MATH | MMLU | Winogrande | HellaSwag | CMMLU |
|:---------------------------|:-------:|:-----:|:-----:|:----:|:----:|:----:|:----------:|:---------:|:-----:|
| ModernBERT-base-chat-v0.1  | 49.3    | 5.9   | 25.0  | 17.9 | 3.1  | 26.1 | 49.7       | 41.0      | 24.3  |
| ModernBERT-large-chat-v0.1 | 46.3    | 17.1  | 24.6  | 25.1 | 3.8  | 33.5 | 53.1       | 45.0      | 27.5  |

<!-- <p align="left" style="color: #808080; font-size: 0.9em;">
Table 1. Evaluation results of
ModernBERT-base-chat-v0.1 and
ModernBERT-large-chat-v0.1.
All results are evaluated using
<a href="https://github.com/ZHZisZZ/dllm/tree/main" style="color: #808080; text-decoration: underline;">
dLLM
</a>.
</p> -->

To automatically evaluate ModernBERT-large-chat-v0.1 on all benchmarks, run:

```shell
bash examples/bert/eval.sh \
    --model_name_or_path "dllm-collection/ModernBERT-large-chat-v0.1"
```

## Citation

If you use ModernBERT-large-chat-v0.1 or dLLM, please cite:

```bibtex
@misc{dllm,