Update README.md (#2)
Update README.md (ab7c317d3968aaa2eb76fadc4a68ff64235510ef)
README.md CHANGED
@@ -13,6 +13,27 @@ DCLM-1B is a 1.4 billion parameter language model trained on the DCLM-Baseline d

The instruction tuned version of this model is available here: https://huggingface.co/TRI-ML/DCLM-1B-IT

## Quickstart

First, install open_lm:

```
pip install git+https://github.com/mlfoundations/open_lm.git
```

Then you can load the model using HF's Auto classes as follows:

```python
from open_lm.hf import *  # registers open_lm's architecture with the transformers Auto classes
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TRI-ML/DCLM-1B")
model = AutoModelForCausalLM.from_pretrained("TRI-ML/DCLM-1B")

inputs = tokenizer(["Machine learning is"], return_tensors="pt")
# sample a 50-token continuation with nucleus sampling and a mild repetition penalty
gen_kwargs = {"max_new_tokens": 50, "top_p": 0.8, "temperature": 0.8, "do_sample": True, "repetition_penalty": 1.1}
output = model.generate(inputs['input_ids'], **gen_kwargs)
output = tokenizer.decode(output[0].tolist(), skip_special_tokens=True)
print(output)
```
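
As a side note (our sketch, not part of the upstream model card): passing the full tokenizer output to `generate` also forwards the attention mask along with the input ids, and seeding PyTorch makes the sampled continuation reproducible across runs.

```python
import torch
from open_lm.hf import *  # registers open_lm's architecture with the transformers Auto classes
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TRI-ML/DCLM-1B")
model = AutoModelForCausalLM.from_pretrained("TRI-ML/DCLM-1B")

torch.manual_seed(0)  # fix the sampling RNG so reruns produce the same text
inputs = tokenizer(["Machine learning is"], return_tensors="pt")
# **inputs passes attention_mask together with input_ids
output = model.generate(**inputs, max_new_tokens=50, top_p=0.8,
                        temperature=0.8, do_sample=True, repetition_penalty=1.1)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```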

## Evaluation

We evaluate DCLM-1B using the [llm-foundry](https://github.com/mosaicml/llm-foundry) eval suite, and compare to recently released small models on key benchmarks.
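
For reference, llm-foundry provides a generic entry point for evaluating Hugging Face checkpoints. A minimal sketch of how such a run might look; the exact config and task list behind the reported numbers are not given here, and `eval/yamls/hf_eval.yaml` / `model_name_or_path` are assumptions based on llm-foundry's documentation:

```
# from a checkout of llm-foundry, with open_lm installed so the HF classes resolve
cd llm-foundry/scripts
composer eval/eval.py eval/yamls/hf_eval.yaml \
    model_name_or_path=TRI-ML/DCLM-1B
```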