---
license: mit
datasets:
- bitext/Bitext-customer-support-llm-chatbot-training-dataset
- MohammadOthman/mo-customer-support-tweets-945k
- taskydata/baize_chatbot
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-0528
- unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit
new_version: Aeshp/deepseekR1_tunedchat
pipeline_tag: text-generation
library_name: transformers
tags:
- bitsandbytes
- deepseek
- unsloth
- tensorboard
- text-generation-inference
- llama
- 5B
---

# Aeshp/deepseekR1_tunedchat

This model is a fine-tuned version of [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B), loaded via Unsloth in 4-bit as [unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit). It was trained on the following customer-service and general chat datasets:

- [taskydata/baize_chatbot](https://huggingface.co/datasets/taskydata/baize_chatbot)
- [MohammadOthman/mo-customer-support-tweets-945k](https://huggingface.co/datasets/MohammadOthman/mo-customer-support-tweets-945k)
- [bitext/Bitext-customer-support-llm-chatbot-training-dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset)

Training was performed in three stages; the final adapter weights were merged into the base model and pushed to this repository, so the result is a single lightweight model.

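Since the weights are merged, the model can be used directly with `transformers`. Below is a minimal inference sketch, not the repository's official script; it assumes the `transformers` and `accelerate` packages and enough GPU or CPU memory for an 8B model, and the repo id comes from this card's `new_version` field.

```python
# Minimal inference sketch for the merged model (illustrative, not the
# repository's official code). Loading happens inside the function so the
# heavyweight download only runs when a reply is actually requested.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Aeshp/deepseekR1_tunedchat"

def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a single chat reply for a user prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )

# Example call:
# generate_reply("My order hasn't arrived yet. What can I do?")
```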
## 📝 License

This model is released under the MIT license, allowing free use, modification, and further fine-tuning.

## 💡 How to Fine-Tune Further

All code and instructions for further fine-tuning, inference, and pushing to the Hugging Face Hub are available in the open-source GitHub repository:
**[https://github.com/Aeshp/deepseekR1finetune](https://github.com/Aeshp/deepseekR1finetune)**

- You can fine-tune this model on your own domain-specific data.
- Adjust hyperparameters and dataset size as needed.
- Example scripts and notebooks are provided for both base-model and checkpoint-based fine-tuning.

## ⚠️ Notes

- The model may sometimes hallucinate, as is common with LLMs.
- For best results, use a large, high-quality dataset for further fine-tuning to avoid overfitting.

## 📚 References

### Hugging Face Models
- [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)
- [deepseek-ai/DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1)
- [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
- [unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-unsloth-bnb-4bit)

### Datasets
- [taskydata/baize_chatbot](https://huggingface.co/datasets/taskydata/baize_chatbot)
- [MohammadOthman/mo-customer-support-tweets-945k](https://huggingface.co/datasets/MohammadOthman/mo-customer-support-tweets-945k)
- [bitext/Bitext-customer-support-llm-chatbot-training-dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset)

### GitHub Repositories
- [Aeshp/deepseekR1finetune](https://github.com/Aeshp/deepseekR1finetune)
- [meta-llama/llama](https://github.com/meta-llama/llama)
- [deepseek-ai/DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1)
- [Unsloth Documentation](https://docs.unsloth.ai/)

### Papers
- [DeepSeek R1 Paper](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf)

---

For all usage instructions, fine-tuning guides, and code, please see the [GitHub repository](https://github.com/Aeshp/deepseekR1finetune).