## Training Details

- Hardware: local NVIDIA GeForce RTX 4070 Laptop GPU (or similar)
- Fine-tuning script: `finetune_werther.py` (available in the repository if shared)
- Training epochs: 5 (or more, if further trained)
- Dataset size: ~57,987 tokens (from one novel)
- Block size: 512 tokens
- Training framework: Hugging Face `transformers` library with a PyTorch backend
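For causal-LM fine-tuning, the tokenized corpus is typically concatenated and split into fixed-length blocks. A minimal sketch of that chunking step, using the token count and block size from the list above (the tokenizer itself is omitted, and `chunk_into_blocks` is an illustrative helper, not part of any released script):

```python
def chunk_into_blocks(token_ids, block_size=512):
    """Split a flat sequence of token ids into fixed-length training blocks.

    The trailing partial block is dropped, mirroring the usual
    group-texts step in Hugging Face causal-LM fine-tuning examples.
    """
    total = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, total, block_size)]

# With ~57,987 tokens and a block size of 512, the corpus yields
# 113 full training blocks (57,987 // 512 == 113).
blocks = chunk_into_blocks(list(range(57_987)), block_size=512)
print(len(blocks))     # → 113
print(len(blocks[0]))  # → 512
```

This is why such a small dataset produces only on the order of a hundred training examples per epoch, which in turn contributes to the limitations described below.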
## Limitations and Bias

- Repetitive output: due to the relatively small fine-tuning dataset (a single novel) and the base model's architecture, the generated text can often become repetitive or loop on certain phrases. This is a common characteristic of smaller models fine-tuned on limited domain-specific data.
- Lack of long-range coherence: the model may struggle to maintain a coherent narrative or theme over longer generated passages.
- Bias: the model will reflect any biases present in the original "The Sorrows of Young Werther" text, as well as the melancholic, romantic, and somewhat obsessive tone of the original work.
- Limited knowledge: the model only knows what was present in its pre-training data and what it learned from Werther. It does not have general world knowledge.
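At generation time, repetition can usually be reduced with the `no_repeat_ngram_size` and `repetition_penalty` arguments of `generate` in the `transformers` library. As a rough, library-free illustration of the looping this guards against, the helper below (hypothetical, not part of this repository) counts word-level n-grams that recur in a piece of generated text:

```python
from collections import Counter

def repeated_ngrams(text, n=3):
    """Return word-level n-grams that occur more than once in `text`."""
    words = text.split()
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return {g: c for g, c in grams.items() if c > 1}

# A looping output like this is typical of the repetition described above.
looping = "my heart my heart my heart aches"
print(repeated_ngrams(looping, n=2))
# → {('my', 'heart'): 3, ('heart', 'my'): 2}
```

Blocking repeated n-grams during decoding (as `no_repeat_ngram_size` does) prevents exactly this kind of output at the cost of occasionally forbidding legitimate repetition.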
## Future Work