ajsbsd committed on
Commit 3b64351 · verified · 1 Parent(s): 2d7af0b

Update README.md

Files changed (1)
  1. README.md +7 -0
README.md CHANGED
@@ -56,17 +56,24 @@ print(generated_text[0]['generated_text'])
  ## Training Details
 
  Hardware: Local NVIDIA GeForce RTX 4070 Laptop GPU (or similar)
+
  Fine-tuning script: finetune_werther.py (available in the repository if shared)
+
  Training epochs: 5 (or more, if you further trained it)
+
  Dataset Size (approximately): ~57,987 tokens (from one novel)
+
  Block size: 512 tokens
+
  Training framework: Hugging Face transformers library with PyTorch backend.
 
  ## Limitations and Bias
 
  Repetitive Output: Due to the relatively small size of the fine-tuning dataset (a single novel) and the base model's architecture, the generated text can often become repetitive or loop on certain phrases. This is a common characteristic of smaller models fine-tuned on limited domain-specific data.
  Lack of Long-Range Coherence: The model may struggle to maintain a coherent narrative or theme over longer generated passages.
+
  Bias: The model will reflect any biases present in the original "The Sorrows of Young Werther" text. It will also reflect the melancholic, romantic, and somewhat obsessive tone of the original work.
+
  Limited Knowledge: It only knows what was present in its pre-training data and what it learned from Werther. It does not have general world knowledge.
 
  ## Future Work
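
The Training Details in this README give a dataset of roughly 57,987 tokens and a block size of 512 tokens. To show how few training examples that implies, here is a minimal sketch of fixed-size block grouping as it is commonly done for causal language model fine-tuning. The contents of finetune_werther.py are not shown in this commit, so the helper `group_into_blocks`, the choice to drop the trailing remainder, and the dummy token ids are all assumptions for illustration, not the author's actual code.

```python
def group_into_blocks(token_ids, block_size=512):
    """Split a flat token list into full blocks, dropping the trailing remainder.

    Hypothetical helper: the real fine-tuning script may pad or overlap instead.
    """
    usable = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]


# Stand-in for the tokenized novel: 57,987 dummy token ids (not real data).
token_ids = list(range(57_987))

blocks = group_into_blocks(token_ids, block_size=512)
num_epochs = 5  # value taken from the README's "Training epochs" line

print(len(blocks))               # full 512-token blocks per epoch
print(len(blocks) * num_epochs)  # blocks seen over all epochs
```

Under these assumptions the novel yields 113 full blocks (131 trailing tokens are dropped), so 5 epochs amount to 565 block presentations. A training set this small fits the repetitive output described under Limitations and Bias.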