ajsbsd/distilgpt2-werther-finetuned

Text Generation

german-literature

creative-writing


ajsbsd committed on Jun 11, 2025

Commit 3b64351 · verified · 1 Parent(s): 2d7af0b

Update README.md

Files changed (1)

README.md +7 -0


@@ -56,17 +56,24 @@ print(generated_text[0]['generated_text'])
 ## Training Details
 Hardware: Local NVIDIA GeForce RTX 4070 Laptop GPU (or similar)
 Fine-tuning script: finetune_werther.py (available in the repository if shared)
 Training epochs: 5 (possibly more, if training was continued afterwards)
 Dataset size: approximately 57,987 tokens (from one novel)
 Block size: 512 tokens
 Training framework: Hugging Face transformers library with PyTorch backend.
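
To make the dataset size and block size above concrete, here is a minimal sketch of how roughly 57,987 tokens split into 512-token training blocks. It assumes the fine-tuning script concatenates the tokenized text and drops the incomplete final block, which is a common pattern in causal language model fine-tuning but is not confirmed for finetune_werther.py. The helper name `group_into_blocks` and the dummy token ids are hypothetical.

```python
def group_into_blocks(token_ids, block_size=512):
    """Split a flat list of token ids into full blocks, dropping the remainder.

    Assumption: the real script may handle the remainder differently
    (for example by padding); this sketch simply discards it.
    """
    usable = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]


# Stand-in for the tokenized novel: 57,987 dummy token ids.
token_ids = list(range(57_987))
blocks = group_into_blocks(token_ids, block_size=512)

num_blocks = len(blocks)
dropped = len(token_ids) - num_blocks * 512
print(num_blocks, dropped)  # 113 131
```

Under these assumptions one epoch covers only 113 training blocks, so 5 epochs amount to 565 block passes in total. That small number is consistent with the limitations described in the next section.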
 ## Limitations and Bias
 Repetitive Output: Due to the relatively small size of the fine-tuning dataset (a single novel) and the base model's architecture, the generated text can often become repetitive or loop on certain phrases. This is a common characteristic of smaller models fine-tuned on limited domain-specific data.
 Lack of Long-Range Coherence: The model may struggle to maintain a coherent narrative or theme over longer generated passages.
 Bias: The model will reflect any biases present in the original "The Sorrows of Young Werther" text. It will also reflect the melancholic, romantic, and somewhat obsessive tone of the original work.
 Limited Knowledge: The model knows only what was present in its pre-training data and what it learned from Werther. Its general world knowledge is limited to whatever the base model retained from pre-training.
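
The repetition described above can be measured roughly on any generated sample. The following is a minimal sketch, not part of the original repository: the function `repeated_ngram_fraction` and the two sample strings are hypothetical, and the metric (share of word trigrams that repeat an earlier trigram) is just one simple choice.

```python
from collections import Counter


def repeated_ngram_fraction(text, n=3):
    """Return the fraction of word n-grams that repeat an earlier n-gram."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)


# Hypothetical samples: one looping, one varied.
looping = "my heart is full my heart is full my heart is full"
varied = "the valley lay quiet beneath a pale and distant moon"

looping_score = repeated_ngram_fraction(looping)
varied_score = repeated_ngram_fraction(varied)
print(looping_score, varied_score)  # 0.6 0.0
```

A high score on generated text suggests looping. In that case, generation settings in the transformers library such as `no_repeat_ngram_size` or `repetition_penalty` may help, although suitable values for this model are not documented here and would need experimentation.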
 ## Future Work
