GPT-2 Emotion-Conditioned Text Generation (Prefix Method)
Fine-tuned GPT-2 model for emotion-controlled dialogue generation using natural language prefix conditioning.
Model Details
- Developed by: VanshajR
- Base Model: `gpt2` (124M parameters)
- Method: Prefix Conditioning (`"Respond with [EMOTION] emotion: [context]"`)
- Task: Emotion-controlled response generation
- Dataset: DailyDialog (76K train, 6.7K test)
- Language: English
- License: MIT
Performance
Emotion Accuracy
Evaluated on 6,740 test samples using a RoBERTa emotion classifier (a sketch of the evaluation loop follows the table):
| Model | Emotion Accuracy | Improvement vs Baseline |
|---|---|---|
| Prefix-Small (This Model) | 38.2% | +9.8pp |
| Token-Small | 30.8% | +2.5pp |
| Prefix-Medium (355M) | 35.6% | +7.3pp |
| Baseline (no conditioning) | 28.3% | - |
| Random Baseline | 14.3% | - |
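A minimal sketch of the evaluation loop behind these numbers, assuming the test split is available as a list of `(context, target_emotion)` pairs; the exact preprocessing and the classifier's label names are assumptions, not the original evaluation script:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("VanshajR/gpt2-emotion-prefix")
model = AutoModelForCausalLM.from_pretrained("VanshajR/gpt2-emotion-prefix")
classifier = pipeline("text-classification", model="VanshajR/roberta-emotion-7class")

def emotion_accuracy(test_pairs):
    """test_pairs: list of (context, target_emotion) tuples (assumed format)."""
    hits = 0
    for context, target in test_pairs:
        prefix = f"Respond with {target} emotion: {context}"
        inputs = tokenizer(prefix, return_tensors="pt")
        outputs = model.generate(
            **inputs,
            max_new_tokens=60,
            do_sample=True,
            temperature=0.8,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )
        # Score only the generated continuation, not the prompt
        response = tokenizer.decode(
            outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        predicted = classifier(response)[0]["label"].lower()
        hits += int(predicted == target.lower())
    return hits / len(test_pairs)
```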
Key Achievements:
- 2.67x better than random chance
- 34.6% relative improvement over the baseline
- More efficient than larger models (Small outperforms Medium with limited data)
- Competitive with the literature (35-48% range for similar tasks)
Generation Quality
- Perplexity: 24.5 (fluency is maintained)
- Emotion Success Rate: 38.2% of generated responses match the target emotion
- Method: Natural language prefixes (no architecture changes)
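Perplexity here is the exponential of the mean token-level cross-entropy; a minimal sketch of how such a number can be computed for a single prefixed example (not necessarily the exact evaluation script used for the 24.5 figure):

```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("VanshajR/gpt2-emotion-prefix")
model = AutoModelForCausalLM.from_pretrained("VanshajR/gpt2-emotion-prefix")
model.eval()

text = "Respond with happy emotion: How was your day today? It was amazing!"
enc = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # With labels equal to input_ids, the LM head returns the mean
    # cross-entropy over predicted tokens; perplexity is its exponential.
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"Perplexity: {math.exp(loss.item()):.1f}")
```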
Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("VanshajR/gpt2-emotion-prefix")
model = AutoModelForCausalLM.from_pretrained("VanshajR/gpt2-emotion-prefix")

# Generate emotion-controlled response
emotion = "happy"
context = "How was your day today?"
prefix = f"Respond with {emotion} emotion: {context}"

inputs = tokenizer(prefix, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=100,
    num_return_sequences=1,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
# Example: "Respond with happy emotion: How was your day today? It was amazing! I got promoted at work."
```
Supported Emotions
happy | sad | angry | fear | disgust | surprise | neutral
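A small sketch that sweeps the same context across all seven emotions (reusing the `tokenizer` and `model` loaded in the Usage section; outputs will vary between sampling runs):

```python
emotions = ["happy", "sad", "angry", "fear", "disgust", "surprise", "neutral"]
context = "How was your day today?"

for emotion in emotions:
    prefix = f"Respond with {emotion} emotion: {context}"
    inputs = tokenizer(prefix, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=40,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Strip the prompt so only the generated reply is printed
    reply = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    print(f"{emotion:>9}: {reply}")
```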
Best Practices
```python
# For better results:
# 1. Use temperature 0.7-0.9 for natural responses
# 2. Use top_p=0.9 for diversity
# 3. Keep context concise (1-2 sentences)
# 4. Prefix format: "Respond with [emotion] emotion: [context]"
```

```python
# Example with emotion verification
# (reuses the tokenizer and model loaded in the Usage section above)
from transformers import pipeline

# Load emotion classifier for verification
classifier = pipeline("text-classification", model="VanshajR/roberta-emotion-7class")

# Generate with happy emotion
emotion = "happy"
context = "Tell me about your weekend"
prefix = f"Respond with {emotion} emotion: {context}"

inputs = tokenizer(prefix, return_tensors="pt")
outputs = model.generate(**inputs, max_length=80, temperature=0.8, top_p=0.9, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Verify emotion
predicted = classifier(response)[0]
print(f"Target: {emotion}, Predicted: {predicted['label']}, Confidence: {predicted['score']:.2%}")
```
Training Details
Training Data
- Dataset: DailyDialog
- Training Samples: 76,052 context-response pairs from multi-turn conversations
- Test Samples: 6,740
- Emotions: 7 classes (happy, sad, angry, fear, disgust, surprise, neutral)
- Format: `"Respond with [emotion] emotion: [dialogue_context]" → [response]` (see the sketch below)
Training Procedure
- Optimizer: AdamW (lr=5e-5, weight_decay=0.01)
- Batch Size: 8 (with gradient accumulation=4, effective=32)
- Epochs: 3
- Max Length: 128 tokens
- Training Regime: fp32
- Gradient Clipping: 1.0
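The hyperparameters above map directly onto a standard Hugging Face `TrainingArguments` configuration; a hedged sketch (the original training script may differ, e.g. in scheduler or logging settings):

```python
from transformers import TrainingArguments

# Sketch matching the listed hyperparameters; Trainer's default optimizer
# is AdamW, so no explicit optimizer setup is needed.
training_args = TrainingArguments(
    output_dir="gpt2-emotion-prefix",
    learning_rate=5e-5,
    weight_decay=0.01,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,   # effective batch size 32
    num_train_epochs=3,
    max_grad_norm=1.0,               # gradient clipping at 1.0
    fp16=False,                      # fp32 training regime
)
```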
Compute Infrastructure
- Hardware: NVIDIA RTX 3070 (8GB VRAM)
- Training Time: ~8 hours (3 epochs)
- Framework: PyTorch 2.1.0, Transformers 4.35.0
Method: Prefix Conditioning
This model uses natural language prefix conditioning, a lightweight method that does not modify the model architecture:
- Baseline: `"How are you?"` → `"I'm fine."`
- Prefix-Conditioned: `"Respond with happy emotion: How are you?"` → `"I'm doing great! Everything's wonderful!"`
Why Prefix > Token:
- More interpretable (natural language)
- Better performance (38.2% vs 30.8%)
- No special tokens needed
- Easier to use
Limitations and Considerations
- Accuracy: 38.2% emotion match (not perfect control)
- Evaluation: Automatic classifier (57.8% accuracy), not human evaluation
- Language: English conversational text only
- Domain: Trained on daily conversations (may not work well for formal/technical text)
- Size: Small model (124M parameters), less capable than larger LLMs
What affects accuracy:
- Classifier ceiling (57.8% max possible)
- Task difficulty (human annotators disagree on 25-30% of emotion labels)
- Model size (124M vs billions in modern LLMs)
- Training data size (76K vs millions)
Comparison with Literature
| Paper | Model Size | Emotion Acc. | Dataset |
|---|---|---|---|
| Colombo et al. (2019) | 117M | 35-42% | DailyDialog |
| Zhou et al. (2018) | 256M | ~45% | DailyDialog |
| Rashkin et al. (2019) | 345M-1.5B | 40-48% | EmpatheticDialogues |
| This Work | 124M | 38.2% | DailyDialog |
Our result falls within the published range for similar lightweight methods.
Intended Use
Recommended:
- Research on controllable text generation
- Emotion-aware chatbot prototypes
- Educational demonstrations of conditioning methods
- Dialogue system experiments
Not Recommended:
- Production chatbots (use larger models)
- Mental health applications
- High-stakes communication
- Non-English languages
Citation
```bibtex
@misc{vanshajr2024gpt2emotion,
  author    = {Vanshaj R},
  title     = {GPT-2 Emotion-Conditioned Text Generation via Prefix Conditioning},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/VanshajR/gpt2-emotion-prefix}
}
```
Related Work
Part of the Emotion-Controlled Response Generation project:
- GitHub Repository
- RoBERTa Emotion Classifier
- Full Project Report
- Live Demo App
Acknowledgments
- DailyDialog Dataset: Li et al. (2017)
- Base Model: OpenAI GPT-2
- Evaluation: GoEmotions-trained RoBERTa classifier