T5-Small Fine-tuned for Academic Paper Summarization
Model Description
This is a T5-small model fine-tuned on 30,000 arXiv papers for academic text summarization.
Performance
Compared to base T5-small:
- ROUGE-1: +28.29% improvement
- ROUGE-2: +46.45% improvement ⭐
- ROUGE-L: +27.85% improvement
Additional metrics:
- BERTScore: +2.14% improvement
- BARTScore: +6.62% improvement
- FactCC: +28.24% improvement
Overall: 6 of 8 evaluated metrics improved (75% win rate). The two metrics that did not improve are not listed above.
Intended Use
This model is specifically designed for:
- Summarizing academic papers
- Generating abstracts from research articles
- Scientific document summarization
- Technical content summarization
How to Use
Quick Start
from transformers import T5Tokenizer, T5ForConditionalGeneration
# Load model
tokenizer = T5Tokenizer.from_pretrained("Bashaarat1/t5-small-arxiv-summarizer")
model = T5ForConditionalGeneration.from_pretrained("Bashaarat1/t5-small-arxiv-summarizer")
# Prepare input
text = "summarize: Your academic paper text here..."
inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)
# Generate summary
outputs = model.generate(
    **inputs,
    max_length=128,
    num_beams=4,
    early_stopping=True,
)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
Inference API
import requests
API_URL = "https://api-inference.huggingface.co/models/Bashaarat1/t5-small-arxiv-summarizer"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}
def query(payload):
    # Send the input text to the hosted model and return the parsed JSON
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()
output = query({"inputs": "summarize: Your paper text..."})
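The JSON returned by `query` is not always a summary: while the model is loading, the hosted API may return an object with an `error` field instead. Below is a minimal sketch of unpacking the response. It assumes a successful response is a list of dicts with a `generated_text` or `summary_text` key (the key depends on the pipeline tag, which this card does not state). `extract_summary` is a hypothetical helper name, not part of any library.

```python
def extract_summary(response_json):
    """Return the summary string from an Inference API response.

    Assumes a successful response looks like [{"generated_text": "..."}]
    or [{"summary_text": "..."}]; raises RuntimeError on an error payload.
    """
    # Error payloads are assumed to be dicts carrying an "error" message.
    if isinstance(response_json, dict) and "error" in response_json:
        raise RuntimeError(f"Inference API error: {response_json['error']}")
    if not isinstance(response_json, list) or not response_json:
        raise ValueError(f"Unexpected response shape: {response_json!r}")
    first = response_json[0]
    if isinstance(first, dict):
        # Accept either key, since the pipeline type is not known here.
        for key in ("summary_text", "generated_text"):
            if key in first:
                return first[key]
    raise ValueError(f"No summary field in response: {first!r}")
```

With this helper, `summary = extract_summary(output)` would either yield the text or fail loudly instead of silently passing an error object downstream.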
Training Details
Training Data
- Dataset: arXiv papers
- Size: 30,000 training samples
- Validation: 2,000 samples
- Test: 1,000 samples
Training Procedure
- Base Model: t5-small (60M parameters)
- Epochs: 3
- Batch Size: 8 (effective: 32 with gradient accumulation)
- Learning Rate: 5e-5
- Optimizer: AdamW (8-bit)
- Hardware: NVIDIA A100-80GB
- Training Time: ~3 hours
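The batch-size figures above imply a gradient accumulation factor and an approximate number of optimizer updates. The arithmetic below is a sketch under two assumptions that the card does not state: training ran on a single GPU, and the final partial batch of each epoch was kept. The exact step count depends on how the training loop handles that last batch.

```python
import math

train_samples = 30_000   # from Training Data
per_device_batch = 8     # from Training Procedure
effective_batch = 32     # from Training Procedure
epochs = 3

# Accumulation steps implied by the two batch sizes (assumes one GPU).
grad_accum_steps = effective_batch // per_device_batch

# Optimizer updates per epoch, assuming the last partial batch is kept.
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs

print(grad_accum_steps, steps_per_epoch, total_steps)  # 4 938 2814
```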
Hyperparameters
- max_input_length: 512
- max_target_length: 128
- num_beams: 4
- learning_rate: 5e-5
- warmup_steps: 500
- weight_decay: 0.01
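For readers who want to set up a similar run, the values above map onto keyword arguments of `Seq2SeqTrainingArguments` in `transformers` roughly as follows. This is a sketch, not the author's training script: the `output_dir` value is a placeholder, `gradient_accumulation_steps=4` is inferred from the 8 versus 32 batch sizes, and `optim="adamw_bnb_8bit"` is an assumption about which 8-bit AdamW implementation was used (that option requires `bitsandbytes`).

```python
# Hypothetical mapping of the card's hyperparameters to
# transformers.Seq2SeqTrainingArguments keyword arguments.
training_kwargs = {
    "output_dir": "t5-small-arxiv",      # placeholder path (assumption)
    "num_train_epochs": 3,
    "per_device_train_batch_size": 8,
    "gradient_accumulation_steps": 4,    # inferred: 32 / 8
    "learning_rate": 5e-5,
    "warmup_steps": 500,
    "weight_decay": 0.01,
    "optim": "adamw_bnb_8bit",           # assumption: bitsandbytes AdamW
    "predict_with_generate": True,
    "generation_max_length": 128,        # max_target_length
    "generation_num_beams": 4,           # num_beams
}

# Tokenization limits are applied during preprocessing, not by the trainer.
MAX_INPUT_LENGTH = 512
MAX_TARGET_LENGTH = 128

# Usage (requires transformers; not executed here):
# from transformers import Seq2SeqTrainingArguments
# args = Seq2SeqTrainingArguments(**training_kwargs)
```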
Evaluation
Evaluated on 1,000 arXiv test papers:
| Metric | Base T5-small | Fine-tuned | Improvement |
|---|---|---|---|
| ROUGE-1 | 0.2200 | 0.2823 | +28.29% |
| ROUGE-2 | 0.0564 | 0.0826 | +46.45% |
| ROUGE-L | 0.1405 | 0.1796 | +27.85% |
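The Improvement column is a relative change with respect to the base score. A short sketch of that calculation follows; recomputing from the four-decimal scores in the table can differ from the reported percentages by a few hundredths of a point, presumably because the reported figures were derived from unrounded scores.

```python
def relative_improvement(base, tuned):
    """Percentage change of `tuned` relative to `base`."""
    return (tuned - base) / base * 100

# (base T5-small, fine-tuned) scores copied from the table above.
scores = {
    "ROUGE-1": (0.2200, 0.2823),
    "ROUGE-2": (0.0564, 0.0826),
    "ROUGE-L": (0.1405, 0.1796),
}

for name, (base, tuned) in scores.items():
    print(f"{name}: {relative_improvement(base, tuned):+.2f}%")
```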
Limitations
- Optimized for academic/scientific text
- May not perform as well on general-domain text
- Maximum input length: 512 tokens
- Works best with English text
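Because the Quick Start code truncates inputs beyond 512 tokens, one common workaround for long papers is to split the text into chunks and summarize each chunk separately. The sketch below splits on whitespace and uses a word budget as a rough stand-in for the token limit. `chunk_words` and the 350-word default are illustrative assumptions, not part of this model's pipeline.

```python
def chunk_words(text, max_words=350):
    """Split `text` into consecutive chunks of at most `max_words` words."""
    if max_words < 1:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

paper = "word " * 800
chunks = chunk_words(paper)
print([len(c.split()) for c in chunks])  # [350, 350, 100]
```

Each chunk would then be prefixed with `summarize: ` and passed through the Quick Start code. Word counts only approximate token counts, so keeping `truncation=True` as a safety net is still advisable.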
Citation
If you use this model, please cite:
@misc{t5-arxiv-summarizer,
author = {Bashaarat1},
title = {T5-Small Fine-tuned for Academic Summarization},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Bashaarat1/t5-small-arxiv-summarizer}}
}
License
This model is released under the Apache 2.0 License, the same license as the T5-small base model.
Contact
For questions or issues, please open an issue on the model repository.
Model trained and uploaded: December 2024