T5-Small Fine-tuned for Academic Paper Summarization

Model Description

This is a T5-small model fine-tuned on 30,000 arXiv papers for academic text summarization.

Performance

Compared to base T5-small:

  • ROUGE-1: +28.29% improvement
  • ROUGE-2: +46.45% improvement ⭐
  • ROUGE-L: +27.85% improvement

Additional metrics:

  • BERTScore: +2.14% improvement
  • BARTScore: +6.62% improvement
  • FactCC: +28.24% improvement

Overall: 6 of 8 evaluated metrics improved (75% win rate); the six improved metrics are listed above.

Intended Use

This model is specifically designed for:

  • Summarizing academic papers
  • Generating abstracts from research articles
  • Scientific document summarization
  • Technical content summarization

How to Use

Quick Start

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load model
tokenizer = T5Tokenizer.from_pretrained("Bashaarat1/t5-small-arxiv-summarizer")
model = T5ForConditionalGeneration.from_pretrained("Bashaarat1/t5-small-arxiv-summarizer")

# Prepare input
text = "summarize: Your academic paper text here..."
inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)

# Generate summary
outputs = model.generate(
    **inputs,
    max_length=128,
    num_beams=4,
    early_stopping=True
)

summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)

Inference API

import requests

API_URL = "https://api-inference.huggingface.co/models/Bashaarat1/t5-small-arxiv-summarizer"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

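def query(payload):
    # POST the payload as JSON and return the decoded JSON response
    response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
    return response.json()

# The "summarize: " prefix matches the one used in the Quick Start example
output = query({"inputs": "summarize: Your paper text..."})
print(output)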
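This card does not document the shape of the JSON that the endpoint returns. The sketch below shows one way the result of `query` could be unpacked. It assumes a successful reply is a list of dicts carrying either a `summary_text` or a `generated_text` key, and that a failure is a dict with an `error` key. Both shapes are assumptions about the hosted API, and `extract_summary` is a hypothetical helper, not part of any library.

```python
def extract_summary(output):
    """Pull the summary string out of an Inference API response.

    Assumed shapes (not confirmed by the model card):
      success -> [{"summary_text": "..."}] or [{"generated_text": "..."}]
      failure -> {"error": "..."}
    """
    # Error replies are assumed to be a dict with an "error" key.
    if isinstance(output, dict) and "error" in output:
        raise RuntimeError(f"Inference API error: {output['error']}")

    # Success replies are assumed to be a non-empty list of dicts.
    if isinstance(output, list) and output and isinstance(output[0], dict):
        for key in ("summary_text", "generated_text"):
            if key in output[0]:
                return output[0][key]

    raise ValueError(f"Unexpected response shape: {output!r}")
```

Calling `extract_summary(output)` on the value returned by `query` would then yield the summary string, or raise if the reply does not match either assumed shape.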

Training Details

Training Data

  • Dataset: arXiv papers
  • Size: 30,000 training samples
  • Validation: 2,000 samples
  • Test: 1,000 samples

Training Procedure

  • Base Model: t5-small (60M parameters)
  • Epochs: 3
  • Batch Size: 8 (effective: 32 with gradient accumulation)
  • Learning Rate: 5e-5
  • Optimizer: AdamW (8-bit)
  • Hardware: NVIDIA A100-80GB
  • Training Time: ~3 hours
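The batch-size figures above imply a gradient accumulation factor and a rough number of optimizer updates. The arithmetic below is a sketch that assumes a single GPU and counts the final partial batch of each epoch as one update. The exact count depends on how the training loop handles that last batch, which this card does not specify.

```python
import math

train_samples = 30_000   # from "Training Data"
per_step_batch = 8       # batch size per forward/backward pass
effective_batch = 32     # batch size after gradient accumulation
epochs = 3

# Accumulation steps implied by the two batch sizes (assumes one GPU).
accumulation_steps = effective_batch // per_step_batch

# One optimizer update per effective batch; a final partial batch is
# counted as a full update here (an assumption).
updates_per_epoch = math.ceil(train_samples / effective_batch)
total_updates = updates_per_epoch * epochs

print(accumulation_steps, updates_per_epoch, total_updates)
# prints: 4 938 2814
```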

Hyperparameters

- max_input_length: 512
- max_target_length: 128
- num_beams: 4
- learning_rate: 5e-5
- warmup_steps: 500
- weight_decay: 0.01
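The card lists `warmup_steps` and `learning_rate` but does not name the learning-rate scheduler. As an illustration only, the sketch below assumes a linear warmup followed by a linear decay to zero. The total of 2814 optimizer steps is also an assumption, derived from 30,000 samples, an effective batch size of 32 and 3 epochs. `linear_warmup_lr` is a hypothetical helper.

```python
def linear_warmup_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=2814):
    """Learning rate at a given optimizer step.

    Assumes linear warmup then linear decay to zero; the scheduler actually
    used for this model is not stated in the card.
    """
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 1657, 2814):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the warmup covers roughly the first 18% of training (500 of 2814 steps), and the rate reaches its peak of 5e-5 at step 500.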

Evaluation

Evaluated on 1,000 arXiv test papers:

Metric    Base T5-small   Fine-tuned   Improvement
ROUGE-1   0.2200          0.2823       +28.29%
ROUGE-2   0.0564          0.0826       +46.45%
ROUGE-L   0.1405          0.1796       +27.85%
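The improvement column appears to be the relative change of the fine-tuned score over the base score. The sketch below recomputes it from the scores in the table; `relative_improvement` is a hypothetical helper. Because the scores shown are rounded to four decimals, the recomputed values agree with the reported ones only to within a few hundredths of a percentage point.

```python
def relative_improvement(base, fine_tuned):
    """Relative change of fine_tuned over base, in percent."""
    return (fine_tuned - base) / base * 100.0


# (base, fine-tuned) scores copied from the evaluation table
scores = {
    "ROUGE-1": (0.2200, 0.2823),
    "ROUGE-2": (0.0564, 0.0826),
    "ROUGE-L": (0.1405, 0.1796),
}

for metric, (base, tuned) in scores.items():
    print(f"{metric}: {relative_improvement(base, tuned):+.2f}%")
```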

Limitations

  • Optimized for academic/scientific text
  • May not perform as well on general-domain text
  • Maximum input length: 512 tokens
  • Works best with English text
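Most full papers are far longer than the 512-token limit. One possible workaround, not part of this model's documented usage, is to split the tokenized paper into overlapping windows and summarize each window separately. The sketch below shows only the splitting step on a plain list of token ids. `chunk_token_ids` and the overlap of 64 tokens are hypothetical choices, and the quality of chunk-wise summaries is untested here.

```python
def chunk_token_ids(token_ids, max_len=512, overlap=64):
    """Split a list of token ids into windows of at most max_len ids.

    Consecutive windows share `overlap` ids so that a sentence cut at a
    window boundary still appears intact in one of the two windows.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    stride = max_len - overlap
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(token_ids[start:start + max_len])
        # Stop once a window has reached the end of the input.
        if start + max_len >= len(token_ids):
            break
    return chunks


# 1,000 dummy ids stand in for a tokenized paper.
windows = chunk_token_ids(list(range(1000)))
print([len(w) for w in windows])
# prints: [512, 512, 104]
```

In practice `max_len` would need to be somewhat smaller than 512 to leave room for the `summarize: ` prefix and the end-of-sequence token that the tokenizer appends to each window.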

Citation

If you use this model, please cite:

@misc{t5-arxiv-summarizer,
  author = {Bashaarat1},
  title = {T5-Small Fine-tuned for Academic Summarization},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Bashaarat1/t5-small-arxiv-summarizer}}
}

License

This model is released under the Apache 2.0 License, the same license as the base T5-small model.

Contact

For questions or issues, please open an issue on the model repository.


Model trained and uploaded: December 2024
