---
language: en
library_name: transformers
license: apache-2.0
base_model: sshleifer/distilbart-cnn-12-6
tags:
  - summarization
  - text-generation
  - fine-tuned-model
  - bart
model-index:
  - name: General Text Summarizer
    results:
      - task:
          type: summarization
          name: Text Summarization
        dataset:
          name: CNN/DailyMail
          type: cnn_dailymail
        metrics:
          - name: Rouge1
            type: rouge
            value: 36.61
          - name: Rouge2
            type: rouge
            value: 16.51
          - name: RougeL
            type: rouge
            value: 26.24
          - name: RougeLsum
            type: rouge
            value: 33.45
---

# 🧠 General Text Summarizer

This model is a fine-tuned version of [sshleifer/distilbart-cnn-12-6](https://huggingface.co/sshleifer/distilbart-cnn-12-6), trained to generate concise and fluent summaries of general English text, including news articles, essays, stories, and blog posts.


## 🚀 Model Description

- Base model: DistilBART (CNN/DailyMail)
- Framework: 🤗 Transformers (PyTorch)
- Training goal: summarize text across multiple domains (not limited to one topic)
- Device optimized: CPU & Apple M-series chips (MPS compatible)

This model is suitable for lightweight summarization tasks on laptops or limited-resource machines.
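
Since the card highlights CPU and Apple M-series support, here is a minimal sketch of device-aware loading. The `torch.backends.mps.is_available()` check and the pipeline `device` argument are standard PyTorch/Transformers APIs; nothing here is specific to this model:

```python
import torch
from transformers import pipeline

# Prefer Apple's MPS backend when available, otherwise fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"

summarizer = pipeline(
    "summarization",
    model="Fathi7ma/general_text_summarizer_cpu",
    device=device,
)
```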


## 🧾 Example Usage

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="Fathi7ma/general_text_summarizer_cpu")

text = """Climate change continues to affect weather patterns across the globe.
Scientists warn that without immediate action, rising temperatures may lead to
irreversible damage to ecosystems and human livelihoods."""

summary = summarizer(text, max_length=80, min_length=25, do_sample=False)
print(summary[0]["summary_text"])
```
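
Note that `do_sample=False` disables sampling, so decoding is deterministic: repeated runs on the same input return the same summary. `max_length` and `min_length` bound the summary length in tokens, not characters.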

## Intended uses

This model can summarize:

- News articles
- Research abstracts
- Reports and blogs
- Long paragraphs of general English text

Example domains: general news, education, business summaries, and everyday content.
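
Inputs longer than the encoder's context window (1,024 tokens for BART-family models) need truncation or chunking. A minimal sketch reusing the `summarizer` from the example above, with `long_text` as a stand-in for your own document:

```python
# Truncate anything beyond the 1,024-token encoder limit instead of erroring.
summary = summarizer(long_text, max_length=80, min_length=25, truncation=True)
print(summary[0]["summary_text"])
```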

## Training

- Dataset: a subset of CNN/DailyMail, filtered and balanced for general summarization (a loading sketch follows this list).
- Approximately 10,000 samples used for CPU-efficient fine-tuning.
- Texts were trimmed and normalized for readability.
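
A hypothetical recreation of that subset with 🤗 Datasets. The slice size matches the ~10,000 samples mentioned above, but the exact filtering and balancing steps are not published, so treat this as an assumption:

```python
from datasets import load_dataset

# Assumption: a simple head slice of CNN/DailyMail v3.0.0; the actual
# filtering/balancing used for this model is not documented here.
raw = load_dataset("cnn_dailymail", "3.0.0", split="train[:10000]")
print(raw[0]["article"][:200], "...")
```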

## Training hyperparameters

The following hyperparameters were used during training (a hypothetical code equivalent follows the list):

- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 1
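
A hypothetical `Seq2SeqTrainingArguments` mirroring the values above; `output_dir` is an illustrative name, not part of the original configuration:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="general_text_summarizer_cpu",  # assumption: illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch_fused",   # AdamW, torch fused implementation
    lr_scheduler_type="linear",
    num_train_epochs=1,
)
```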

## Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 2.2534        | 1.0   | 600  | 2.1023          | 36.61  | 16.51  | 26.24  | 33.45     |
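
The ROUGE columns are percentages. A minimal sketch of scoring with the 🤗 `evaluate` library (not necessarily the exact evaluation script used for this card), with placeholder strings standing in for real model outputs and references:

```python
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["model-generated summary ..."],  # placeholder
    references=["human reference summary ..."],   # placeholder
)
# evaluate returns fractions in [0, 1]; multiply by 100 to match the table.
print({k: round(v * 100, 2) for k, v in scores.items()})
```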

## Framework versions

- Transformers 4.57.1
- PyTorch 2.9.0
- Datasets 4.3.0
- Tokenizers 0.22.1