sentiment-imdb-distilbert

Model Description

Fine-tuned distilbert-base-uncased for binary sentiment classification on the IMDB movie review dataset.

Training Date: 2025-11-18 14:27:43

Intended Use

This model classifies text as positive or negative sentiment. It was trained on movie reviews and may generalize to other domains, though this has not been verified (see Limitations).

Performance

Metric      Score
Accuracy    0.9304
F1 Score    0.9306
Precision   0.9283
Recall      0.9330
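
For reference, the sketch below shows one way to recompute these metrics with scikit-learn. It assumes `preds` and `labels` arrays (0 = negative, 1 = positive) for the IMDB test split; the names and helper are illustrative, not taken from the original evaluation code.

# Hypothetical sketch: recomputing the metrics above from predictions.
# `preds` and `labels` are assumed to be arrays of 0/1 class ids for the
# 25,000-example IMDB test split; names are illustrative only.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def summarize(preds, labels):
    accuracy = accuracy_score(labels, preds)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}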

Training Hyperparameters

  • Model: distilbert-base-uncased
  • Epochs: 4
  • Batch Size: 32
  • Learning Rate: 2e-05
  • Max Length: 512
  • Warmup Ratio: 0.1
  • Weight Decay: 0.01
  • Training Samples: Full dataset (25,000)
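
The exact training script is not included in this card. The following is a minimal sketch of a fine-tuning setup consistent with the hyperparameters above, assuming the standard datasets/transformers Trainer API.

# Illustrative fine-tuning sketch matching the listed hyperparameters.
# Not the original training script; details such as logging and evaluation
# strategy are omitted.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# IMDB: 25,000 labeled training reviews, 25,000 labeled test reviews
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sentiment-imdb-distilbert",
    num_train_epochs=4,
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()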

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "PierrunoYT/sentiment-imdb-distilbert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

def predict(text):
    # Tokenize and truncate to the 512-token limit used during training
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    # Index 1 is the positive class, index 0 the negative class
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    return "Positive" if probs[0][1] > 0.5 else "Negative"

print(predict("This movie was amazing!"))  # Positive
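
For batched inputs, the convenience helper below extends predict() to a list of texts and also returns the positive-class probability; it is a sketch, not part of the original example.

# Optional: score several reviews at once and expose the positive-class probability.
def predict_batch(texts):
    inputs = tokenizer(texts, return_tensors="pt", truncation=True,
                       padding=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.nn.functional.softmax(logits, dim=-1)
    return [{"label": "Positive" if p[1] > 0.5 else "Negative",
             "positive_prob": p[1].item()} for p in probs]

print(predict_batch(["This movie was amazing!", "A dull, forgettable film."]))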

Limitations

  • Trained primarily on movie reviews; performance may vary on other text types
  • May reflect biases present in the IMDB dataset
  • English language only

Citation

@misc{sentiment-imdb-distilbert,
  author = {PierrunoYT},
  title = {Sentiment Analysis with distilbert-base-uncased},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/PierrunoYT/sentiment-imdb-distilbert}}
}