XLM-R_WOR

Model Description

XLM-R_WOR is a Telugu sentiment classification model built on XLM-RoBERTa (XLM-R), a large-scale multilingual Transformer model developed by Facebook AI. XLM-R is designed to enhance cross-lingual understanding by leveraging a substantially larger and more diverse pretraining corpus than mBERT.

The base model is pretrained on approximately 2.5 TB of filtered Common Crawl data covering 100 languages, including Telugu. Unlike mBERT, XLM-R is trained exclusively with the Masked Language Modeling (MLM) objective and drops the Next Sentence Prediction (NSP) task. This design choice, together with the larger and more diverse corpus, yields stronger contextual representations and improved transfer learning.

The suffix WOR denotes Without Rationale supervision: the model is fine-tuned using only sentiment labels, without incorporating human-annotated rationales, and serves as the label-only baseline against rationale-supervised (WR) counterparts.


Pretraining Details

  • Pretraining corpus: Filtered Common Crawl (≈2.5 TB, 100 languages)
  • Training objective: Masked Language Modeling (MLM); an illustrative masking sketch follows this list
  • Next Sentence Prediction: Not used
  • Language coverage: Telugu included, but not exclusively targeted
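
To make the MLM objective concrete, the sketch below reproduces the masking mechanism with the public xlm-roberta-base checkpoint and DataCollatorForLanguageModeling. The 15% masking probability mirrors the standard MLM setup; the Telugu sentence is purely illustrative and not taken from the training corpus.

```python
# Illustrative sketch of the MLM objective used to pretrain XLM-R (no NSP).
# Assumes the public "xlm-roberta-base" checkpoint; masking rate is the
# conventional 15%, not a value reported for this specific model.
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

# "Telugu is a Dravidian language" (example sentence)
batch = collator([tokenizer("తెలుగు ఒక ద్రావిడ భాష")])
print(batch["input_ids"][0])  # some positions replaced by the <mask> token id
print(batch["labels"][0])     # original ids at masked positions, -100 elsewhere
```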

Training Data

  • Fine-tuning dataset: Telugu-Dataset
  • Task: Sentiment classification
  • Supervision type: Label-only (no rationale supervision); a minimal fine-tuning sketch follows this list
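
As a rough guide, the label-only fine-tuning procedure can be sketched as follows. The CSV file names, number of labels, and hyperparameters are placeholders for illustration and are not taken from the original training setup.

```python
# A minimal label-only fine-tuning sketch (no rationale supervision),
# assuming a Telugu sentiment dataset with "text" and "label" columns.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Placeholder file paths; replace with the actual Telugu sentiment splits.
dataset = load_dataset("csv", data_files={"train": "telugu_sentiment_train.csv",
                                          "validation": "telugu_sentiment_dev.csv"})
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

# num_labels=3 (negative/neutral/positive) is an assumption for illustration.
model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base",
                                                           num_labels=3)

args = TrainingArguments(output_dir="xlmr-wor", learning_rate=2e-5,
                         per_device_train_batch_size=16, num_train_epochs=3)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["validation"],
                  tokenizer=tokenizer)
trainer.train()
```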

Intended Use

This model is intended for:

  • Telugu sentiment classification
  • Cross-lingual and multilingual NLP benchmarking
  • Baseline comparisons for explainability and rationale-supervision studies
  • Low-resource Telugu NLP research

Due to its large-scale multilingual pretraining, XLM-R_WOR is particularly effective for transfer learning scenarios where Telugu-specific labeled data is limited.
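
A minimal usage sketch, assuming the checkpoint is published under the DSL-13-SRMAP/XLM-R_WOR identifier shown on this page; the predicted label names depend on the id2label mapping stored in the model config, and the Telugu sentence is illustrative.

```python
# Quick-start inference sketch via the transformers pipeline API.
from transformers import pipeline

classifier = pipeline("text-classification", model="DSL-13-SRMAP/XLM-R_WOR")
# Telugu: "This movie is very good"
print(classifier("ఈ సినిమా చాలా బాగుంది"))
```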


Performance Characteristics

XLM-R generally provides stronger contextual modeling and improved downstream performance compared to mBERT, owing to its larger and more diverse pretraining corpus and exclusive focus on the MLM objective.

Strengths

  • Strong cross-lingual transfer learning
  • Improved contextual representations over mBERT
  • Reliable baseline for multilingual sentiment analysis

Limitations

  • Not explicitly optimized for Telugu morphology or syntax
  • May underperform relative to models specialized for Indian languages or Telugu, such as MuRIL or L3Cube-Telugu-BERT
  • Limited ability to capture fine-grained cultural and regional linguistic nuances

Use as a Baseline

XLM-R_WOR serves as a robust and widely accepted baseline for:

  • Comparing multilingual models against Telugu-specialized architectures
  • Evaluating the impact of rationale supervision (WOR vs. WR)
  • Benchmarking sentiment classification performance in low-resource Telugu settings (see the evaluation sketch below)
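
A hedged evaluation sketch for such comparisons is shown below. It assumes a held-out Telugu test set with text and label fields and standard sequence-classification heads on both checkpoints; any WR model identifier is a placeholder.

```python
# Sketch for comparing checkpoints (e.g. WOR vs. WR) on a shared test split.
import torch
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def evaluate(model_id, texts, labels, batch_size=32):
    """Return (accuracy, macro-F1) for a sequence-classification checkpoint."""
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()
    preds = []
    for i in range(0, len(texts), batch_size):
        enc = tok(texts[i:i + batch_size], padding=True, truncation=True,
                  max_length=128, return_tensors="pt")
        with torch.no_grad():
            preds.extend(model(**enc).logits.argmax(dim=-1).tolist())
    return accuracy_score(labels, preds), f1_score(labels, preds, average="macro")

# texts, labels = ...  # load the Telugu test split here
# for mid in ["DSL-13-SRMAP/XLM-R_WOR", "path/to/XLM-R_WR"]:  # WR id is a placeholder
#     print(mid, evaluate(mid, texts, labels))
```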
