XLM-R_WOR
Model Description
XLM-R_WOR is a Telugu sentiment classification model built on XLM-RoBERTa (XLM-R), a large-scale multilingual Transformer model developed by Facebook AI. XLM-R is designed to enhance cross-lingual understanding by leveraging a substantially larger and more diverse pretraining corpus than mBERT.
The base model is pretrained on approximately 2.5 TB of filtered Common Crawl data covering 100 languages, including Telugu. Unlike mBERT, XLM-R is trained exclusively with the Masked Language Modeling (MLM) objective, without the Next Sentence Prediction (NSP) task. Together with the larger corpus, this design choice yields stronger contextual representations and improved transfer learning.
The suffix WOR denotes Without Rationale supervision. This model is fine-tuned using only sentiment labels, without incorporating human-annotated rationales, and serves as a label-only baseline.
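For orientation, here is a minimal inference sketch using the Hugging Face `pipeline` API. The returned label names depend on the fine-tuning configuration (check `model.config.id2label`), and the Telugu input sentence is purely illustrative.

```python
from transformers import pipeline

# Load the fine-tuned sentiment classifier from the Hub.
classifier = pipeline("text-classification", model="DSL-13-SRMAP/XLM-R_WOR")

# Illustrative Telugu input (roughly: "This movie is very good").
result = classifier("ఈ సినిమా చాలా బాగుంది")
print(result)  # e.g. [{'label': ..., 'score': ...}]; labels follow model.config.id2label
```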
Pretraining Details
- Pretraining corpus: filtered Common Crawl (≈2.5 TB, 100 languages)
- Training objective:
  - Masked Language Modeling (MLM); see the fill-mask sketch after this list
  - Next Sentence Prediction (NSP): not used
- Language coverage: Telugu included, but not exclusively targeted
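To make the MLM objective concrete, the sketch below probes the base model (not this fine-tuned checkpoint) with the fill-mask pipeline; XLM-R's mask token is `<mask>`. The example sentence is arbitrary and serves only to show the pretraining task.

```python
from transformers import pipeline

# The base model was pretrained to recover masked tokens (MLM only, no NSP).
fill_mask = pipeline("fill-mask", model="FacebookAI/xlm-roberta-base")

# XLM-R uses "<mask>" as its mask token; this works across its pretraining languages.
for pred in fill_mask("The capital of France is <mask>.", top_k=3):
    print(pred["token_str"], round(pred["score"], 3))
```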
Training Data
- Fine-tuning dataset: Telugu-Dataset
- Task: Sentiment classification
- Supervision type: label-only (no rationale supervision); see the fine-tuning sketch below
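The sketch below outlines label-only fine-tuning of the base model, as done for this checkpoint. The CSV file name, the `text`/`label` column names, the three-way label set, and the hyperparameters are all placeholder assumptions to be adapted to the actual Telugu-Dataset.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Placeholder file and column names; the label column holds integer class ids.
dataset = load_dataset("csv", data_files={"train": "telugu_train.csv"})

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "FacebookAI/xlm-roberta-base",
    num_labels=3,  # assumed positive/neutral/negative; adjust to the label set
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="xlmr_wor_telugu",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

# Label-only supervision: standard cross-entropy on sentiment labels,
# with no rationale annotations involved.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```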
Intended Use
This model is intended for:
- Telugu sentiment classification
- Cross-lingual and multilingual NLP benchmarking
- Baseline comparisons for explainability and rationale-supervision studies
- Low-resource Telugu NLP research
Due to its large-scale multilingual pretraining, XLM-R_WOR is particularly effective for transfer learning scenarios where Telugu-specific labeled data is limited.
Performance Characteristics
XLM-R generally provides stronger contextual modeling and improved downstream performance compared to mBERT, owing to its larger and more diverse pretraining corpus and exclusive focus on the MLM objective.
Strengths
- Strong cross-lingual transfer learning
- Improved contextual representations over mBERT
- Reliable baseline for multilingual sentiment analysis
Limitations
- Not explicitly optimized for Telugu morphology or syntax
- May underperform compared to Telugu-specialized models such as MuRIL or L3Cube-Telugu-BERT
- Limited ability to capture fine-grained cultural and regional linguistic nuances
Use as a Baseline
XLM-R_WOR serves as a robust and widely accepted baseline for:
- Comparing multilingual models against Telugu-specialized architectures
- Evaluating the impact of rationale supervision (WOR vs. WR); see the evaluation sketch below
- Benchmarking sentiment classification performance in low-resource Telugu settings
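A small evaluation harness along these lines could support the WOR-vs-WR comparison; note that `DSL-13-SRMAP/XLM-R_WR` is a hypothetical counterpart name, and gold labels must use the same strings as each model's `config.id2label`.

```python
from sklearn.metrics import accuracy_score, f1_score
from transformers import pipeline

def evaluate(model_name: str, texts: list[str], gold: list[str]) -> None:
    """Score a sentiment classifier on a held-out Telugu test set."""
    clf = pipeline("text-classification", model=model_name)
    preds = [out["label"] for out in clf(texts)]
    print(f"{model_name}: acc={accuracy_score(gold, preds):.3f}, "
          f"macro-F1={f1_score(gold, preds, average='macro'):.3f}")

# Hypothetical usage; XLM-R_WR denotes a rationale-supervised counterpart.
# evaluate("DSL-13-SRMAP/XLM-R_WOR", test_texts, test_labels)
# evaluate("DSL-13-SRMAP/XLM-R_WR", test_texts, test_labels)
```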
References
- Conneau et al., 2019
- Hedderich et al., 2021
- Kulkarni et al., 2021
- Joshi, 2022
- Das et al., 2022
- Rajalakshmi et al., 2023
Base Model
- FacebookAI/xlm-roberta-base