XLM-R_WOR
Model Description
XLM-R_WOR is a Telugu sentiment classification model built on XLM-RoBERTa (XLM-R), a large-scale multilingual Transformer model developed by Facebook AI. XLM-R is designed to enhance cross-lingual understanding by leveraging a substantially larger and more diverse pretraining corpus than mBERT.
The base model is pretrained on approximately 2.5 TB of filtered Common Crawl data covering 100 languages, including Telugu. Unlike mBERT, XLM-R is trained exclusively with the Masked Language Modeling (MLM) objective, without the Next Sentence Prediction (NSP) task. Together with the larger corpus, this design choice yields stronger contextual representations and improved transfer learning.
The suffix WOR denotes Without Rationale supervision. This model is fine-tuned using only sentiment labels, without incorporating human-annotated rationales, and serves as a label-only baseline.
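For orientation, here is a minimal inference sketch using the Hugging Face `pipeline` API. The returned label names depend on the fine-tuning configuration (check `model.config.id2label`), and the Telugu input sentence is purely illustrative.

```python
from transformers import pipeline

# Load the fine-tuned sentiment classifier from the Hub.
classifier = pipeline("text-classification", model="DSL-13-SRMAP/XLM-R_WOR")

# Illustrative Telugu input (roughly: "This movie is very good").
result = classifier("ఈ సినిమా చాలా బాగుంది")
print(result)  # e.g. [{'label': ..., 'score': ...}]; labels follow model.config.id2label
```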
Pretraining Details
- Pretraining corpus: filtered Common Crawl (≈2.5 TB, 100 languages)
- Training objective:
  - Masked Language Modeling (MLM); see the fill-mask sketch after this list
  - Next Sentence Prediction (NSP): not used
- Language coverage: Telugu included, but not exclusively targeted
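To make the MLM objective concrete, the sketch below probes the base model (not this fine-tuned checkpoint) with the fill-mask pipeline; XLM-R's mask token is `<mask>`. The example sentence is arbitrary and serves only to show the pretraining task.

```python
from transformers import pipeline

# The base model was pretrained to recover masked tokens (MLM only, no NSP).
fill_mask = pipeline("fill-mask", model="FacebookAI/xlm-roberta-base")

# XLM-R uses "<mask>" as its mask token; this works across its pretraining languages.
for pred in fill_mask("The capital of France is <mask>.", top_k=3):
    print(pred["token_str"], round(pred["score"], 3))
```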
Training Data
- Fine-tuning dataset: Telugu-Dataset
- Task: Sentiment classification
- Supervision type: label-only (no rationale supervision); see the fine-tuning sketch below
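The sketch below outlines label-only fine-tuning of the base model, as done for this checkpoint. The CSV file name, the `text`/`label` column names, the three-way label set, and the hyperparameters are all placeholder assumptions to be adapted to the actual Telugu-Dataset.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Placeholder file and column names; the label column holds integer class ids.
dataset = load_dataset("csv", data_files={"train": "telugu_train.csv"})

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "FacebookAI/xlm-roberta-base",
    num_labels=3,  # assumed positive/neutral/negative; adjust to the label set
)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="xlmr_wor_telugu",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=2e-5,
)

# Label-only supervision: standard cross-entropy on sentiment labels,
# with no rationale annotations involved.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```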
Intended Use
This model is intended for:
- Telugu sentiment classification
- Cross-lingual and multilingual NLP benchmarking
- Baseline comparisons for explainability and rationale-supervision studies
- Low-resource Telugu NLP research
Due to its large-scale multilingual pretraining, XLM-R_WOR is particularly effective for transfer learning scenarios where Telugu-specific labeled data is limited.
Performance Characteristics
XLM-R generally provides stronger contextual modeling and improved downstream performance compared to mBERT, owing to its larger and more diverse pretraining corpus and exclusive focus on the MLM objective.
Strengths
- Strong cross-lingual transfer learning
- Improved contextual representations over mBERT
- Reliable baseline for multilingual sentiment analysis
Limitations
- Not explicitly optimized for Telugu morphology or syntax
- May underperform compared to Telugu-specialized models such as MuRIL or L3Cube-Telugu-BERT
- Limited ability to capture fine-grained cultural and regional linguistic nuances
Use as a Baseline
XLM-R_WOR serves as a robust and widely accepted baseline for:
- Comparing multilingual models against Telugu-specialized architectures
- Evaluating the impact of rationale supervision (WOR vs. WR); see the evaluation sketch below
- Benchmarking sentiment classification performance in low-resource Telugu settings
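A small evaluation harness along these lines could support the WOR-vs-WR comparison; note that `DSL-13-SRMAP/XLM-R_WR` is a hypothetical counterpart name, and gold labels must use the same strings as each model's `config.id2label`.

```python
from sklearn.metrics import accuracy_score, f1_score
from transformers import pipeline

def evaluate(model_name: str, texts: list[str], gold: list[str]) -> None:
    """Score a sentiment classifier on a held-out Telugu test set."""
    clf = pipeline("text-classification", model=model_name)
    preds = [out["label"] for out in clf(texts)]
    print(f"{model_name}: acc={accuracy_score(gold, preds):.3f}, "
          f"macro-F1={f1_score(gold, preds, average='macro'):.3f}")

# Hypothetical usage; XLM-R_WR denotes a rationale-supervised counterpart.
# evaluate("DSL-13-SRMAP/XLM-R_WOR", test_texts, test_labels)
# evaluate("DSL-13-SRMAP/XLM-R_WR", test_texts, test_labels)
```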
References
- Conneau et al., 2019
- Hedderich et al., 2021
- Kulkarni et al., 2021
- Joshi, 2022
- Das et al., 2022
- Rajalakshmi et al., 2023
Base Model
- FacebookAI/xlm-roberta-base