---
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- ar
base_model: LiquidAI/LFM2-1.2B-RAG
tags:
- arabic
- rag
- question-answering
- fine-tuned
- adalora
- liquid
- extractive-qa
datasets:
- hsseinmz/arcd
library_name: transformers
pipeline_tag: question-answering
---

# LFM2-1.2B-RAG Arabic (AdaLoRA Fine-tuned)

A fine-tuned version of [LiquidAI/LFM2-1.2B-RAG](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG) for Arabic reading comprehension and question answering, trained with the **AdaLoRA (Adaptive Low-Rank Adaptation)** technique.

## 🏆 Performance

### Arabic Broad Benchmark (ABB) - Local Evaluation

Evaluated using the official [ABB benchmark](https://huggingface.co/datasets/silma-ai/arabic-broad-benchmark) evaluation [script](https://huggingface.co/datasets/silma-ai/arabic-broad-benchmark/blob/main/abb_eval.py) on the RAG QA category:

| Metric | Score |
|--------|-------|
| **RAG QA** | **5.39/10** |
| Test Questions | 41 |
| Focus | RAG QA Category |

**Performance Context:** comparison with publicly reported scores from the [ABL Leaderboard](https://huggingface.co/spaces/silma-ai/Arabic-LLM-Broad-Leaderboard) "🏅 Top by Skill → RAG QA" section:

| Model | Size | RAG QA Score | Difference |
|-------|------|--------------|------------|
| ibm-granite/granite-3.3-8b-instruct | 8B | 5.49 | -0.10 |
| openai/gpt-4.1-nano | Large | 5.41 | -0.02 |
| **This model (local eval)** | **1.2B** | **5.39** | **baseline** |
| meta-llama/Llama-3.1-8B-Instruct | 8B | 5.02 | +0.37 |
| microsoft/Phi-4-mini-instruct | Small | 4.93 | +0.46 |
| openai/gpt-oss-20b | 20B | 4.32 | +1.07 |
| inceptionai/jais-adapted-13b-chat | 13B | 4.10 | +1.29 |

*Difference = this model's local score (5.39) minus the listed model's reported score.*

**Key Achievement:** Competitive RAG QA performance with only **1.2B parameters**, significantly smaller than most of the models listed above, making it well suited for edge deployment and resource-constrained environments.

*Note: This is a local evaluation. The official leaderboard submission has not been made yet.*

## 📋 Model Description

This model specializes in extractive question answering for Arabic text with adaptive parameter allocation. It has been fine-tuned on the Arabic Reading Comprehension Dataset (ARCD) using AdaLoRA, which dynamically adjusts the rank of different layers during training for optimal performance.
**Key Features:**

- Optimized for Arabic extractive QA with adaptive rank allocation
- Context-based question answering with high faithfulness
- Balanced performance across multiple evaluation metrics
- Parameter-efficient fine-tuning via AdaLoRA

## 🎯 Intended Use

### Direct Use

- Arabic question answering systems
- RAG (Retrieval-Augmented Generation) applications for Arabic content
- Information extraction from Arabic documents
- Educational tools for Arabic reading comprehension
- Chatbots requiring grounded Arabic responses

### Downstream Use

Can be further fine-tuned for:

- Domain-specific QA (medical, legal, financial)
- Multi-turn conversational QA
- Cross-lingual QA systems
- Document analysis pipelines

### Out-of-Scope Use

**Not recommended for:**

- Open-domain question answering without context
- Creative writing or story generation
- Machine translation
- Code generation or technical programming tasks

## 🚀 How to Use

### Basic Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_id = "azeddinShr/LFM2-1.2B-RAG-ARABIC-AdaLoRA"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prepare input: Arabic context passage and question
context = "نيوم هو مشروع ضخم في شمال غرب السعودية بتكلفة 500 مليار دولار."
question = "ما هي تكلفة مشروع نيوم؟"
prompt = f"استخدم السياق التالي للإجابة على السؤال:\n\n{context}\n\nالسؤال: {question}"

# Generate answer (greedy decoding)
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=150,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

answer = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
print(answer)
# Output: 500 مليار دولار
```

## 📊 Training Details

### Training Data

- **Dataset:** [hsseinmz/arcd](https://huggingface.co/datasets/hsseinmz/arcd)
- **Training samples:** 693
- **Validation samples:** 351
- **Test samples:** 351
- **Language:** Modern Standard Arabic
- **Task:** Extractive question answering

### Training Procedure

**Fine-tuning method:** AdaLoRA (Adaptive Low-Rank Adaptation)

**Hyperparameters:**

- **Base model:** LiquidAI/LFM2-1.2B-RAG
- **Epochs:** 10
- **Batch size:** 16 effective (4 per device × 4 gradient accumulation steps)
- **Learning rate:** 2e-4
- **Optimizer:** AdamW (8-bit paged)
- **LR scheduler:** Cosine
- **Warmup steps:** 50
- **Weight decay:** 0.01

**AdaLoRA Configuration** (a PEFT `AdaLoraConfig` sketch mapping these values follows at the end of this section):

- **Initial rank (r):** 16
- **Target average rank (target_r):** 8
- **Initial adapter rank (init_r):** 12
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Pruning start step (tinit):** 10% of total steps
- **Pruning end step (tfinal):** 70% of total steps
- **Pruning frequency (deltaT):** 10 steps
- **Importance smoothing (beta1, beta2):** 0.85
- **Orthogonality regularization (orth_reg_weight):** 0.5
- **Target modules:** w1, w2, w3, q_proj, k_proj, v_proj, out_proj, in_proj

**Training infrastructure:**

- Precision: bfloat16
- Gradient checkpointing: enabled
- Framework: Hugging Face Transformers + PEFT + TRL
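For reference, below is a minimal sketch of how the configuration above could be expressed with the PEFT library. The original training script is not published with this card, so the `total_steps` value is an illustrative estimate derived from the sample count, batch size, and epochs listed above, and the listed `r=16` is not passed explicitly because PEFT's `AdaLoraConfig` governs the working ranks through `init_r` and `target_r`.

```python
from peft import AdaLoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

# Illustrative estimate: 693 samples / effective batch 16 ≈ 44 steps/epoch × 10 epochs
total_steps = 440

adalora_config = AdaLoraConfig(
    task_type=TaskType.CAUSAL_LM,
    init_r=12,                     # initial rank of each adapted matrix
    target_r=8,                    # target average rank after pruning
    lora_alpha=32,
    lora_dropout=0.05,
    tinit=int(0.1 * total_steps),  # start rank pruning at ~10% of training
    tfinal=int(0.7 * total_steps), # finish pruning at ~70% of training
    deltaT=10,                     # re-allocate ranks every 10 steps
    beta1=0.85,                    # smoothing of importance scores
    beta2=0.85,
    orth_reg_weight=0.5,           # orthogonality regularization strength
    total_step=total_steps,
    target_modules=["w1", "w2", "w3", "q_proj", "k_proj",
                    "v_proj", "out_proj", "in_proj"],
)

base_model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2-1.2B-RAG")
model = get_peft_model(base_model, adalora_config)
model.print_trainable_parameters()

# In a custom training loop (or a Trainer callback), AdaLoRA's rank allocator
# is advanced at each optimizer step via:
#   model.base_model.update_and_allocate(global_step)
```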
## 🔒 Ethical Considerations

- This model should not be used to generate misleading information or propaganda
- Outputs should be fact-checked for critical applications
- The model reflects statistical patterns in training data and may not represent complete or unbiased knowledge
- Users are responsible for ensuring appropriate use in their applications

## 🔬 Technical Details

### What is AdaLoRA?

AdaLoRA (Adaptive Low-Rank Adaptation) extends LoRA by dynamically allocating the parameter budget across different weight matrices based on their importance during training. This results in:

- More efficient parameter usage
- Better performance with fewer trainable parameters
- Automatic pruning of less important adaptations

### Advantages over standard LoRA

- Adaptive rank allocation based on importance scores
- Better performance-efficiency trade-off
- More stable training dynamics

## 📜 Citation

If you use this model in your research or application, please cite:

```bibtex
@misc{lfm2-arabic-qa-adalora,
  author       = {Azeddin Sahir},
  title        = {LFM2-1.2B-RAG Arabic (AdaLoRA Fine-tuned)},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/azeddinShr/lfm2-1.2b-arabic-qa-adalora}}
}
```

## 👍🏻 Acknowledgments

- **Base Model:** [LiquidAI](https://www.liquid.ai/) for LFM2-1.2B-RAG
- **Dataset:** [ARCD](https://huggingface.co/datasets/hsseinmz/arcd) - Arabic Reading Comprehension Dataset
- **Framework:** Hugging Face Transformers, PEFT, TRL
- **Method:** AdaLoRA by Zhang et al. (2023)

## 📄 License

This model is distributed under the base model's license (`lfm1.0`); see the `LICENSE` file referenced in the model card metadata and the [LiquidAI/LFM2-1.2B-RAG](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG) model card for details.

## 📧 Contact

For questions, issues, or collaboration opportunities, please open an issue in the model repository, contact via Hugging Face, or email me directly at [azdinsahir11@gmail.com](mailto:azdinsahir11@gmail.com).

---

**Note:** This is a research model. Always validate outputs for your specific use case and domain.