---
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- ar
base_model: LiquidAI/LFM2-1.2B-RAG
tags:
- arabic
- rag
- question-answering
- fine-tuned
- adalora
- liquid
- extractive-qa
datasets:
- hsseinmz/arcd
library_name: transformers
pipeline_tag: question-answering
---

# LFM2-1.2B-RAG Arabic (AdaLoRA Fine-tuned)

A fine-tuned version of [LiquidAI/LFM2-1.2B-RAG](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG) for Arabic reading comprehension and question answering, trained with the **AdaLoRA (Adaptive Low-Rank Adaptation)** technique.

## 🏆 Performance

### Arabic Broad Benchmark (ABB) - Local Evaluation

Evaluated using the official [ABB benchmark](https://huggingface.co/datasets/silma-ai/arabic-broad-benchmark) evaluation [script](https://huggingface.co/datasets/silma-ai/arabic-broad-benchmark/blob/main/abb_eval.py) on the RAG QA category:

| Metric | Score |
|--------|-------|
| **RAG QA** | **5.39/10** |
| Test Questions | 41 |
| Focus | RAG QA Category |

**Performance Context:** comparison with publicly reported scores from the [ABL Leaderboard](https://huggingface.co/spaces/silma-ai/Arabic-LLM-Broad-Leaderboard) "🏅 Top by Skill → RAG QA" section:

| Model | Size | RAG QA Score | Difference |
|-------|------|--------------|------------|
| ibm-granite/granite-3.3-8b-instruct | 8B | 5.49 | -0.10 |
| openai/gpt-4.1-nano | Large | 5.41 | -0.02 |
| **This model (local eval)** | **1.2B** | **5.39** | **baseline** |
| meta-llama/Llama-3.1-8B-Instruct | 8B | 5.02 | +0.37 |
| microsoft/Phi-4-mini-instruct | Small | 4.93 | +0.46 |
| openai/gpt-oss-20b | 20B | 4.32 | +1.07 |
| inceptionai/jais-adapted-13b-chat | 13B | 4.10 | +1.29 |

*Difference = this model's local score (5.39) minus the listed model's reported score.*

**Key Achievement:** Competitive RAG QA performance with only **1.2B parameters**, significantly smaller than most of the models listed above, making it well suited for edge deployment and resource-constrained environments.

*Note: This is a local evaluation. The official leaderboard submission has not been made yet.*

## 📋 Model Description

This model specializes in extractive question answering for Arabic text with adaptive parameter allocation. It has been fine-tuned on the Arabic Reading Comprehension Dataset (ARCD) using AdaLoRA, which dynamically adjusts the rank of different layers during training for optimal performance.
**Key Features:**

- Optimized for Arabic extractive QA with adaptive rank allocation
- Context-based question answering with high faithfulness
- Balanced performance across multiple evaluation metrics
- Parameter-efficient fine-tuning via AdaLoRA

## 🎯 Intended Use

### Direct Use

- Arabic question answering systems
- RAG (Retrieval-Augmented Generation) applications for Arabic content
- Information extraction from Arabic documents
- Educational tools for Arabic reading comprehension
- Chatbots requiring grounded Arabic responses

### Downstream Use

Can be further fine-tuned for:

- Domain-specific QA (medical, legal, financial)
- Multi-turn conversational QA
- Cross-lingual QA systems
- Document analysis pipelines

### Out-of-Scope Use

**Not recommended for:**

- Open-domain question answering without context
- Creative writing or story generation
- Machine translation
- Code generation or technical programming tasks

## 🚀 How to Use

### Basic Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_id = "azeddinShr/LFM2-1.2B-RAG-ARABIC-AdaLoRA"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prepare input: Arabic context passage and question
context = "نيوم هو مشروع ضخم في شمال غرب السعودية بتكلفة 500 مليار دولار."
question = "ما هي تكلفة مشروع نيوم؟"
prompt = f"استخدم السياق التالي للإجابة على السؤال:\n\n{context}\n\nالسؤال: {question}"

# Generate answer (greedy decoding)
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_new_tokens=150,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

answer = tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True)
print(answer)
# Output: 500 مليار دولار
```

## 📊 Training Details

### Training Data

- **Dataset:** [hsseinmz/arcd](https://huggingface.co/datasets/hsseinmz/arcd)
- **Training samples:** 693
- **Validation samples:** 351
- **Test samples:** 351
- **Language:** Modern Standard Arabic
- **Task:** Extractive question answering

### Training Procedure

**Fine-tuning method:** AdaLoRA (Adaptive Low-Rank Adaptation)

**Hyperparameters:**

- **Base model:** LiquidAI/LFM2-1.2B-RAG
- **Epochs:** 10
- **Batch size:** 16 effective (4 per device × 4 gradient accumulation steps)
- **Learning rate:** 2e-4
- **Optimizer:** AdamW (8-bit paged)
- **LR scheduler:** Cosine
- **Warmup steps:** 50
- **Weight decay:** 0.01

**AdaLoRA Configuration** (a PEFT `AdaLoraConfig` sketch mapping these values follows at the end of this section):

- **Initial rank (r):** 16
- **Target average rank (target_r):** 8
- **Initial adapter rank (init_r):** 12
- **LoRA alpha:** 32
- **LoRA dropout:** 0.05
- **Pruning start step (tinit):** 10% of total steps
- **Pruning end step (tfinal):** 70% of total steps
- **Pruning frequency (deltaT):** 10 steps
- **Importance smoothing (beta1, beta2):** 0.85
- **Orthogonality regularization (orth_reg_weight):** 0.5
- **Target modules:** w1, w2, w3, q_proj, k_proj, v_proj, out_proj, in_proj

**Training infrastructure:**

- Precision: bfloat16
- Gradient checkpointing: enabled
- Framework: Hugging Face Transformers + PEFT + TRL
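For reference, below is a minimal sketch of how the configuration above could be expressed with the PEFT library. The original training script is not published with this card, so the `total_steps` value is an illustrative estimate derived from the sample count, batch size, and epochs listed above, and the listed `r=16` is not passed explicitly because PEFT's `AdaLoraConfig` governs the working ranks through `init_r` and `target_r`.

```python
from peft import AdaLoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

# Illustrative estimate: 693 samples / effective batch 16 ≈ 44 steps/epoch × 10 epochs
total_steps = 440

adalora_config = AdaLoraConfig(
    task_type=TaskType.CAUSAL_LM,
    init_r=12,                     # initial rank of each adapted matrix
    target_r=8,                    # target average rank after pruning
    lora_alpha=32,
    lora_dropout=0.05,
    tinit=int(0.1 * total_steps),  # start rank pruning at ~10% of training
    tfinal=int(0.7 * total_steps), # finish pruning at ~70% of training
    deltaT=10,                     # re-allocate ranks every 10 steps
    beta1=0.85,                    # smoothing of importance scores
    beta2=0.85,
    orth_reg_weight=0.5,           # orthogonality regularization strength
    total_step=total_steps,
    target_modules=["w1", "w2", "w3", "q_proj", "k_proj",
                    "v_proj", "out_proj", "in_proj"],
)

base_model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2-1.2B-RAG")
model = get_peft_model(base_model, adalora_config)
model.print_trainable_parameters()

# In a custom training loop (or a Trainer callback), AdaLoRA's rank allocator
# is advanced at each optimizer step via:
#   model.base_model.update_and_allocate(global_step)
```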
## 🔒 Ethical Considerations

- This model should not be used to generate misleading information or propaganda
- Outputs should be fact-checked for critical applications
- The model reflects statistical patterns in training data and may not represent complete or unbiased knowledge
- Users are responsible for ensuring appropriate use in their applications

## 🔬 Technical Details

### What is AdaLoRA?

AdaLoRA (Adaptive Low-Rank Adaptation) extends LoRA by dynamically allocating the parameter budget across different weight matrices based on their importance during training. This results in:

- More efficient parameter usage
- Better performance with fewer trainable parameters
- Automatic pruning of less important adaptations

### Advantages over standard LoRA

- Adaptive rank allocation based on importance scores
- Better performance-efficiency trade-off
- More stable training dynamics

## 📜 Citation

If you use this model in your research or application, please cite:

```bibtex
@misc{lfm2-arabic-qa-adalora,
  author       = {Azeddin Sahir},
  title        = {LFM2-1.2B-RAG Arabic (AdaLoRA Fine-tuned)},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/azeddinShr/lfm2-1.2b-arabic-qa-adalora}}
}
```

## 👍🏻 Acknowledgments

- **Base Model:** [LiquidAI](https://www.liquid.ai/) for LFM2-1.2B-RAG
- **Dataset:** [ARCD](https://huggingface.co/datasets/hsseinmz/arcd) - Arabic Reading Comprehension Dataset
- **Framework:** Hugging Face Transformers, PEFT, TRL
- **Method:** AdaLoRA by Zhang et al. (2023)

## 📄 License

This model is distributed under the base model's license (`lfm1.0`); see the `LICENSE` file referenced in the model card metadata and the [LiquidAI/LFM2-1.2B-RAG](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG) model card for details.

## 📧 Contact

For questions, issues, or collaboration opportunities, please open an issue in the model repository, contact via Hugging Face, or email me directly at [azdinsahir11@gmail.com](mailto:azdinsahir11@gmail.com).

---

**Note:** This is a research model. Always validate outputs for your specific use case and domain.