InfosysEnterprise/Mify-Coder-2.5B

srkchowdary2000 committed on Nov 28, 2025

Commit 1da07b0 (verified) · 1 parent: a44c188

Update README.md

Files changed (1)

README.md +51 -3

@@ -1,3 +1,51 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+---
+# **Model Summary: Mify-Coder-2.5B**
+## **Overview**
+Mify-Coder-2.5B-v0.1 is a **2.5B-parameter code-focused language model**. It delivers **frontier-grade performance** in code generation, reasoning, and function calling tasks while maintaining **compute efficiency and enterprise-grade safety**. Unlike scale-first paradigms, Mify-Coder demonstrates that smaller models can achieve competitive results through principled data curation and optimized training strategies.
+
+**Developed by**: Infosys Ltd.
+
+---
+## **Architecture & Training**
+- **Base Model:** Mify-2.5B
+- **Training Phases:**
+  - **Continual Pretraining (CPT):** Next-token prediction with Fill-in-the-Middle (FIM) for structural infilling.
+  - **Supervised Fine-Tuning (SFT):** Instruction alignment for coding tasks, multi-turn dialogues, function calling, and safety.
+- **Optimization:**
+  - **BF16 mixed precision**, **Grouped-Query Attention (GQA)**, and the **Distributed Fused Adam** optimizer.
+  - Specialized tokenization with syntax markers and reasoning tokens for advanced behaviors.
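
The Fill-in-the-Middle objective mentioned under Continual Pretraining can be illustrated with a minimal sketch. The card does not name its sentinel tokens, split strategy, or ordering, so the `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` strings, the character-level random split, and the prefix-suffix-middle layout below are all assumptions chosen for illustration, not the model's actual format.

```python
import random

# Hypothetical sentinel strings; the model card does not list its real FIM tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def to_fim_example(code: str, rng: random.Random) -> str:
    """Cut code at two random character offsets and emit prefix, suffix, middle.

    Training on this layout with plain next-token prediction teaches the model
    to produce the middle span given the text on both sides of it.
    """
    lo, hi = sorted((rng.randint(0, len(code)), rng.randint(0, len(code))))
    prefix, middle, suffix = code[:lo], code[lo:hi], code[hi:]
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}{middle}"


def from_fim_example(example: str) -> str:
    """Invert to_fim_example, assuming the code itself contains no sentinels."""
    rest = example[len(FIM_PREFIX):]
    prefix, rest = rest.split(FIM_SUFFIX, 1)
    suffix, middle = rest.split(FIM_MIDDLE, 1)
    return prefix + middle + suffix
```

At inference time the same layout would be used with the middle left empty, so that the model's continuation fills the gap between the supplied prefix and suffix.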
+---
+## **Performance Highlights**
+| **Category** | **Benchmark** | **# Shots** | **Metric** | **Score** |
+|--------------|---------------|-------------|------------|-----------|
+| Code Gen     | MBPP          | 0           | pass@1     | 90.70%    |
+| Code Gen     | MBPP+         | 0           | pass@1     | 88.89%    |
+| Code Gen     | HumanEval     | 0           | pass@1     | 53.05%    |
+| Code Gen     | HumanEval+    | 0           | pass@1     | 46.95%    |
+| Code Gen     | NumpyEval     | 0           | pass@1     | 56.44%    |
+| Code Gen     | PandasEval    | 0           | pass@1     | 53.47%    |
+- Outperforms larger models on algorithmic reasoning tasks while maintaining competitive general coding and security-oriented capabilities.
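
For readers unfamiliar with the pass@1 metric in the table, the sketch below shows the unbiased pass@k estimator commonly used with HumanEval-style benchmarks. The card does not state how many samples per problem were drawn or which decoding settings were used, so the function names and the example inputs here are illustrative assumptions rather than the authors' evaluation harness.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: 1 - C(n - c, k) / C(n, k).

    n: samples generated, c: samples that pass all tests, k: budget evaluated.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset contains at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_1(results: list[tuple[int, int]]) -> float:
    """Mean pass@1 over problems; each item is (samples generated, samples passing)."""
    return sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to c / n per problem, so with a single greedy sample per problem the benchmark score is simply the fraction of problems solved.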
+---
+## **Responsible AI & Safety**
+- Integrated safety objectives during SFT.
+- Balanced harmful/general sample ratio (1:4) for secure code generation and ethical language use.
+- Validated against the **Stanford AIR-Bench** and **CyberSecEval** benchmarks.
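
One way to read the 1:4 harmful/general ratio above is as a sampling constraint on the SFT mixture: one safety-oriented example for every four general examples. The sketch below assumes that reading; the function name `mix_at_ratio`, the subsampling strategy, and the seed are hypothetical and not taken from the card.

```python
import random


def mix_at_ratio(harmful: list, general: list, ratio=(1, 4), seed=0) -> list:
    """Subsample both pools so the mix holds ratio[0] harmful per ratio[1] general."""
    h, g = ratio
    # Largest number of complete (h harmful + g general) groups both pools allow.
    units = min(len(harmful) // h, len(general) // g)
    rng = random.Random(seed)
    mixed = rng.sample(harmful, units * h) + rng.sample(general, units * g)
    rng.shuffle(mixed)  # interleave so batches see both kinds of example
    return mixed
```

Subsampling the larger pool, rather than repeating examples from the smaller one, keeps every example unique at the cost of discarding data; upsampling would be an equally plausible choice.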
+---
+## **Deployment & Future Work**
+- **Quantization:** FP8 and AWQ for efficient inference; optimized with TensorRT-LLM.
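
To give a rough sense of what the quantization options mean for deployment, the arithmetic below estimates weight storage for a 2.5B-parameter model at different precisions. It counts weights only and ignores activations, the KV cache, and the per-group scales that AWQ stores alongside the weights. The 4-bit width for AWQ is an assumption (it is the common configuration), since the card does not state a bit width.

```python
PARAMS = 2.5e9  # parameter count stated in the model card

# Bytes per weight; the AWQ entry assumes 4-bit weights, which the card does not confirm.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "awq_int4": 0.5}


def weight_gb(params: float, fmt: str) -> float:
    """Approximate weight storage in gigabytes (10**9 bytes) for a given format."""
    return params * BYTES_PER_PARAM[fmt] / 1e9


if __name__ == "__main__":
    for fmt in BYTES_PER_PARAM:
        print(f"{fmt}: {weight_gb(PARAMS, fmt):.2f} GB")
```

Under these assumptions the weights take about 5 GB in BF16, 2.5 GB in FP8, and 1.25 GB at 4 bits.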