srkchowdary2000 committed
Commit 1da07b0 · verified · 1 Parent(s): a44c188

Update README.md

Files changed (1): README.md (+51 -3)
README.md CHANGED
---
license: apache-2.0
---

# **Model Summary: Mify-Coder-2.5B**

## **Overview**
Mify-Coder-2.5B-v0.1 is a **2.5B-parameter, code-focused language model**. It delivers **frontier-grade performance** on code generation, reasoning, and function-calling tasks while maintaining **compute efficiency and enterprise-grade safety**. In contrast to scale-first approaches, Mify-Coder demonstrates that smaller models can achieve competitive results through principled data curation and optimized training strategies.

**Developed by:** Infosys Ltd.

---
## **Architecture & Training**
- **Base Model:** Mify-2.5B
- **Training Phases:**
  - **Continual Pretraining (CPT):** Next-token prediction with Fill-in-the-Middle (FIM) for structural infilling (see the illustrative prompt sketch after this list).
  - **Supervised Fine-Tuning (SFT):** Instruction alignment for coding tasks, multi-turn dialogues, function calling, and safety.
- **Optimization:**
  - **BF16 mixed precision**, **Grouped Query Attention (GQA)**, and the **Distributed Fused Adam** optimizer.
  - Specialized tokenization with syntax markers and reasoning tokens for advanced behaviors.

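To make the FIM objective concrete, the sketch below shows how an infilling prompt is typically assembled. The sentinel token names are placeholders borrowed from other FIM-trained code models; Mify-Coder's actual special tokens are not specified in this card.

```python
# Minimal FIM prompt sketch. The sentinel tokens below are placeholders
# (StarCoder-style); Mify-Coder's real FIM tokens may be named differently.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange prefix and suffix so the model is asked to generate the missing middle."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prefix = "def average(xs):\n    total = sum(xs)\n"
suffix = "\n    return result\n"
print(build_fim_prompt(prefix, suffix))
# The model is expected to emit the missing span, e.g. "    result = total / len(xs)".
```
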
---

## **Performance Highlights**

| **Category** | **Benchmark** | **# Shots** | **Metric** | **Score** |
|--------------|---------------|-------------|------------|-----------|
| Code Gen     | MBPP          | 0           | pass@1     | 90.70%    |
| Code Gen     | MBPP+         | 0           | pass@1     | 88.89%    |
| Code Gen     | HumanEval     | 0           | pass@1     | 53.05%    |
| Code Gen     | HumanEval+    | 0           | pass@1     | 46.95%    |
| Code Gen     | NumpyEval     | 0           | pass@1     | 56.44%    |
| Code Gen     | PandasEval    | 0           | pass@1     | 53.47%    |

- Outperforms larger models on algorithmic reasoning tasks while remaining competitive on general coding and security-oriented capabilities.

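For reference, pass@1 in the table above is conventionally computed with the unbiased pass@k estimator from the HumanEval paper. The exact evaluation harness behind these scores is not documented here, so the sketch below only illustrates the standard formula.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples per problem, c of which pass all tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With one sample per problem (n=1, k=1), pass@1 is simply the fraction of
# problems whose generated solution passes its unit tests.
per_problem = [pass_at_k(1, c, 1) for c in (1, 1, 0, 1)]  # toy results, not real data
print(sum(per_problem) / len(per_problem))  # 0.75
```
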
---

## **Responsible AI & Safety**
- Integrated safety objectives during SFT.
- Balanced harmful/general sample ratio (1:4) for secure code generation and ethical language use.
- Validated against **Stanford AirBench** and **CyberSecEval** benchmarks.

---

## **Deployment & Future Work**
- **Quantization:** FP8 and AWQ for efficient inference, with deployment optimized via TensorRT-LLM (see the loading sketch below).
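
A minimal inference sketch, assuming the checkpoint is published in the standard Hugging Face Transformers format. The repository ID below is a placeholder, not a confirmed identifier; substitute the model ID shown on this repository's page.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository ID -- replace with the actual model ID for this card.
model_id = "your-org/Mify-Coder-2.5B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 training precision noted above
    device_map="auto",
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```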