Model Summary: Mify-Coder-2.5B

Overview

Mify-Coder-2.5B-v1 is a breakthrough 2.5B-parameter code model fully designed, engineered, and trained at Infosys, built on the Mify-2.5B base model and trained on 4.2T tokens. Despite its compact size, Mify-Coder-2.5B-v1 sets a new benchmark for small language models, achieving performance parity with frontier open-source models in code generation and tool calling, exemplary performance on helpfulness and harmlessness safety metrics, and throughput that surpasses larger frontier models.

Developed by: Infosys Ltd.


Architecture & Training

  • Base Model: Mify-2.5B
  • Training Phases:
    • Continual Pretraining (CPT): Next-token prediction with Fill-in-the-Middle (FIM) for structural infilling (see the prompt sketch after this list).
    • Supervised Fine-Tuning (SFT): Instruction alignment for coding tasks, function calling, and safety.
  • Optimization:
    • BF16 mixed precision, Grouped Query Attention (GQA), and Distributed Fused Adam optimizer.
    • Specialized tokenization with syntax markers and reasoning tokens for advanced behaviors.
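
Mify-Coder's exact FIM sentinel tokens are not documented on this card, but a minimal sketch of the common prefix-suffix-middle (PSM) prompt layout illustrates how FIM infilling works. The <|fim_prefix|>, <|fim_suffix|>, and <|fim_middle|> token names below are assumptions borrowed from other FIM-trained code models, not confirmed Mify-Coder tokens.

```python
# Minimal FIM (Fill-in-the-Middle) prompt sketch. The sentinel token
# names are HYPOTHETICAL; Mify-Coder's actual special tokens may differ.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prefix-suffix-middle (PSM) infilling prompt.

    The model is expected to generate the missing middle span
    after the <|fim_middle|> sentinel.
    """
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

code_before = "def mean(xs):\n    total = "
code_after = "\n    return total / len(xs)\n"

prompt = build_fim_prompt(code_before, code_after)
print(prompt)  # the model should infill something like "sum(xs)"
```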

Performance Highlights

Category         Benchmark                    # Shots   Metric        Score
Code Gen         MBPP                         0         pass@1        91.21%
Code Gen         MBPP+                        0         pass@1        89.15%
Code Gen         HumanEval                    0         pass@1        53.66%
Code Gen         HumanEval+                   0         pass@1        48.78%
Code Gen         NumpyEval                    0         pass@1        56.44%
Code Gen         PandasEval                   0         pass@1        53.47%
Tool Use         BFCL v2                      0         overall acc   55.26%
Safety           AIR-Bench                    0         pass@1        67.32%
Secure Code Gen  CybersecEval4-Autocomplete   0         pass@1        78.91%
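
Most of the scores above use pass@1: the fraction of problems solved by a single sampled completion. For reference, here is a short sketch of the standard unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021); it is illustrative and not Infosys's evaluation harness.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: samples generated per problem
    c: samples that pass the unit tests
    k: sample budget being scored
    """
    if n - c < k:
        return 1.0
    # 1 - C(n-c, k) / C(n, k), computed as a numerically stable product
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

# Example: 200 samples per problem, 50 passing -> pass@1 estimate of 0.25
print(round(pass_at_k(200, 50, 1), 4))
```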

Responsible AI & Safety

  • Integrated safety objectives during SFT.
  • Balanced harmful-to-general sample ratio (1:4) for secure code generation and ethical language use (see the sketch after this list).
  • Validated against Stanford AIR-Bench and CybersecEval4-Autocomplete benchmarks.
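
The 1:4 ratio above describes the training data mixture; the exact pipeline is not published. The sketch below shows one plausible way such a mixture could be assembled for SFT, using hypothetical example records.

```python
import random

# HYPOTHETICAL data records; the actual Infosys pipeline is not published.
harmful_safety_examples = [{"prompt": f"unsafe-{i}", "response": "refusal"} for i in range(250)]
general_examples = [{"prompt": f"general-{i}", "response": "answer"} for i in range(5000)]

def mix_one_to_four(harmful, general, total):
    """Draw a mixture with a 1:4 harmful-to-general ratio."""
    n_harmful = total // 5          # 1 part safety-focused data
    n_general = total - n_harmful   # 4 parts general data
    batch = random.sample(harmful, n_harmful) + random.sample(general, n_general)
    random.shuffle(batch)
    return batch

mixture = mix_one_to_four(harmful_safety_examples, general_examples, total=1000)
print(sum("unsafe" in ex["prompt"] for ex in mixture))  # 200, i.e. a 1:4 split
```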

Deployment & Future Work

  • Quantization: The model is optimized for low latency, outperforming most sub-8B SLMs. Furthermore, quantized variants of Mify-Coder can be deployed and run for inference on standard desktop environments, eliminating the need for specialized hardware such as GPUs (an illustrative inference sketch follows this list).
  • Future work includes enhancing Mify-Coder with agentic coding competencies and scaling its context length. The model weights will be open-sourced early next year to accelerate research and real-world deployment.
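
Since the weights are not yet released, no official quantized artifact exists. The sketch below assumes a hypothetical GGUF export of Mify-Coder-2.5B and uses the llama-cpp-python bindings, one common way to run quantized models on an ordinary desktop CPU.

```python
# Illustrative only: assumes a HYPOTHETICAL quantized GGUF export of
# Mify-Coder-2.5B; no such artifact has been published yet.
from llama_cpp import Llama

llm = Llama(
    model_path="mify-coder-2.5b-q4_k_m.gguf",  # hypothetical file name
    n_ctx=4096,    # context window
    n_threads=8,   # CPU threads; tune for your machine
)

out = llm(
    "Write a Python function that reverses a linked list.",
    max_tokens=256,
    temperature=0.2,
)
print(out["choices"][0]["text"])
```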