# KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1
A LoRA adapter fine-tuned on top of Kwaipilot/KAT-Dev (32B), designed for deep understanding of the Hyperswitch (Rust) payment orchestration codebase. It was trained with a 3-phase curriculum pipeline that progressively deepens the model's grasp of Rust patterns, PR changes, repository structure, and payment processing logic.
## Overview
This LoRA adapter was trained with a phased continued pre-training (CPT) strategy:
- **Phase 1 – Foundation:** Learns core repository structure, Rust syntax, basic modules, and Hyperswitch architectural patterns.
- **Phase 2 – Evolution:** Exposes the model to progressively complex components, multi-file interactions, workflows, and feature evolution.
- **Phase 3 – PR Mastery:** Specializes on real PR changes, diffs, refactors, and reasoning across multi-module changes.
The result is a Rust-aware, Hyperswitch-specialized LoRA adapter well suited to tasks such as the following (a loading sketch follows the list):
- Code generation
- Code explanation
- PR reasoning
- Diff summarization
- Documentation generation
- Rust workflow automation
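A minimal loading sketch with Transformers and PEFT, assuming the adapter repo ID from the citation below and the Kwaipilot/KAT-Dev base model; the dtype, device placement, and sample prompt are illustrative assumptions, not part of the card:

```python
# Minimal sketch: load the base model and attach this LoRA adapter with PEFT.
# Repo IDs follow the citation URL; dtype and device placement are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Kwaipilot/KAT-Dev"
adapter_id = "AdityaNarayan/KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA weights

prompt = "Explain how payment connectors are organized in the Hyperswitch repository."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```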
## Training Details
### LoRA Configuration

```yaml
r: 128
alpha: 256
dropout: 0.05
target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
```
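For reference, the same adapter settings expressed as a `peft.LoraConfig` sketch; `task_type` is an assumption, since it is not listed above:

```python
from peft import LoraConfig

# Mirrors the configuration listed above; task_type is assumed, not stated in the card.
lora_config = LoraConfig(
    r=128,
    lora_alpha=256,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```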
### Hyperparameters

```yaml
learning_rate: 1e-4
micro_batch_size: 1
gradient_accumulation_steps: 6
sequence_length: 32768
train/val split: 95/5
precision: bf16
```

### Hardware

```yaml
num_gpus: 8
gpu_name: NVIDIA H200
```
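A quick back-of-the-envelope check of the effective batch, assuming one micro-batch per GPU with data parallelism across all 8 GPUs and fully packed 32K-token sequences (both assumptions, not stated above):

```python
# Effective global batch per optimizer step under the assumptions in the lead-in.
micro_batch_size = 1
gradient_accumulation_steps = 6
num_gpus = 8
sequence_length = 32_768

sequences_per_step = micro_batch_size * gradient_accumulation_steps * num_gpus
tokens_per_step = sequences_per_step * sequence_length  # upper bound if sequences are fully packed

print(sequences_per_step)  # 48
print(tokens_per_step)     # 1572864
```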
## Phased Training Metrics
### Phase 1: Foundation

- Dataset: `phase1_foundation.jsonl`
- Epochs: 3
| Metric | Train | Eval |
|---|---|---|
| Loss | 0.2918 | 0.2434 |
| Entropy | 0.2052 | 0.2355 |
| Mean Token Accuracy | 0.9505 | 0.9331 |
| Perplexity | – | 1.2756 |
| Tokens | 8.88M | 8.88M |
### Phase 2: Evolution

- Dataset: `phase2_evolution.jsonl`
- Epochs: 2
| Metric | Train | Eval |
|---|---|---|
| Loss | 0.7255 | 0.7661 |
| Entropy | 0.5080 | 0.7210 |
| Mean Token Accuracy | 0.8641 | 0.8110 |
| Perplexity | – | 2.1514 |
| Tokens | 23.48M | 23.48M |
### Phase 3: PR Mastery

- Dataset: `phase3_pr_mastery.jsonl`
- Epochs: 2
| Metric | Train | Eval |
|---|---|---|
| Loss | 0.5378 | 0.5606 |
| Entropy | 0.4781 | 0.5254 |
| Mean Token Accuracy | 0.8749 | 0.8569 |
| Perplexity | – | 1.7516 |
| Tokens | 15.45M | 15.45M |
## Summary Across All Phases
```yaml
total_epochs: 7
total_phases: 3
initial_train_loss: 0.2918
final_train_loss: 0.5378
initial_eval_loss: 0.2434
final_eval_loss: 0.5606
initial_perplexity: 1.2756
final_perplexity: 1.7516
```
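The reported perplexities are consistent with the eval losses under the usual relation perplexity = exp(mean token cross-entropy), which can be checked directly:

```python
import math

# Perplexity is exp(mean token cross-entropy loss); values match the tables above.
print(round(math.exp(0.2434), 4))  # 1.2756 (Phase 1 eval)
print(round(math.exp(0.7661), 4))  # 2.1514 (Phase 2 eval)
print(round(math.exp(0.5606), 4))  # 1.7517, vs. 1.7516 reported (rounding)
```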
## Acknowledgments
- Kwaipilot Team – for the excellent KAT-Dev 32B base model
- Juspay / Hyperswitch – for the rich open-source Rust codebase
- Hugging Face – for PEFT, TRL, and Transformers
## Citation
```bibtex
@misc{katdev-hyperswitch-phasedlora-2025,
  title={KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1},
  author={Aditya Narayan},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/AdityaNarayan/KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1}
}
```