KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1

A LoRA adapter fine-tuned on top of Kwaipilot/KAT-Dev (32B), designed for deep understanding of the Hyperswitch (Rust) payment-orchestration codebase. The adapter was trained with a three-phase curriculum pipeline that progressively deepens the model's grasp of Rust patterns, PR changes, repository structure, and payment-processing logic.


🚀 Overview

This LoRA adapter was trained with a phased continued pre-training (CPT) strategy:

Phase 1 – Foundation

Learns core repository structure, Rust syntax, basic modules, and Hyperswitch architectural patterns.

Phase 2 – Evolution

Exposes the model to increasingly complex components, multi-file interactions, workflows, and feature evolution.

Phase 3 – PR Mastery

Specializes in real PR changes, diffs, and refactors, with reasoning across multi-module changes.

The final result is a Rust-aware, Hyperswitch-specialized LoRA adapter well suited to the tasks below (a loading and inference sketch follows the list):

  • Code generation
  • Code explanation
  • PR reasoning
  • Diff summarization
  • Documentation generation
  • Rust workflow automation
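
A minimal sketch of loading the adapter on top of the base model, assuming the standard transformers + peft path; the repo IDs come from this card, while the prompt and generation settings are illustrative and should be adapted to your environment.

```python
# Minimal loading/inference sketch using transformers + peft.
# Loading the 32B base in bf16 requires substantial GPU memory;
# device_map="auto" shards it across available GPUs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Kwaipilot/KAT-Dev"
adapter_id = "AdityaNarayan/KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
    device_map="auto",
)
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "Explain how payment routing is organized in the Hyperswitch codebase."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```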

🔧 Training Details

LoRA Configuration

r: 128
alpha: 256
dropout: 0.05
target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
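
The same adapter settings expressed as a peft LoraConfig, as a sketch; fields not listed above (e.g. bias handling) fall back to library defaults, which this card does not specify.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=128,
    lora_alpha=256,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```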

Hyperparameters

learning_rate: 1e-4
micro_batch_size: 1
gradient_accumulation_steps: 6
sequence_length: 32768
train/val split: 95/5
precision: bf16
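
For illustration, the hyperparameters above can be expressed as transformers TrainingArguments; the actual training harness is not specified on this card, so argument names and the output path are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="kat-dev-hs-lora",     # hypothetical output path
    learning_rate=1e-4,
    per_device_train_batch_size=1,    # micro_batch_size
    gradient_accumulation_steps=6,
    bf16=True,
    num_train_epochs=3,               # per-phase epochs are 3 / 2 / 2 (see below)
    logging_steps=10,                 # illustrative
)
# The 32,768-token sequence length and the 95/5 train/val split are applied
# during dataset preparation rather than through TrainingArguments.
```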

Hardware

num_gpus: 8
gpu_name: NVIDIA H200

📊 Phased Training Metrics

Phase 1 – Foundation

Dataset: phase1_foundation.jsonl
Epochs: 3

| Metric              | Train  | Eval   |
|---------------------|--------|--------|
| Loss                | 0.2918 | 0.2434 |
| Entropy             | 0.2052 | 0.2355 |
| Mean Token Accuracy | 0.9505 | 0.9331 |
| Perplexity          | –      | 1.2756 |
| Tokens              | 8.88M  | 8.88M  |

Phase 2 – Evolution

Dataset: phase2_evolution.jsonl
Epochs: 2

| Metric              | Train  | Eval   |
|---------------------|--------|--------|
| Loss                | 0.7255 | 0.7661 |
| Entropy             | 0.5080 | 0.7210 |
| Mean Token Accuracy | 0.8641 | 0.8110 |
| Perplexity          | –      | 2.1514 |
| Tokens              | 23.48M | 23.48M |

Phase 3 – PR Mastery

Dataset: phase3_pr_mastery.jsonl
Epochs: 2

| Metric              | Train  | Eval   |
|---------------------|--------|--------|
| Loss                | 0.5378 | 0.5606 |
| Entropy             | 0.4781 | 0.5254 |
| Mean Token Accuracy | 0.8749 | 0.8569 |
| Perplexity          | –      | 1.7516 |
| Tokens              | 15.45M | 15.45M |

📈 Summary Across All Phases

total_epochs: 7
total_phases: 3

initial_train_loss: 0.2918
final_train_loss: 0.5378

initial_eval_loss: 0.2434
final_eval_loss: 0.5606

initial_perplexity: 1.2756
final_perplexity: 1.7516
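
The reported eval perplexities are consistent with exp(eval loss) for each phase, which can be checked directly:

```python
# Sanity check: reported eval perplexity ~= exp(eval_loss) per phase.
import math

for phase, eval_loss in [("Phase 1", 0.2434), ("Phase 2", 0.7661), ("Phase 3", 0.5606)]:
    print(phase, round(math.exp(eval_loss), 4))
# -> ~1.2756, ~2.1514, ~1.7517 (reported: 1.2756, 2.1514, 1.7516)
```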

🙏 Acknowledgments

  • Kwaipilot Team – for the excellent KAT-Dev 32B base model
  • Juspay / Hyperswitch – for the rich open-source Rust codebase
  • Hugging Face – for PEFT, TRL, and Transformers

📚 Citation

@misc{katdev-hyperswitch-phasedlora-2025,
  title={KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1},
  author={Aditya Narayan},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/AdityaNarayan/KAT-Dev-CPT-LoRA-HS-32K-maxToken-v1}
}