language:
  - en
license: mit
tags:
  - ernie
  - ernie-4.5
  - math
  - reasoning
  - unsloth
  - lora
  - fine-tuned
datasets:
  - nvidia/Nemotron-RL-math-OpenMathReasoning
base_model: unsloth/ERNIE-4.5-21B-A3B-PT
metrics:
  - loss
model-index:
  - name: naazimsnh02/ernie-45-math-finetuned
    results:
      - task:
          type: text-generation
          name: Mathematical Reasoning
        dataset:
          name: Nemotron-RL-math-OpenMathReasoning
          type: nvidia/Nemotron-RL-math-OpenMathReasoning
        metrics:
          - type: loss
            value: 0.6046
            name: Final Training Loss
          - type: loss
            value: 0.6114514470100403
            name: Final Validation Loss
          - type: loss
            value: 0.6114514470100403
            name: Best Validation Loss

ERNIE-4.5 Fine-tuned for Mathematical Reasoning

This model is a fine-tuned version of unsloth/ERNIE-4.5-21B-A3B-PT on the nvidia/Nemotron-RL-math-OpenMathReasoning dataset.

Model Description

This model specializes in solving complex mathematical problems, including:

  • Algebra (equations, factoring, systems)
  • Calculus (derivatives, integrals)
  • Geometry and trigonometry
  • Word problems requiring multi-step reasoning
  • Competition-level mathematics

Training Details

Training Data

  • Dataset: nvidia/Nemotron-RL-math-OpenMathReasoning
  • Training Samples: 7,600
  • Evaluation Samples: 400
  • Format: Conversational, using the ERNIE-4.5 chat template (see the preprocessing sketch after this list)
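
Each sample was converted to the ERNIE-4.5 chat format before training. The snippet below is a minimal preprocessing sketch, not the exact script used for this run; the split name and the column names (problem, solution) are assumptions about the dataset schema.

from datasets import load_dataset
from transformers import AutoTokenizer

# Load the base tokenizer so the ERNIE-4.5 chat template is available.
tokenizer = AutoTokenizer.from_pretrained("unsloth/ERNIE-4.5-21B-A3B-PT")

# The split and column names below are assumptions about the dataset schema.
dataset = load_dataset("nvidia/Nemotron-RL-math-OpenMathReasoning", split="train")

def to_chat_text(example):
    messages = [
        {"role": "user", "content": example["problem"]},        # assumed column
        {"role": "assistant", "content": example["solution"]},  # assumed column
    ]
    # Render with the chat template so training text matches inference prompts.
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_chat_text)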

Training Configuration

  • Base Model: unsloth/ERNIE-4.5-21B-A3B-PT (21B parameters)
  • Method: QLoRA (4-bit quantization + LoRA adapters; see the configuration sketch after this list)
  • LoRA Rank: 16
  • LoRA Alpha: 16
  • Trainable Parameters: 355,090,432 (3.11% of total)
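
As a rough configuration sketch (not the exact training script), the setup above corresponds to loading the base model in 4-bit and attaching LoRA adapters with Unsloth. The target_modules list below is a common choice and an assumption for this run.

from unsloth import FastModel

# Load the 21B base model with 4-bit quantized weights (QLoRA).
model, tokenizer = FastModel.from_pretrained(
    model_name="unsloth/ERNIE-4.5-21B-A3B-PT",
    max_seq_length=2048,
    load_in_4bit=True,
    full_finetuning=False,
)

# Attach LoRA adapters; rank and alpha match the values listed above.
# The target_modules list is a typical choice and an assumption for this run.
model = FastModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)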

Hyperparameters

  • Batch Size: 4 (per device)
  • Gradient Accumulation: 2
  • Effective Batch Size: 8
  • Learning Rate: 0.0002
  • LR Scheduler: Cosine with warmup
  • Warmup Ratio: 0.05
  • Training Steps: 707 (training stopped early based on validation loss)
  • Optimizer: AdamW 8-bit
  • Precision: BF16
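
These hyperparameters map onto a TRL SFT setup roughly as follows. This is a sketch that reuses the tokenizer, model, and dataset from the two sketches above; the held-out split, seed, and output directory are assumptions.

from trl import SFTConfig, SFTTrainer

# The card lists 7,600 training and 400 evaluation samples; exactly how the
# subset was drawn is not stated, so this selection is an assumption.
subset = dataset.shuffle(seed=42).select(range(8000))
split = subset.train_test_split(test_size=400, seed=42)

training_args = SFTConfig(
    output_dir="outputs",               # assumption
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,      # effective batch size 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    optim="adamw_8bit",
    bf16=True,
    eval_strategy="steps",
    eval_steps=100,                     # evaluation every 100 steps
    save_steps=100,                     # checkpoint every 100 steps
    logging_steps=100,
    report_to="none",
)

trainer = SFTTrainer(
    model=model,                        # LoRA-wrapped model from the sketch above
    processing_class=tokenizer,
    train_dataset=split["train"],
    eval_dataset=split["test"],
    args=training_args,
)
trainer.train()                         # this run was stopped early at step 707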

Training Results

  • Final Training Loss: 0.6046
  • Final Validation Loss: 0.6115
  • Best Validation Loss: 0.6115
  • Loss Improvement: 9.2% (from 0.6732 to 0.6115)
  • Training Time: 4.64 hours
  • GPU: NVIDIA A100-SXM4-40GB
  • Peak Memory: 19.375 GB / 39.494 GB (49.058%)

Framework

  • Unsloth: 2x faster training, 70% less memory
  • Modal: Serverless GPU infrastructure (40GB A100)
  • Transformers: 4.56.2
  • TRL: 0.22.2
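
Since behavior can shift between releases, the pinned versions above can be sanity-checked at load time:

import transformers
import trl

print(transformers.__version__)  # 4.56.2 was used for this fine-tune
print(trl.__version__)           # 0.22.2 was used for this fine-tune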

Usage

from unsloth import FastModel

# Load the fine-tuned model
model, tokenizer = FastModel.from_pretrained(
    model_name="naazimsnh02/ernie-45-math-finetuned",
    max_seq_length=2048,
    load_in_4bit=True,
    full_finetuning=False,
)

# Prepare for inference
FastModel.for_inference(model)

# Solve a math problem
messages = [{
    "role": "user",
    "content": "Solve the equation: 2x² + 5x - 3 = 0"
}]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", padding=True).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
)

# Decode only the newly generated tokens (otherwise the prompt is echoed back)
generated = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(generated, skip_special_tokens=True)
print(response)

Example Output

Input:

Solve the equation: x² + 5x + 6 = 0

Output:

To solve x² + 5x + 6 = 0, we can factor:

Find two numbers that multiply to 6 and add to 5:
2 and 3 work because 2 × 3 = 6 and 2 + 3 = 5

Factored form:
(x + 2)(x + 3) = 0

Setting each factor to zero:
x + 2 = 0  →  x = -2
x + 3 = 0  →  x = -3

Therefore: \boxed{x = -2, -3}
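
Solutions end with the answer in \boxed{...}, so the final result can be pulled out with a small helper (a convenience sketch, not part of the model or the training code):

import re

def extract_boxed(text):
    # Return the contents of the last \boxed{...} in the model output, if any.
    # Note: this simple pattern does not handle nested braces.
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None

print(extract_boxed(r"Therefore: \boxed{x = -2, -3}"))  # -> x = -2, -3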

Training Progress

| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 100  | 0.589         | 0.673           |
| 200  | 0.661         | 0.648           |
| 300  | 0.637         | 0.646           |
| 400  | 0.557         | 0.640           |
| 500  | 0.587         | 0.633           |
| 600  | 0.589         | 0.617           |
| 700  | 0.605         | 0.611           |

Training was stopped early at step 707; the final evaluation, at step 700, produced the best validation loss of 0.6115.

Training Infrastructure

  • Platform: Modal (modal.com; a minimal launch sketch follows this list)
  • GPU: 40GB A100
  • Training Duration: ~4.6 hours
  • Checkpointing: Every 100 steps
  • Evaluation: Every 100 steps
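
For reference, a Modal training job can be structured roughly as follows. The app name, image contents, and timeout below are assumptions; this is a sketch of the serverless setup rather than the exact deployment used.

import modal

# Hypothetical Modal app: the name, image contents, and timeout are assumptions.
image = modal.Image.debian_slim().pip_install("unsloth", "trl", "transformers", "datasets")
app = modal.App("ernie-math-finetune", image=image)

@app.function(gpu="A100", timeout=6 * 60 * 60)  # 40GB A100, generous timeout
def train():
    # Build the dataset, QLoRA model, and SFTTrainer as in the sketches above,
    # then call trainer.train(); checkpoints are written every 100 steps
    # via the save_steps setting.
    ...

@app.local_entrypoint()
def main():
    train.remote()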

Limitations

  • Optimized for mathematical reasoning; may not perform as well on other domains
  • Trained on English language problems only
  • Best results with problems similar to training data format
  • Requires a CUDA-capable GPU for inference when loaded in 4-bit

Citation

@misc{ernie45-math-2025,
  title={ERNIE-4.5 Fine-tuned for Mathematical Reasoning},
  author={naazimsnh02},
  year={2025},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/naazimsnh02/ernie-45-math-finetuned}}
}

Acknowledgments

  • ERNIE Team for the base model
  • Unsloth for optimization framework
  • NVIDIA for the Nemotron-RL dataset
  • Modal for GPU infrastructure
  • ERNIE AI Developer Challenge for the opportunity

License

MIT License - See repository for details


Trained with ❤️ using Unsloth and Modal