LLaMA 2.7B Fine-tuned on Dolly
This is a LoRA adapter for LLaMA-2.7B, fine-tuned on the Databricks Dolly dataset for instruction-following tasks.
Model Details
- Base Model: LLaMA-2.7B
- Training Method: LoRA (Low-Rank Adaptation)
- Dataset: Databricks Dolly 15k
- Adapter Type: PEFT LoRA
LoRA Configuration
- Rank (r): 16
- Alpha: 32
- Dropout: 0.05
- Target Modules: Query and Value projection layers
- Trainable Parameters: ~8-16M (adapters only, <1% of base model)
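For reference, the settings above map onto a `peft` `LoraConfig` roughly as follows; the module names `q_proj` and `v_proj` are an assumption based on the standard LLaMA attention naming, not taken from the original training script:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Hyperparameters mirror the list above; q_proj/v_proj are assumed to be the
# query/value projection modules in the LLaMA attention blocks.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # roughly 8M trainable params, <1% of the base model
```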
Training Configuration
- Epochs: 5
- Batch Size: 4
- Learning Rate: 5e-04
- Gradient Accumulation: 1
- GPUs: 2
- Training Steps: 6810
- Optimizer: AdamW
- Weight Decay: 0.01
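A minimal sketch of how these hyperparameters could be wired into a `transformers` `Trainer` run; this is illustrative rather than the exact training script, and the warmup ratio and output path are assumptions:

```python
from transformers import TrainingArguments, Trainer

# Values mirror the Training Configuration list; batch size 4 per device on
# 2 GPUs with no gradient accumulation gives an effective batch size of 8.
training_args = TrainingArguments(
    output_dir="llama-dolly-lora",   # assumed output path
    num_train_epochs=5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    learning_rate=5e-4,
    weight_decay=0.01,
    optim="adamw_torch",
    lr_scheduler_type="cosine",      # cosine decay with warmup (see Training Details)
    warmup_ratio=0.03,               # assumed warmup fraction
    fp16=True,
    logging_steps=50,
)

# trainer = Trainer(model=model, args=training_args,
#                   train_dataset=train_dataset, data_collator=data_collator)
# trainer.train()
```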
Usage
Install the required packages:

```bash
pip install transformers peft torch
```
Then load and use the model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model (replace with actual 2.7B base model)
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # Update to 2.7B base if available
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/llama-2.7b-fine-tuned-on-dolly",
)

# Optional: Merge adapter for faster inference
# model = model.merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/llama-2.7b-fine-tuned-on-dolly")

# Generate a response for a Dolly-style instruction prompt
prompt = "Instruction: Write a short poem about AI.\n\nResponse:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_length=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
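If you want to deploy without the `peft` dependency at inference time, the adapter can be merged into the base weights and saved as a standalone checkpoint; the output path below is a placeholder:

```python
# Fold the LoRA weights into the base model and save a standalone checkpoint.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("llama-dolly-merged")  # placeholder output directory
tokenizer.save_pretrained("llama-dolly-merged")
```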
Key Benefits
- Efficiency: Only ~8-16M trainable parameters (vs billions in full fine-tuning)
- Storage: Small adapter files (~30-60MB vs multi-GB full models)
- Modularity: Can swap adapters on the same base model (see the sketch after this list)
- Quality: Performance competitive with full fine-tuning
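To illustrate the modularity point, here is a sketch of attaching several adapters to one base model and switching between them at runtime; the second adapter ID is purely hypothetical:

```python
from peft import PeftModel

# Attach this adapter under a name, then load a second adapter onto the same base.
model = PeftModel.from_pretrained(
    base_model,
    "YOUR_USERNAME/llama-2.7b-fine-tuned-on-dolly",
    adapter_name="dolly",
)
model.load_adapter("some-user/another-lora-adapter", adapter_name="other")  # hypothetical repo

model.set_adapter("dolly")    # route generation through the Dolly adapter
# model.set_adapter("other")  # ...or through the second adapter
```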
Limitations
- Requires the base LLaMA model; the adapter weights are not usable on their own
- Performance depends on base model quality
- Trained primarily on English instruction-following tasks
- May generate biased or incorrect responses
Training Details
This model was fine-tuned using:
- PEFT/LoRA: Parameter-efficient fine-tuning
- Training Data: 15k instruction-response pairs from Dolly (see the loading sketch after this list)
- Task: General instruction following and question answering
- Learning Rate Schedule: Cosine decay with warmup
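For context, the Dolly 15k data is available on the Hugging Face Hub as `databricks/databricks-dolly-15k`; a minimal loading and formatting sketch follows. The exact prompt template used during training is an assumption, mirroring the Instruction/Response style shown in the Usage section:

```python
from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

def format_example(example):
    # Assumed template: include the optional context field when present.
    context = f"\nContext: {example['context']}" if example["context"] else ""
    return {
        "text": f"Instruction: {example['instruction']}{context}\n\nResponse: {example['response']}"
    }

formatted = dolly.map(format_example)
print(formatted[0]["text"][:200])
```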
Citation
```bibtex
@inproceedings{lora,
  title     = {LoRA: Low-Rank Adaptation of Large Language Models},
  author    = {Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
  booktitle = {International Conference on Learning Representations},
  year      = {2022}
}
```
License
The LoRA adapter weights are released under the Apache 2.0 license. Note that the base LLaMA models are subject to Meta's own usage terms.