---
library_name: transformers
tags:
- llama-3.2
- causal-lm
- code
- python
- peft
- qlora
---

# Model Card for llama32-1b-python-docstrings-qlora

A LoRA adapter, fine-tuned with QLoRA on top of `meta-llama/Llama-3.2-1B-Instruct`, for generating concise one-line Python docstrings from function bodies.

## Model Details

### Model Description

- **Developed by:** Abdullah Al-Housni
- **Model type:** Causal language model with LoRA/QLoRA adapters
- **Language(s):** Python code as input, English docstrings as output
- **License:** Same as `meta-llama/Llama-3.2-1B-Instruct` (Meta Llama 3.2 Community License)
- **Finetuned from model:** `meta-llama/Llama-3.2-1B-Instruct`

The model is trained to take a Python function definition and generate a concise, one-line docstring describing what the function does.

## Uses

### Direct Use

- Automatically generate one-line Python docstrings for functions.
- Improve or bootstrap documentation in Python codebases.
- Educational use for learning how to summarize code behavior.

Typical usage pattern:

- Input: Python function body (source code).
- Output: Single-sentence English description suitable as a docstring.

### Out-of-Scope Use

- Generating full, multi-paragraph API documentation.
- Security auditing or correctness guarantees for code.
- Use outside Python (e.g., other programming languages) without additional fine-tuning.
- Any safety-critical application where incorrect summaries could cause harm.

## Bias, Risks, and Limitations

- The model can produce **incorrect or incomplete summaries**, especially for complex or ambiguous functions.
- It may imitate noisy or low-quality patterns from the training data (e.g., overly short or cryptic docstrings).
- It does **not** understand project-specific context, invariants, or business logic; outputs should be reviewed by a human developer.

### Recommendations

- Use the model as an **assistive tool**, not an authoritative source.
- Always review and edit generated docstrings before committing to production code.
- For non-Python or highly domain-specific code, consider additional fine-tuning on in-domain examples.
## How to Get Started with the Model

Example with 🤗 Transformers and PEFT (LoRA adapter):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "Abdul1102/llama32-1b-python-docstrings-qlora"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

def make_prompt(code: str) -> str:
    return f'Write a one-line Python docstring for this function:\n\n{code}\n\n"""'

code = "def add(a, b):\n    return a + b"
inputs = tokenizer(make_prompt(code), return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)

# Decode only the newly generated tokens so the prompt is not echoed back.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

## Training Details

### Training Data

- Dataset: Python subset of CodeSearchNet (`Nan-Do/code-search-net-python`)
- Inputs: `code` column (full Python function body)
- Targets: First non-empty line of `docstring`
- A filtered subset of ~1,000–2,000 examples was used for efficient QLoRA fine-tuning (a data-preparation sketch is included in the appendix below)

### Training Procedure

- Objective: Causal language modeling (predict the docstring continuation)
- Method: QLoRA (4-bit quantized base model with LoRA adapters)
- Precision: 4-bit quantized weights, bf16 compute
- Epochs: 1
- Max sequence length: 256–512 tokens

#### Training Hyperparameters

- Learning rate: ~2e-4 (adapter weights only)
- Epochs: 1
- Optimizer: AdamW via Hugging Face `Trainer`
- LoRA rank: 16
- LoRA alpha: 32
- LoRA dropout: 0.05

A configuration sketch reflecting these settings is included in the appendix below.

---

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

Held-out test split from the same CodeSearchNet Python dataset, using the identical `code` → one-line docstring mapping.

#### Factors

- Function size and complexity
- Variety in docstring writing styles
- Presence of short or noisy docstrings

#### Metrics

- BLEU (sacreBLEU): strict n-gram overlap, sensitive to paraphrasing
- ROUGE (ROUGE-1 / ROUGE-2 / ROUGE-L): better suited to short summaries

A scoring sketch using these metrics is included in the appendix below.

### Results

Approximate performance on ~50 held-out samples:

- BLEU: ~12.4
- ROUGE-1: ~0.78
- ROUGE-2: ~0.74
- ROUGE-L: ~0.78

#### Summary

The model frequently reproduces or closely paraphrases the reference docstring. Occasional failures include echoing part of the prompt or returning an empty string. This is strong performance for a 1B model trained briefly on a small dataset.

---

## Model Examination

Not applicable.

---

## Environmental Impact

- Hardware Type: Google Colab GPU (T4/L4)
- Hours Used: ~0.5–1 hour total
- Cloud Provider: Google Colab
- Compute Region: US
- Carbon Emitted: Not estimated (very low due to minimal training time)

---

## Technical Specifications

### Model Architecture and Objective

- Base model: Llama 3.2 1B Instruct
- Architecture: Decoder-only transformer
- Objective: Causal language modeling
- Parameter-efficient fine-tuning using LoRA (rank 16)

### Compute Infrastructure

#### Hardware

Single Google Colab GPU (T4 or L4)

#### Software

- Python
- PyTorch
- Hugging Face Transformers
- PEFT
- bitsandbytes
- Datasets

---

## Citation

Not applicable.

---

## Glossary

Not applicable.

---

## More Information

See the Hugging Face model page for updates or usage examples.

---

## Model Card Authors

Abdullah Al-Housni

---

## Model Card Contact

Available through the Hugging Face model repository.
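
---

## Appendix: Reference Sketches

The snippets below are unofficial sketches that illustrate the setup described in this card. The exact training and evaluation scripts are not published here, so variable names, split names, filtering details, and any parameter not listed above are assumptions.

The *Training Data* section maps each example's `code` column to the first non-empty line of its `docstring`. A minimal data-preparation sketch with 🤗 Datasets (the subset size of 2,000 is an assumption within the ~1,000–2,000 range stated above):

```python
from datasets import load_dataset

# Python subset of CodeSearchNet, as named in the Training Data section.
# The "train" split name is an assumption.
ds = load_dataset("Nan-Do/code-search-net-python", split="train")

def add_target(example):
    # Target = first non-empty line of the original docstring.
    lines = (line.strip() for line in (example["docstring"] or "").splitlines())
    example["target"] = next((line for line in lines if line), "")
    return example

ds = ds.map(add_target)
ds = ds.filter(lambda ex: ex["target"] != "")     # drop examples without a usable docstring
small = ds.shuffle(seed=42).select(range(2_000))  # ~1,000-2,000 examples; 2,000 is assumed here
```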
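
The QLoRA setup under *Training Procedure* and *Training Hyperparameters* could be configured roughly as follows with `transformers`, `peft`, and `bitsandbytes`. The LoRA rank/alpha/dropout, learning rate, epoch count, and bf16 compute come from this card; the NF4 quantization type, target modules, and batch size are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model_id = "meta-llama/Llama-3.2-1B-Instruct"

# 4-bit quantized base weights with bf16 compute, as stated above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumption: quant type not specified in the card
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA hyperparameters from the card: rank 16, alpha 32, dropout 0.05.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="llama32-1b-python-docstrings-qlora",
    learning_rate=2e-4,                 # adapter weights only
    num_train_epochs=1,
    per_device_train_batch_size=4,      # assumption: batch size not stated in the card
    bf16=True,
    optim="adamw_torch",                # AdamW via the Hugging Face Trainer
)
```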
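
The BLEU and ROUGE numbers under *Evaluation* could be computed along these lines with the 🤗 `evaluate` library; the predictions and references below are placeholders for real model outputs and held-out reference docstrings:

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")
rouge = evaluate.load("rouge")

# Placeholders: in practice these come from model.generate() on the held-out split.
predictions = ["Return the sum of a and b."]
references = ["Return the sum of two numbers."]

bleu = sacrebleu.compute(predictions=predictions, references=[[r] for r in references])
rouge_scores = rouge.compute(predictions=predictions, references=references)

print(f"BLEU: {bleu['score']:.1f}")              # sacreBLEU reports a 0-100 scale
print(f"ROUGE-1: {rouge_scores['rouge1']:.2f}")
print(f"ROUGE-2: {rouge_scores['rouge2']:.2f}")
print(f"ROUGE-L: {rouge_scores['rougeL']:.2f}")
```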