microsoft_DialoGPT-small_databricks-dolly-15k_sft v1.0

Model Description

This model is a supervised fine-tune (SFT) of microsoft/DialoGPT-small (~0.1B parameters) on the databricks-dolly-15k instruction-following dataset.

Base Model: microsoft/DialoGPT-small

Developed by: Mathhead

Training Details

Dataset

  • Training Data: databricks-dolly-15k
  • Dataset Size: {dataset_size}
  • Training Duration: {training_duration}
  • Hardware: {hardware}

Hyperparameters

  • max_length: 512
  • num_epochs: 1
  • batch_size: 4
  • eval_batch_size: 4
  • gradient_accumulation_steps: 4
  • warmup_steps: 100
  • learning_rate: 5e-05
  • weight_decay: 0.01
  • logging_steps: 10
  • eval_steps: 500
  • save_steps: 1000
  • save_total_limit: 3
  • report_to: ['wandb']
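Since gradients are accumulated over several micro-batches, the effective optimizer batch size implied by these settings can be computed directly (a minimal sketch; the dict simply mirrors the list above):

```python
# Hyperparameters from the list above, as a plain dict
# (key names mirror the card; this is just data, not a Trainer config).
hparams = {
    "max_length": 512,
    "num_epochs": 1,
    "batch_size": 4,
    "gradient_accumulation_steps": 4,
    "warmup_steps": 100,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
}

# Each optimizer step accumulates 4 micro-batches of 4 examples,
# so the effective batch size per update is 16.
effective_batch = hparams["batch_size"] * hparams["gradient_accumulation_steps"]
print(effective_batch)  # 16
```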

Evaluation Metrics

  • eval_loss: 8.534549713134766
  • eval_runtime: 12.9653
  • eval_samples_per_second: 7.713
  • eval_steps_per_second: 1.928
  • epoch: 1.0
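Since eval_loss is the mean per-token cross-entropy of a causal language model (in nats), the corresponding evaluation perplexity is its exponential. This is a quick derived check, not a number reported by the training run:

```python
import math

# eval_loss from the metrics above (mean token-level cross-entropy, in nats)
eval_loss = 8.534549713134766

# Perplexity is exp(loss) for cross-entropy measured in nats.
perplexity = math.exp(eval_loss)
print(round(perplexity, 1))
```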

Test Dataset: {test_dataset}

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("wxkkxw/microsoft_DialoGPT-small_databricks-dolly-15k_sft")
model = AutoModelForCausalLM.from_pretrained("wxkkxw/microsoft_DialoGPT-small_databricks-dolly-15k_sft")

# Generate text (the prompt and sampling values below are illustrative examples)
prompt = "What are the three primary colors?"  # example prompt
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,  # temperature/top_p only take effect when sampling is enabled
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
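databricks-dolly-15k pairs each instruction with an optional context passage, so inference-time prompts are usually formatted the same way as the training examples. The exact template used for this SFT run is not documented in the card, so the helper below is a hypothetical illustration only:

```python
def build_prompt(instruction: str, context: str = "") -> str:
    """Compose a Dolly-style prompt.

    Hypothetical template: the actual formatting used during
    fine-tuning is not documented in this model card.
    """
    if context:
        return f"Instruction: {instruction}\nContext: {context}\nResponse:"
    return f"Instruction: {instruction}\nResponse:"


prompt = build_prompt("Summarize the plot of Hamlet.")
print(prompt)
```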

Limitations and Biases

{limitations_list}

Citation

@misc{{citation_key},
  author = {{citation_authors}},
  title = {{citation_title}},
  year = {{citation_year}},
  publisher = {Hugging Face},
  url = {https://huggingface.co/wxkkxw/microsoft_DialoGPT-small_databricks-dolly-15k_sft}
}

Model Card Authors

Mathhead

Model Card Contact

For questions and feedback, please contact: {contact_email}


This model card was generated from a template on 2025-11-15.
