microsoft_DialoGPT-small_databricks-dolly-15k_sft v1.0

Model Description

This model is a supervised fine-tune (SFT) of microsoft/DialoGPT-small (~0.1B parameters) on the databricks-dolly-15k instruction-following dataset.

Base Model: microsoft/DialoGPT-small

Developed by: Mathhead

Training Details

Dataset

  • Training Data: databricks-dolly-15k
  • Dataset Size: {dataset_size}
  • Training Duration: {training_duration}
  • Hardware: {hardware}

Hyperparameters

  • max_length: 512
  • num_epochs: 1
  • batch_size: 4
  • eval_batch_size: 4
  • gradient_accumulation_steps: 4
  • warmup_steps: 100
  • learning_rate: 5e-05
  • weight_decay: 0.01
  • logging_steps: 10
  • eval_steps: 500
  • save_steps: 1000
  • save_total_limit: 3
  • report_to: ['wandb']
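Since gradients are accumulated over several micro-batches, the effective optimizer batch size implied by these settings can be computed directly (a minimal sketch; the dict simply mirrors the list above):

```python
# Hyperparameters from the list above, as a plain dict
# (key names mirror the card; this is just data, not a Trainer config).
hparams = {
    "max_length": 512,
    "num_epochs": 1,
    "batch_size": 4,
    "gradient_accumulation_steps": 4,
    "warmup_steps": 100,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
}

# Each optimizer step accumulates 4 micro-batches of 4 examples,
# so the effective batch size per update is 16.
effective_batch = hparams["batch_size"] * hparams["gradient_accumulation_steps"]
print(effective_batch)  # 16
```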

Evaluation Metrics

  • eval_loss: 8.534549713134766
  • eval_runtime: 12.9653
  • eval_samples_per_second: 7.713
  • eval_steps_per_second: 1.928
  • epoch: 1.0
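Since eval_loss is the mean per-token cross-entropy of a causal language model (in nats), the corresponding evaluation perplexity is its exponential. This is a quick derived check, not a number reported by the training run:

```python
import math

# eval_loss from the metrics above (mean token-level cross-entropy, in nats)
eval_loss = 8.534549713134766

# Perplexity is exp(loss) for cross-entropy measured in nats.
perplexity = math.exp(eval_loss)
print(round(perplexity, 1))
```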

Test Dataset: {test_dataset}

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("wxkkxw/microsoft_DialoGPT-small_databricks-dolly-15k_sft")
model = AutoModelForCausalLM.from_pretrained("wxkkxw/microsoft_DialoGPT-small_databricks-dolly-15k_sft")

# Generate text (the prompt and sampling values below are illustrative examples)
prompt = "What are the three primary colors?"  # example prompt
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,  # temperature/top_p only take effect when sampling is enabled
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
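databricks-dolly-15k pairs each instruction with an optional context passage, so inference-time prompts are usually formatted the same way as the training examples. The exact template used for this SFT run is not documented in the card, so the helper below is a hypothetical illustration only:

```python
def build_prompt(instruction: str, context: str = "") -> str:
    """Compose a Dolly-style prompt.

    Hypothetical template: the actual formatting used during
    fine-tuning is not documented in this model card.
    """
    if context:
        return f"Instruction: {instruction}\nContext: {context}\nResponse:"
    return f"Instruction: {instruction}\nResponse:"


prompt = build_prompt("Summarize the plot of Hamlet.")
print(prompt)
```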

Limitations and Biases

{limitations_list}

Citation

@misc{{citation_key},
  author = {{citation_authors}},
  title = {{citation_title}},
  year = {{citation_year}},
  publisher = {Hugging Face},
  url = {https://huggingface.co/wxkkxw/microsoft_DialoGPT-small_databricks-dolly-15k_sft}
}

Model Card Authors

Mathhead

Model Card Contact

For questions and feedback, please contact: {contact_email}


This model card was generated from a template on 2025-11-15.
