# cygu-llama-2-7b-sampling-watermark-distill-kth-shift256-ft-OpenMathInstruct-lora
This model is a LoRA fine-tuned version of [cygu/llama-2-7b-sampling-watermark-distill-kth-shift256](https://huggingface.co/cygu/llama-2-7b-sampling-watermark-distill-kth-shift256). The training dataset is not documented here, though the model name suggests OpenMathInstruct.
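Since this is a PEFT adapter, it is loaded on top of the base model rather than as a standalone checkpoint. The sketch below shows the standard `peft` loading pattern; the adapter repo id is an assumption, since the card gives only the model name and not the hub namespace.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "cygu/llama-2-7b-sampling-watermark-distill-kth-shift256"
# Assumption: the card does not state the full hub repo id of this adapter.
# Substitute the actual "<namespace>/..." id or a local adapter directory.
adapter_id = "<namespace>/cygu-llama-2-7b-sampling-watermark-distill-kth-shift256-ft-OpenMathInstruct-lora"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the LoRA adapter weights to the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("What is 17 * 24?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```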
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 32
- optimizer: Adafactor (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 2500
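Expressed as a `transformers` `TrainingArguments` configuration, the list above corresponds roughly to the sketch below. This is a reconstruction from the listed values, not the authors' actual training script; the output directory is a placeholder, and any setting not listed above is left at its default.

```python
from transformers import TrainingArguments

# Reconstruction of the listed hyperparameters; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="kth-shift256-ft-OpenMathInstruct-lora",  # hypothetical path
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=32,  # 1 x 32 = total train batch size 32
    optim="adafactor",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_steps=2500,
)
```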
### Training results
### Framework versions
- PEFT 0.14.0
- Transformers 4.46.3
- Pytorch 2.5.1.post303
- Datasets 3.2.0
- Tokenizers 0.20.3