Cortana / README.md
Peter W
---
license: apache-2.0
---
This is a fine-tune of the VibeVoice 7B model. Inference requires 21.9 GB of VRAM when run through OpenVoiceLab.
<br>
Fine-tuning was done using the code available here: <br>
https://github.com/voicepowered-ai/VibeVoice-finetuning
<br><br>
The fine-tuning dataset consisted of 764 audio files from Halo 1, Halo 2, and Halo 3, available via the links below: <br>
Halo 1: https://sounds.spriters-resource.com/xbox/halocombatevolved/asset/413569/ <br>
Halo 2: https://sounds.spriters-resource.com/xbox/halo2/asset/436393/?source=genre <br>
Halo 3: https://sounds.spriters-resource.com/xbox_360/halo3/asset/405404/ <br>
<br><br>
The fine-tuning parameters were: <br>
batch_size = 1 <br>
drop_rate = 0.2 <br>
grad_accum = 1 <br>
lr = 2.5e-5 <br>
lora_r = 128 <br>
lora_alpha = 512 <br>
epochs = 20 <br>
train_diff = True <br>
bf16 = True <br>
grad_clip = True <br>
max_grad = 0.8 <br>
grad_checkpoint = False <br>
diff_weight = 1.4 <br>
ce_weight = 0.04 <br>
warmup = 0.03 <br>
scheduler = "cosine" <br>
<br>
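
As a rough aid for reading the parameters above, the sketch below derives a few quantities from them. It rests on assumptions that this card does not confirm: that the LoRA update is scaled by `lora_alpha / lora_r` (the conventional LoRA scaling), that `warmup` is a fraction of total optimizer steps, that each of the 764 files is one training sample, and that the two loss weights combine as a plain weighted sum. The function names are hypothetical and are not taken from the fine-tuning repository.

```python
import math

# Values copied from the parameter list above.
params = {
    "batch_size": 1,
    "grad_accum": 1,
    "lora_r": 128,
    "lora_alpha": 512,
    "epochs": 20,
    "warmup": 0.03,
    "diff_weight": 1.4,
    "ce_weight": 0.04,
}
NUM_FILES = 764  # dataset size stated above


def derived_quantities(p, num_samples):
    """Derive schedule and LoRA quantities (hypothetical helper)."""
    # Samples consumed per optimizer step.
    effective_batch = p["batch_size"] * p["grad_accum"]
    # Assumption: conventional LoRA scaling factor alpha / r.
    lora_scale = p["lora_alpha"] / p["lora_r"]
    # Assumption: one audio file is one training sample.
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    total_steps = steps_per_epoch * p["epochs"]
    # Assumption: warmup is a ratio of total steps, rounded up.
    warmup_steps = math.ceil(total_steps * p["warmup"])
    return {
        "effective_batch": effective_batch,
        "lora_scale": lora_scale,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


def combined_loss(diffusion_loss, ce_loss, p):
    """Assumption: the two weights form a simple weighted sum."""
    return p["diff_weight"] * diffusion_loss + p["ce_weight"] * ce_loss


derived = derived_quantities(params, NUM_FILES)
```

Under these assumptions, the LoRA scaling factor is 4.0 and training runs for 15,280 optimizer steps. With `ce_weight` at 0.04 against `diff_weight` at 1.4, the diffusion term would dominate the combined objective.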
Special thanks to mrfakename for creating OpenVoiceLab, a fantastic resource for both inference and fine-tuning, with quite a nice GUI.