|
|
--- |
|
|
license: apache-2.0 |
|
|
--- |
|
|
|
|
|
This is a fine-tune of the 7B VibeVoice model. Inference requires roughly 21.9GB of VRAM (through OpenVoiceLab). |
|
|
|
|
|
<br> |
|
|
Fine-tuning was done using the code available here: <br> |
|
|
https://github.com/voicepowered-ai/VibeVoice-finetuning |
|
|
|
|
|
|
|
|
<br><br> |
|
|
The dataset used for fine-tuning was 764 audio files from Halo 1, Halo 2, and Halo 3, available via the links below: <br> |
|
|
Halo 1: https://sounds.spriters-resource.com/xbox/halocombatevolved/asset/413569/ <br> |
|
|
Halo 2: https://sounds.spriters-resource.com/xbox/halo2/asset/436393/?source=genre <br> |
|
|
Halo 3: https://sounds.spriters-resource.com/xbox_360/halo3/asset/405404/ <br> |
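
Gathering the downloaded clips into a single file list is a minimal sketch of the prep step; the directory names below are placeholders for wherever the three sound packs were extracted, and the actual fine-tuning pipeline may expect a different layout.

```python
# Hedged sketch: collect the extracted Halo audio clips into one sorted list.
# Directory names are hypothetical placeholders, not from the fine-tuning repo.
from pathlib import Path

def collect_clips(roots, exts=(".wav", ".mp3", ".ogg")):
    """Return a sorted list of audio file paths under the given directories."""
    clips = []
    for root in roots:
        root = Path(root)
        if root.is_dir():  # skip packs that haven't been extracted yet
            clips.extend(p for p in root.rglob("*") if p.suffix.lower() in exts)
    return sorted(clips)

# e.g. clips = collect_clips(["halo1_sounds", "halo2_sounds", "halo3_sounds"])
# With all three packs extracted, len(clips) should come to 764.
```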
|
|
|
|
|
|
|
|
|
|
|
<br><br> |
|
|
The fine-tuning parameters used were: |
|
|
|
|
|
batch_size = 1 <br> |
|
|
drop_rate = 0.2 <br> |
|
|
grad_accum = 1 <br> |
|
|
lr = 2.5e-5 <br> |
|
|
lora_r = 128 <br> |
|
|
lora_alpha = 512 <br> |
|
|
epochs = 20 <br> |
|
|
train_diff = True <br> |
|
|
bf16 = True <br> |
|
|
grad_clip = True <br> |
|
|
max_grad = 0.8 <br> |
|
|
grad_checkpoint = False <br> |
|
|
diff_weight = 1.4 <br> |
|
|
ce_weight = 0.04 <br> |
|
|
warmup = 0.03 <br> |
|
|
scheduler = "cosine" <br> |
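
A couple of quantities implied by these settings can be sanity-checked in a few lines of Python; note the alpha/r scaling rule is the standard LoRA convention, and whether the fine-tuning code applies it exactly this way is an assumption.

```python
# Hedged sketch: derived quantities implied by the hyperparameters above.
# alpha / r is the conventional LoRA scaling factor (an assumption about
# how the fine-tuning repo applies it, not confirmed from its code).
lora_r = 128
lora_alpha = 512
batch_size = 1
grad_accum = 1

lora_scaling = lora_alpha / lora_r          # 512 / 128 = 4.0
effective_batch = batch_size * grad_accum   # samples per optimizer step = 1

print(f"LoRA scaling: {lora_scaling}, effective batch: {effective_batch}")
```

With alpha set to 4x the rank, LoRA updates are amplified fairly aggressively, which pairs with the relatively low learning rate of 2.5e-5 above.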
|
|
|
|
|
<br> |
|
|
Special thanks to mrfakename for creating OpenVoiceLab, a fantastic resource for both inference and fine-tuning, with quite a nice GUI. |
|
|
|