Luka512/CosyVoice2-0.5B-EU

Text-to-Speech

ONNX

Safetensors

Luka512 committed on Sep 3

Commit 4ec5b2f (verified) · 1 Parent(s): c2dafac

Update README.md

Files changed (1)

README.md +86 -1

@@ -5,10 +5,95 @@ language:
 - de
 - fr
 - zh
 base_model:
 - FunAudioLLM/CosyVoice2-0.5B
 - Qwen/Qwen3-0.6B
 - utter-project/EuroLLM-1.7B-Instruct
 - mistralai/Mistral-7B-v0.3
 pipeline_tag: text-to-speech
----

 - de
 - fr
 - zh
+- ko
+- ja
 base_model:
 - FunAudioLLM/CosyVoice2-0.5B
 - Qwen/Qwen3-0.6B
 - utter-project/EuroLLM-1.7B-Instruct
 - mistralai/Mistral-7B-v0.3
 pipeline_tag: text-to-speech
+---
+<p align="center">
+  <img src="https://horstmann.tech/cosyvoice2-demo/cosyvoice2-logo-clear.png" alt="CosyVoice2-EU logo" width="260">
+</p>
+# CosyVoice2-0.5B-EU — FR/DE Zero-Shot Voice Cloning (CosyVoice2)
+**Europeanized CosyVoice2 for French & German.**
+Plug-and-play zero-shot voice cloning with streaming support, bilingual training (FR+DE), and a simple CLI via the companion PyPI package.
+**👉 PyPI:** `cosyvoice2-eu` (current: **0.2.7**) at https://pypi.org/project/cosyvoice2-eu/
+**👉 Demo:** https://horstmann.tech/cosyvoice2-demo/
+**👉 Built on:** FunAudioLLM **CosyVoice2** (semantic LM + chunk-aware flow + HiFi-GAN)
+---
+## TL;DR
+High-quality **French/German** zero-shot TTS (text + short reference audio) built on **CosyVoice2**. Optimized for sentence-to-paragraph narration, bilingual FR+DE adaptation, and easy local inference.
+While this model is optimized for French and German, it remains fully compatible with the original CosyVoice2 languages — English, Chinese, Japanese, Korean, and their dialects.
+---
+## Quickstart (CLI)
+Install:
+```bash
+pip install cosyvoice2-eu
+```
+French example:
+```bash
+cosy2-eu \
+  --text "Salut ! Je vous présente CosyVoice 2, un système de synthèse vocale très avancé." \
+  --prompt path/to/french_ref.wav \
+  --out out_fr.wav
+```
+German example:
+```bash
+cosy2-eu \
+  --text "Hallo! Ich präsentiere CosyVoice 2 – ein fortschrittliches TTS-System." \
+  --prompt path/to/german_ref.wav \
+  --out out_de.wav
+```
+> The first run downloads the model from this repo and caches it locally.
+>
+> Tip: For style control, you can prefix the text with a style instruction in the form `"<style>. <|endofprompt|> <text>"`, e.g., `"Speak cheerfully. <|endofprompt|> Hallo! Wie geht es Ihnen heute?"`.
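
The style-control format from the tip above can also be assembled programmatically. The following is a minimal Python sketch, not part of the `cosyvoice2-eu` package: the helper names `build_style_text` and `build_command` are hypothetical, and the only things taken from this README are the `cosy2-eu` flags (`--text`, `--prompt`, `--out`) and the `<|endofprompt|>` separator. It only builds and prints the command; it does not run it.

```python
import shlex

# Separator token quoted in the README tip.
END_OF_PROMPT = "<|endofprompt|>"


def build_style_text(style: str, text: str) -> str:
    """Join a style instruction and the text to speak as '<style> <|endofprompt|> <text>'."""
    return f"{style.strip()} {END_OF_PROMPT} {text.strip()}"


def build_command(text, prompt_wav, out_wav, style=None):
    """Return the argv list for a cosy2-eu call (hypothetical helper; flags as shown in the Quickstart)."""
    full_text = build_style_text(style, text) if style else text
    return ["cosy2-eu", "--text", full_text, "--prompt", prompt_wav, "--out", out_wav]


cmd = build_command(
    "Hallo! Wie geht es Ihnen heute?",
    "path/to/german_ref.wav",
    "out_de_cheerful.wav",  # illustrative output name
    style="Speak cheerfully.",
)
# Print a shell-safe version of the command for inspection or copy-paste.
print(shlex.join(cmd))
```

With the package installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Building an argv list rather than a shell string avoids quoting problems with the `<`, `|`, and `>` characters in the separator.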
+---
+## What you get
+- **Zero-shot voice cloning** for **FR/DE** (reference audio → cloned timbre & style).
+- **Bilingual adaptation** (FR+DE) on top of CosyVoice2, with both languages trained together for better data efficiency.
+- **Streaming & non-streaming** synthesis supported by the underlying architecture.
+- **Simple local inference**: one pip install, one CLI (`cosy2-eu`).
+- **Interoperable components** (text→semantic LM, flow decoder, HiFi-GAN vocoder).
+- **Compatible** with the original CosyVoice2 languages (EN/ZH/JA/KO & dialects).
+---
+## Inputs / Outputs
+- **Input:** text (FR/DE) + short **reference audio** (mono WAV recommended).
+- **Output:** synthesized WAV cloning the reference speaker’s timbre, speaking the input text in FR/DE.
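
Since a short mono WAV is recommended as the reference, it can help to inspect a file before passing it to `--prompt`. The sketch below uses only Python's standard `wave` module, which reads uncompressed PCM WAV files. The function name `describe_reference_wav` is hypothetical, and the 16 kHz, one-second demo file is an arbitrary choice for illustration, not a requirement stated by this model card.

```python
import os
import tempfile
import wave


def describe_reference_wav(path):
    """Report channel count, sample rate, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        frames = wav.getnframes()
    return {
        "channels": channels,
        "sample_rate": rate,
        "duration_s": frames / rate,
        "is_mono": channels == 1,
    }


# Demo: write one second of 16-bit mono silence, then inspect it.
demo_path = os.path.join(tempfile.mkdtemp(), "ref_demo.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(1)      # mono, as recommended above
    wav.setsampwidth(2)      # 2 bytes per sample (16-bit PCM)
    wav.setframerate(16000)  # arbitrary demo rate
    wav.writeframes(b"\x00\x00" * 16000)

info = describe_reference_wav(demo_path)
print(info)
```

If `is_mono` is false, the file could be downmixed with an audio tool of your choice before use. How long "short" should be is not specified here, so the sketch reports the duration without enforcing a limit.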
+---
+## Notes & limitations
+- FR/DE were adapted under constrained open-data budgets; extreme edge cases (very noisy prompts, long numerics, heavy code-switching) may require careful prompting or additional fine-tuning.
+- Voice cloning carries **misuse risks** (impersonation, fraud). Use only with consent and follow local laws/policies.
+---
+## License & attribution
+- **License:** Apache-2.0 (see card metadata / repo).
+- Built on **CosyVoice2** by FunAudioLLM; please cite their work (see below).
+---
+**Links**
+- PyPI (inference CLI): https://pypi.org/project/cosyvoice2-eu/
+- Upstream project: https://github.com/FunAudioLLM/CosyVoice
+- CosyVoice2 paper & page: https://arxiv.org/abs/2412.10117 • https://funaudiollm.github.io/cosyvoice2/
+---
+*If you use CosyVoice2-0.5B-EU in research or products, please add a short acknowledgment and share feedback or samples—we’re continuously improving FR/DE expressiveness and robustness.*