Luka512 committed
Commit 4ec5b2f (verified)
1 Parent(s): c2dafac

Update README.md

Files changed (1)
  1. README.md +86 -1
README.md CHANGED
@@ -5,10 +5,95 @@ language:
5
  - de
6
  - fr
7
  - zh
 
 
8
  base_model:
9
  - FunAudioLLM/CosyVoice2-0.5B
10
  - Qwen/Qwen3-0.6B
11
  - utter-project/EuroLLM-1.7B-Instruct
12
  - mistralai/Mistral-7B-v0.3
13
  pipeline_tag: text-to-speech
14
- ---
 
5
  - de
6
  - fr
7
  - zh
8
+ - ko
9
+ - ja
10
  base_model:
11
  - FunAudioLLM/CosyVoice2-0.5B
12
  - Qwen/Qwen3-0.6B
13
  - utter-project/EuroLLM-1.7B-Instruct
14
  - mistralai/Mistral-7B-v0.3
15
  pipeline_tag: text-to-speech
16
+ ---
17
+
18
+
19
+ <p align="center">
20
+ <img src="https://horstmann.tech/cosyvoice2-demo/cosyvoice2-logo-clear.png" alt="CosyVoice2-EU logo" width="260">
21
+ </p>
22
+
23
+ # CosyVoice2-0.5B-EU — FR/DE Zero-Shot Voice Cloning (CosyVoice2)
24
+
25
+ **Europeanized CosyVoice2 for French & German.**
26
+ Plug-and-play zero-shot voice cloning with streaming support, bilingual training (FR+DE), and a simple CLI via the companion PyPI package.
27
+
28
+ **👉 PyPI:** `cosyvoice2-eu` (current: **0.2.7**) at https://pypi.org/project/cosyvoice2-eu/
29
+ **👉 Demo:** https://horstmann.tech/cosyvoice2-demo/
30
+ **👉 Built on:** FunAudioLLM **CosyVoice2** (semantic LM + chunk-aware flow + HiFi-GAN)
31
+
32
+ ---
33
+
34
+ ## TL;DR
35
+ High-quality **French/German** zero-shot TTS (text + short reference audio) built on **CosyVoice2**. Optimized for sentence-to-paragraph narration, bilingual FR+DE adaptation, and easy local inference.
36
+ While this model is optimized for French and German, it remains fully compatible with the original CosyVoice2 languages — English, Chinese, Japanese, Korean, and their dialects.
37
+
38
+ ---
39
+
40
+ ## Quickstart (CLI)
41
+
42
+ Install:
43
+ ```bash
44
+ pip install cosyvoice2-eu
45
+ ```
46
+
47
+ French example:
48
+ ```bash
49
+ cosy2-eu --text "Salut ! Je vous présente CosyVoice 2, un système de synthèse vocale très avancé." --prompt path/to/french_ref.wav --out out_fr.wav
50
+ ```
51
+
52
+ German example:
53
+ ```bash
54
+ cosy2-eu --text "Hallo! Ich präsentiere CosyVoice 2 – ein fortschrittliches TTS-System." --prompt path/to/german_ref.wav --out out_de.wav
55
+ ```
56
+
57
+ > First run downloads the model from this repo and caches it locally.
58
+ > Tip: To steer the speaking style, prefix the text with an instruction in the form `"<style>. <|endofprompt|> <text>"`, e.g., `"Speak cheerfully. <|endofprompt|> Hallo! Wie geht es Ihnen heute?"`.
59
+
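
The CLI calls above can also be scripted. The following is a minimal Python sketch, assuming only that `cosy2-eu` is on `PATH` and accepts the `--text`, `--prompt`, and `--out` flags exactly as in the examples above. The helper names (`with_style`, `build_command`, `synthesize_all`) and the `utt_000.wav` output naming are hypothetical and not part of the package.

```python
import shutil
import subprocess
from pathlib import Path

END_OF_PROMPT = "<|endofprompt|>"


def with_style(style: str, text: str) -> str:
    """Build "<style>. <|endofprompt|> <text>" as described in the tip above."""
    # Strip a trailing period/space so the style ends with exactly one period.
    return f"{style.rstrip('. ')}. {END_OF_PROMPT} {text}"


def build_command(text: str, prompt_wav: str, out_wav: str) -> list[str]:
    """Argument list for one CLI call (flags taken from the README examples)."""
    return ["cosy2-eu", "--text", text, "--prompt", prompt_wav, "--out", out_wav]


def synthesize_all(sentences, prompt_wav, out_dir, style=None):
    """Hypothetical batch helper: one CLI call per sentence, one WAV per call."""
    if shutil.which("cosy2-eu") is None:
        raise RuntimeError("cosy2-eu not found; run `pip install cosyvoice2-eu` first")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for i, sentence in enumerate(sentences):
        text = with_style(style, sentence) if style else sentence
        out_wav = out_dir / f"utt_{i:03d}.wav"
        # check=True raises CalledProcessError if the CLI exits with an error.
        subprocess.run(build_command(text, prompt_wav, str(out_wav)), check=True)
        outputs.append(out_wav)
    return outputs
```

Passing the arguments as a list rather than a shell string avoids quoting problems with accented characters and punctuation in the text.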
60
+ ---
61
+
62
+ ## What you get
63
+ - **Zero-shot voice cloning** for **FR/DE** (reference audio → cloned timbre & style).
64
+ - **Bilingual adaptation** (FR+DE) on top of CosyVoice2: both languages are trained together for better data efficiency.
65
+ - **Streaming & non-streaming** synthesis supported by the underlying architecture.
66
+ - **Simple local inference**: one pip install, one CLI (`cosy2-eu`).
67
+ - **Interoperable components** (text→semantic LM, flow decoder, HiFi-GAN vocoder).
68
+
69
+ Also compatible with original CosyVoice2 languages (EN/ZH/JA/KO & dialects).
70
+
71
+ ---
72
+
73
+ ## Inputs / Outputs
74
+ - **Input:** text (FR/DE) + short **reference audio** (mono WAV recommended).
75
+ - **Output:** synthesized WAV cloning the reference speaker’s timbre, speaking the input text in FR/DE.
76
+
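
Since a mono WAV is recommended for the reference audio, it can help to inspect a file before passing it to `--prompt`. This is a small sketch using only Python's standard `wave` module, which reads uncompressed PCM WAV files; the function name `describe_wav` is hypothetical, and no particular sample rate or duration is assumed to be required by the model.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        frames = wav.getnframes()
    return {
        "channels": channels,
        "sample_rate": sample_rate,
        "duration_s": frames / sample_rate,
        "is_mono": channels == 1,  # mono is what the README recommends
    }
```

If `is_mono` is false, the file can be downmixed with any audio tool before use.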
77
+ ---
78
+
79
+ ## Notes & limitations
80
+ - The FR/DE adaptation was trained on a limited budget of open data; edge cases such as very noisy reference audio, long numeric sequences, or heavy code-switching may require careful prompting or additional fine-tuning.
81
+ - Voice cloning carries **misuse risks** (impersonation, fraud). Use only with consent and follow local laws/policies.
82
+
83
+ ---
84
+
85
+ ## License & attribution
86
+ - **License:** Apache-2.0 (see card metadata / repo).
87
+ - Built on **CosyVoice2** by FunAudioLLM; please cite their work (see the paper under Links below).
88
+
89
+
90
+ ---
91
+
92
+ **Links**
93
+ - PyPI (inference CLI): https://pypi.org/project/cosyvoice2-eu/
94
+ - Upstream project: https://github.com/FunAudioLLM/CosyVoice
95
+ - CosyVoice2 paper & page: https://arxiv.org/abs/2412.10117 • https://funaudiollm.github.io/cosyvoice2/
96
+
97
+ ---
98
+
99
+ *If you use CosyVoice2-0.5B-EU in research or products, please add a short acknowledgment and share feedback or samples. We are continuously improving FR/DE expressiveness and robustness.*