Update the README about the different synthesizers.
app.py
CHANGED
@@ -242,12 +242,19 @@ type=['wav'])
 with about:
     #st.header("How it works")
     st.markdown('''# Mockingbird TTS Demo
-This page is a demo of openly available text-to-speech models for various languages of interest. Currently,
+This page is a demo of openly available text-to-speech models for various languages of interest. Currently, three synthesizers with multilingual offerings are supported out of the box:
 - [**Meta's Massively Multilingual Speech (MMS)**](https://ai.meta.com/blog/multilingual-model-speech-recognition/) model, which supports over 1000 languages.[^1]
-- [**Coqui's TTS**](https://docs.coqui.ai/en/latest/#) package;[^2] while no longer supported, Coqui acted as a hub for TTS model hosting and these models are still available.
-- [**ESpeak-NG**](https://github.com/espeak-ng/espeak-ng/tree/master)'s synthetic voices.[^3]
 - [**IMS Toucan**](https://github.com/DigitalPhonetics/IMS-Toucan), which supports 7000 languages.[^4]
+- [**ESpeak-NG**](https://github.com/espeak-ng/espeak-ng/tree/master)'s synthetic voices.[^3]
+
+On a case-by-case basis, for different languages of interest, I have added:
+- [**Coqui's TTS**](https://docs.coqui.ai/en/latest/#) package;[^2] while no longer supported, Coqui acted as a hub for TTS model hosting and these models are still available. Languages must be added on a model-by-model basis.
+- Specific fine-tuned variants of Meta's MMS (either fine-tuned by [Yoach Lacombe](https://huggingface.co/ylacombe), or fine-tuned by me using his scripts).
+
+I am in the process of adding support for:
 - [**Piper**](https://github.com/rhasspy/piper), a TTS system that supports multiple voices per language and approximately 30 languages.[^5]
+- [**African Voices**](https://github.com/neulab/AfricanVoices), a CMU research project that fine-tuned synthesizers for different African languages. The site hosting the synthesizers is deprecated, but they can be downloaded from the Internet Archive's Wayback Machine.[^6]
+
 
 Voice conversion is currently achieved through Coqui.
 
@@ -268,6 +275,7 @@ Notes:
 [^3]: [Language list](https://github.com/espeak-ng/espeak-ng/blob/master/docs/languages.md)
 [^4]: Language list is available in the Gradio API documentation [here](https://huggingface.co/spaces/Flux9665/MassivelyMultilingualTTS).
 [^5]: The list of available voices is [here](https://github.com/rhasspy/piper/blob/master/VOICES.md), model checkpoints are [here](https://huggingface.co/datasets/rhasspy/piper-checkpoints/tree/main), and they can be tested [here](https://rhasspy.github.io/piper-samples/).
+[^6]:
 ''')
 
 