nielsr (HF Staff) committed · verified
Commit 43ab8ab · 1 Parent(s): c775a26

Add `library_name` to metadata


This PR enhances the model card by adding `library_name: transformers` to the metadata.

This tag is justified by the `config.json` file, which specifies `"architectures": ["LlamaForCausalLM"]` and `"model_type": "llama"`. Llama-based models are typically integrated and used with the Hugging Face `transformers` library, enabling a predefined code snippet for users on the Hub.

No sample usage code snippet has been added as the provided GitHub README does not contain a suitable Python example for programmatic inference via a library.
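For illustration only, the predefined snippet that `library_name: transformers` enables on the Hub would look roughly like the minimal sketch below. The repository id is an assumption, and because GLM-TTS is a full TTS pipeline (LLM backbone plus speech tokenizer and flow-matching decoder), this loads only the Llama-style text backbone rather than an end-to-end speech generator:

```python
# Minimal sketch of the kind of snippet the `transformers` tag enables.
# The repo id is an assumption; GLM-TTS also needs its speech tokenizer
# and flow-matching vocoder, so this loads only the LLM backbone.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zai-org/GLM-TTS"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # config: LlamaForCausalLM
```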

Files changed (1)
  1. README.md +11 -10
README.md CHANGED
@@ -2,6 +2,8 @@
  language:
  - zh
  - en
+ license: mit
+ pipeline_tag: text-to-speech
  tags:
  - llm
  - tts
@@ -9,8 +11,7 @@ tags:
  - voice-cloning
  - reinforcement-learning
  - flow-matching
- license: mit
- pipeline_tag: text-to-speech
+ library_name: transformers
  ---
 
  # GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS
@@ -35,12 +36,12 @@ By introducing a **Multi-Reward Reinforcement Learning** framework, GLM-TTS sign
 
  ### Key Features
 
- * **Zero-shot Voice Cloning:** Clone any speaker's voice with just 3-10 seconds of prompt audio.
- * **RL-enhanced Emotion Control:** Utilizes a multi-reward reinforcement learning framework (GRPO) to optimize prosody and emotion.
- * **High-quality Synthesis:** Generates speech comparable to commercial systems with reduced Character Error Rate (CER).
- * **Phoneme-level Control:** Supports "Hybrid Phoneme + Text" input for precise pronunciation control (e.g., polyphones).
- * **Streaming Inference:** Supports real-time audio generation suitable for interactive applications.
- * **Bilingual Support:** Optimized for Chinese and English mixed text.
+ * **Zero-shot Voice Cloning:** Clone any speaker's voice with just 3-10 seconds of prompt audio.
+ * **RL-enhanced Emotion Control:** Utilizes a multi-reward reinforcement learning framework (GRPO) to optimize prosody and emotion.
+ * **High-quality Synthesis:** Generates speech comparable to commercial systems with reduced Character Error Rate (CER).
+ * **Phoneme-level Control:** Supports "Hybrid Phoneme + Text" input for precise pronunciation control (e.g., polyphones).
+ * **Streaming Inference:** Supports real-time audio generation suitable for interactive applications.
+ * **Bilingual Support:** Optimized for Chinese and English mixed text.
 
  ## System Architecture
 
@@ -73,7 +74,7 @@ Evaluated on `seed-tts-eval`. **GLM-TTS_RL** achieves the lowest Character Error
  ### Installation
 
  ```bash
- git clone [https://github.com/zai-org/GLM-TTS.git](https://github.com/zai-org/GLM-TTS.git)
+ git clone https://github.com/zai-org/GLM-TTS.git
  cd GLM-TTS
  pip install -r requirements.txt
  ```
@@ -115,4 +116,4 @@ If you find GLM-TTS useful for your research, please cite our technical report:
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2512.14291},
  }
- }
+ ```