stepfun-ai/Step-Audio-EditX

text-generation

yanchaomars committed 8 days ago

Commit 7f3de60 · verified · 1 parent: 4fe4b65

update news

update new models in news

Files changed (1)

README.md +7 -2

@@ -11,13 +11,18 @@ library_name: transformers
 Check our open-source repository https://github.com/stepfun-ai/Step-Audio-EditX for more details!
+## 🔥🔥🔥 News!!!
+* Nov 28, 2025: 🚀 New Model Release: Now supporting **`Japanese`** and **`Korean`** languages.
+* Nov 23, 2025: 📊 [Step-Audio-Edit-Benchmark](https://github.com/stepfun-ai/Step-Audio-Edit-Benchmark) Released!
+* Nov 19, 2025: ⚙️ We released a **new version** of our model, which **supports polyphonic pronunciation control** and improves the performance of emotion, speaking style, and paralinguistic editing.
 We are open-sourcing **Step-Audio-EditX**, a powerful **3B parameters** LLM-based audio model specialized in expressive and **iterative audio editing**.
 It excels at **editing emotion**, **speaking style**, and **paralinguistics**, and also features robust **zero-shot text-to-speech (TTS)** capabilities.
 ## Features
 - **Zero-Shot TTS**
-  - Excellent zero-shot TTS cloning for Mandarin, English, Sichuanese, and Cantonese.
-  - To use a dialect, just add a **[Sichuanese]** or **[Cantonese]** tag before your text.
+  - Excellent zero-shot TTS cloning for `Mandarin`, `English`, `Sichuanese`, `Cantonese`, `Japanese`, and `Korean`.
+  - To select a dialect or language, add a **`[Sichuanese]`**, **`[Cantonese]`**, **`[Japanese]`**, or **`[Korean]`** tag before your text.
 - **Emotion and Speaking Style Editing**
   - Remarkably effective iterative control over emotions and styles, supporting **dozens** of options for editing.
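
The tagging convention described in the updated Zero-Shot TTS bullet (put a bracketed tag before the input text) could look roughly like the sketch below. This is only an illustration: `SUPPORTED_TAGS` and `tag_text` are hypothetical names, not part of the Step-Audio-EditX codebase, and placing the tag directly against the text with no separator is an assumption, since the README only says the tag goes "before your text".

```python
from typing import Optional

# Hypothetical helper, not part of the Step-Audio-EditX repository.
# Tag names are taken from the README text in this commit.
SUPPORTED_TAGS = ("Sichuanese", "Cantonese", "Japanese", "Korean")


def tag_text(text: str, variety: Optional[str] = None) -> str:
    """Return `text` prefixed with a bracketed tag such as "[Cantonese]".

    Assumption: untagged text is passed through unchanged, and no
    separator is inserted between the tag and the text.
    """
    if variety is None:
        return text
    if variety not in SUPPORTED_TAGS:
        raise ValueError(f"unsupported tag: {variety!r}")
    return f"[{variety}]{text}"
```

For instance, `tag_text("hello", "Korean")` yields the string `[Korean]hello`, which would then be passed to the model as the target text. The exact input format the model expects should be checked against the linked GitHub repository.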