---
language:
- en
- zh
library_name: huggingface_hub
license: apache-2.0
pipeline_tag: text-to-speech
tags:
- text-to-audio
- music
- singing-voice-synthesis
- svs
- zero-shot
---
## ComfyUI Custom Node
A custom node for ComfyUI integration is available in a separate GitHub repository:
🔗 **[ComfyUI-SoulX-Singer](https://github.com/Saganaki22/ComfyUI-SoulX-Singer)**
![Screenshot 2026-02-11 160905](https://cdn-uploads.huggingface.co/production/uploads/63473b59e5c0717e6737b872/FqxVnkFDrVt287ppwQj90.png)
The node lets you run SoulX-Singer singing voice synthesis inside your ComfyUI workflows.
# SoulX-Singer: Converted .pt model to .safetensors
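This repository hosts the SoulX-Singer weights converted from the original `.pt` checkpoint to `.safetensors`, in **bf16 + fp32** precision.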
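To check which precision a downloaded `.safetensors` file actually stores, its header can be read with the Python standard library alone. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names are illustrative and are not part of this repository or of any SoulX-Singer API.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def count_dtypes(header):
    """Count tensors per dtype string, for example "BF16" or "F32"."""
    return Counter(entry["dtype"] for entry in header.values())
```

For example, `count_dtypes(read_safetensors_header("model.safetensors"))` reports how many tensors are stored per dtype. Here `model.safetensors` is a placeholder; substitute the actual filename from this repository.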
## Audio Samples
### Original Audio
<audio controls>
<source src="https://huggingface.co/drbaph/SoulX-Singer/resolve/main/samples/song.mp3" type="audio/mpeg">
Your browser does not support the audio element.
</audio>
### SpongeBob Voice
<audio controls>
<source src="https://huggingface.co/drbaph/SoulX-Singer/resolve/main/samples/generated/sample-1.mp3" type="audio/mpeg">
Your browser does not support the audio element.
</audio>
### Male Voice
<audio controls>
<source src="https://huggingface.co/drbaph/SoulX-Singer/resolve/main/samples/generated/sample-2.mp3" type="audio/mpeg">
Your browser does not support the audio element.
</audio>
---
<div align="center">
<b><em>Towards High-Quality Zero-Shot Singing Voice Synthesis</em></b>
<p>
<img src="assets/soulx-logo.png" alt="SoulX-Singer_Logo" style="height: 80px;">
</p>
<p>
<a href="https://soul-ailab.github.io/soulx-singer/"><img src="https://img.shields.io/badge/Demo-Page-lightgrey" alt="Demo page"></a>
<a href="https://github.com/Soul-AILab/SoulX-Singer"><img src='https://img.shields.io/badge/Github-Page-green' alt="GitHub"></a>
<a href="https://arxiv.org/abs/2602.07803"><img src="https://img.shields.io/badge/arXiv-2602.07803-b31b1b" alt="arXiv"></a>
<a href="https://github.com/Soul-AILab/SoulX-Singer/blob/main/assets/technical-report.pdf"><img src='https://img.shields.io/badge/Report-Github?label=Technical&color=red' alt="technical report"></a>
<a href="https://github.com/Soul-AILab/SoulX-Singer"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="Apache-2.0"></a>
</p>
</div>
---
## Overview
**SoulX-Singer** is a high-fidelity, zero-shot singing voice synthesis model that enables users to generate realistic singing voices for unseen singers. It supports melody-conditioned (F0 contour) and score-conditioned (MIDI notes) control for precise pitch, rhythm, and expression.
For more details, please refer to the paper: [SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis](https://arxiv.org/abs/2602.07803).
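The two conditioning modes describe pitch in different units: an F0 contour is a frequency in Hz per frame, while MIDI notes are integer note numbers with durations. As a rough illustration of how they relate, the sketch below uses the standard equal-temperament mapping (MIDI note 69 = A4 = 440 Hz) to expand a note list into a flat, frame-level F0 contour. The `(note, duration)` tuple layout, the 50 frames-per-second rate, and the use of 0 Hz for rests are assumptions made for this example and do not reflect SoulX-Singer's actual input format.

```python
def midi_to_hz(note, a4_hz=440.0):
    """Convert a MIDI note number to frequency in Hz (12-tone equal temperament)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def notes_to_f0(notes, frame_rate=50):
    """Expand (midi_note_or_None, duration_seconds) pairs into per-frame F0.

    A note of None marks a rest and is written as 0.0 Hz (assumed convention).
    """
    f0 = []
    for note, duration in notes:
        n_frames = round(duration * frame_rate)
        hz = 0.0 if note is None else midi_to_hz(note)
        f0.extend([hz] * n_frames)
    return f0


# A4 for half a second, then a 0.1 s rest, at the assumed 50 frames per second.
contour = notes_to_f0([(69, 0.5), (None, 0.1)])
```

A contour built this way is piecewise constant, whereas an F0 contour extracted from a real performance also carries vibrato and pitch slides, which is one way the two kinds of control differ in practice.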
---
## Features
- **Zero-shot synthesis**: Generate singing voices for unseen singers without fine-tuning
- **Melody-conditioned control**: Use F0 contour for pitch guidance
- **Score-conditioned control**: Use MIDI notes for precise musical notation
- **High-fidelity output**: Realistic vocal synthesis with natural expression
- **Safetensors format**: Model weights converted from the original `.pt` checkpoint, in bf16 + fp32 precision
---
## Citation
If you use SoulX-Singer in your research, please cite:
```bibtex
@article{soulxsinger2026,
title={SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis},
author={Soul-AILab},
journal={arXiv preprint arXiv:2602.07803},
year={2026}
}
```
---
## License
This project is licensed under the Apache License 2.0.