---
license: mit
language:
- en
- zh
base_model:
- Wan-AI/Wan2.2-S2V-14B
pipeline_tag: any-to-any
---
# RealVideo
RealVideo is a WebSocket-based video calling system that accepts text input. It uses the **GLM-4.5-AirX** and
**GLM-TTS** models to generate spoken responses and autoregressive diffusion to generate the corresponding video frames. The
system has a modular design and a clean code structure.

See the [blog post](https://z.ai/blog/realvideo) for more details.
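
The flow described above (text in, spoken reply, then video frames generated one after another) can be sketched roughly as follows. This is only an illustration of the data flow, not RealVideo's actual code: `stub_reply`, `stub_tts`, `stub_frame`, `run_turn`, and the `Frame` type are hypothetical stand-ins for the language model, GLM-TTS, and the diffusion model, and the chunk size is arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class Frame:
    index: int                 # position of the frame in the stream
    audio_chunk: bytes         # audio slice this frame was conditioned on
    prev_index: Optional[int]  # index of the frame used as context, if any


def stub_reply(text: str) -> str:
    # Hypothetical stand-in for the language model call.
    return "echo: " + text


def stub_tts(text: str) -> bytes:
    # Hypothetical stand-in for speech synthesis: encodes the text as bytes.
    return text.encode("utf-8")


def stub_frame(index: int, audio_chunk: bytes, prev: Optional[Frame]) -> Frame:
    # Hypothetical stand-in for one generation step of the video model.
    return Frame(index, audio_chunk, prev.index if prev is not None else None)


def run_turn(
    user_text: str,
    reply_fn: Callable[[str], str] = stub_reply,
    tts_fn: Callable[[str], bytes] = stub_tts,
    frame_fn: Callable[[int, bytes, Optional[Frame]], Frame] = stub_frame,
    chunk_size: int = 4,
) -> Iterator[Frame]:
    """Yield frames one at a time, each conditioned on the previous frame."""
    audio = tts_fn(reply_fn(user_text))
    prev: Optional[Frame] = None
    for index, start in enumerate(range(0, len(audio), chunk_size)):
        frame = frame_fn(index, audio[start:start + chunk_size], prev)
        yield frame  # frames can be streamed out as soon as they exist
        prev = frame


if __name__ == "__main__":
    for frame in run_turn("hi"):
        print(frame)
```

Writing the loop as a generator means each frame can be sent to the client as soon as it is produced, which is the property a real-time system needs.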
## Features
- **Text Input**: Supports text message input.
- **AI Voice Response**: Integrates GLM-4.5-AirX and GLM-TTS models to generate voice responses.
- **Lip Sync**: Generates real-time conversational video based on any input image and audio.
- **Real-time Communication**: WebSocket-based real-time bidirectional communication.
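
As a rough illustration of text input over a WebSocket connection, a text message could be wrapped in a small JSON envelope before being sent. The wire format is not described in this README, so the `type` and `text` field names and both helper functions below are assumptions made for this sketch only; check the GitHub repository for the actual protocol.

```python
import json


def encode_text_message(text: str) -> str:
    # Hypothetical envelope; the field names are assumptions, not RealVideo's protocol.
    # ensure_ascii=False keeps Chinese text readable on the wire.
    return json.dumps({"type": "text", "text": text}, ensure_ascii=False)


def decode_message(raw: str) -> dict:
    # Parse an incoming message and check it has the assumed envelope shape.
    message = json.loads(raw)
    if not isinstance(message, dict) or "type" not in message:
        raise ValueError("message must be a JSON object with a 'type' field")
    return message


if __name__ == "__main__":
    payload = encode_text_message("你好")
    print(payload)
    print(decode_message(payload))
```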
## Quick Start
See our [GitHub repository](https://github.com/zai-org/RealVideo) to get started.
## Technical Highlights
- **Model Integration**: Supports quick, convenient voice cloning and generates audio output from text input.
- **Modular Design**: Clear code structure, easy to maintain and extend.
- **Real-time Performance**: Optimized audio processing and real-time video generation algorithms.
## Acknowledgements
This project uses the following open-source project:
- [Self-Forcing](https://github.com/guandeh17/Self-Forcing)