RVC-MLX Pretrained Weights
MLX-compatible pretrained weights for RVC (Retrieval-based Voice Conversion), converted for use with rvc-mlx.
These weights enable high-quality voice conversion on Apple Silicon Macs using the MLX framework.
Available Models
| File | Sample Rate | Size | Description |
|---|---|---|---|
| v2/f0G48k.safetensors | 48 kHz | 110 MB | V2 with F0 (pitch) - highest quality |
| v2/f0G40k.safetensors | 40 kHz | 105 MB | V2 with F0 (pitch) |
| v2/f0G32k.safetensors | 32 kHz | 107 MB | V2 with F0 (pitch) |
All models use:
- Architecture: SynthesizerTrnMs768NSFsid
- Input: 768-dim ContentVec features
- F0 Support: Yes (pitch-aware synthesis)
Quick Start
from huggingface_hub import hf_hub_download

# Download the 48 kHz model
weights_path = hf_hub_download(
    repo_id="lexandstuff/rvc-mlx-weights",
    filename="v2/f0G48k.safetensors",
)

# Download the config
config_path = hf_hub_download(
    repo_id="lexandstuff/rvc-mlx-weights",
    filename="v2/config.json",
)
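To select a file by sample rate instead of hard-coding its name, the table of available models can be turned into a small lookup. This is a minimal sketch: `WEIGHT_FILES` and `weights_filename` are hypothetical names introduced here for illustration and are not part of rvc-mlx or huggingface_hub.

```python
# Hypothetical helper: maps a sample rate in Hz to a file from the
# "Available Models" table. Not part of rvc-mlx or huggingface_hub.
WEIGHT_FILES = {
    32000: "v2/f0G32k.safetensors",
    40000: "v2/f0G40k.safetensors",
    48000: "v2/f0G48k.safetensors",
}


def weights_filename(sample_rate: int) -> str:
    """Return the repository path of the weights for the given sample rate."""
    try:
        return WEIGHT_FILES[sample_rate]
    except KeyError:
        supported = ", ".join(str(sr) for sr in sorted(WEIGHT_FILES))
        raise ValueError(
            f"unsupported sample rate {sample_rate}; expected one of {supported}"
        ) from None
```

The returned path can be passed as `filename` to `hf_hub_download`, and `str(sample_rate)` gives the matching key ("48000", "40000", or "32000") for the config file.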
Usage with rvc-mlx
import json

from safetensors.numpy import load_file
from rvc_mlx.models import SynthesizerTrnMs768NSFsid

# Load config
with open(config_path) as f:
    configs = json.load(f)
config = configs["48000"]  # or "40000", "32000"

# Create model
model = SynthesizerTrnMs768NSFsid(**config)

# Load weights
weights = load_file(weights_path)
# ... load weights into model
Model Details
These are inference-only weights: the training-only components (the posterior encoder) have been removed to reduce file size.
Architecture
SynthesizerTrnMs768NSFsid
├── enc_p (TextEncoder) - Encodes ContentVec + pitch
├── flow (ResidualCoupling) - Normalizing flow for voice conversion
├── dec (GeneratorNSF) - HiFi-GAN vocoder with neural source filter
└── emb_g (Embedding) - Speaker embedding
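One way to sanity-check a downloaded file against this layout is to count its parameter names per top-level submodule. The sketch below assumes the keys are dot-separated and begin with the submodule name (for example `enc_p.`), which may not hold for the converted files. `count_by_submodule` is a hypothetical helper, and the example key names are illustrative only.

```python
from collections import Counter

# Top-level submodules named in the architecture tree.
SUBMODULES = ("enc_p", "flow", "dec", "emb_g")


def count_by_submodule(keys):
    """Count parameter names per top-level submodule prefix.

    Keys whose first dot-separated component is not a known submodule
    are counted under "other".
    """
    counts = Counter()
    for key in keys:
        prefix = key.split(".", 1)[0]
        counts[prefix if prefix in SUBMODULES else "other"] += 1
    return dict(counts)


# Illustrative key names only (assumption); the real names may differ.
example_keys = [
    "enc_p.emb_phone.weight",
    "flow.flows.0.pre.weight",
    "dec.conv_pre.weight",
    "emb_g.weight",
    "unexpected.weight",
]
print(count_by_submodule(example_keys))
```

With real weights, `count_by_submodule(weights.keys())` would be the call, where `weights` is the dictionary returned by `load_file`. A non-zero "other" count suggests that the key naming differs from this assumption or that extra tensors are present.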
Upsampling Rates
| Sample Rate | Upsample Rates | Total Factor |
|---|---|---|
| 32 kHz | [10, 8, 2, 2] | 320x |
| 40 kHz | [10, 10, 2, 2] | 400x |
| 48 kHz | [12, 10, 2, 2] | 480x |
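Each total factor is the product of the per-stage upsample rates. Dividing the sample rate by that factor gives 100 in all three cases, which suggests the decoder expects 100 input frames per second of audio (one frame per 10 ms). The sketch below only reproduces this arithmetic; `UPSAMPLE_RATES` and `total_upsample_factor` are hypothetical names, not rvc-mlx APIs.

```python
from math import prod

# Per-stage upsample rates from the table, keyed by sample rate in Hz.
UPSAMPLE_RATES = {
    32000: [10, 8, 2, 2],
    40000: [10, 10, 2, 2],
    48000: [12, 10, 2, 2],
}


def total_upsample_factor(sample_rate: int) -> int:
    """Return the product of the upsample rates for a sample rate."""
    return prod(UPSAMPLE_RATES[sample_rate])


for sr in sorted(UPSAMPLE_RATES):
    factor = total_upsample_factor(sr)
    # sr / factor is the number of input frames per second of output audio.
    print(f"{sr} Hz: {factor}x, {sr / factor:.0f} frames per second")
```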
Original Source
These weights are converted from the official RVC pretrained models:
- Source: lj1995/VoiceConversionWebUI
- Files: pretrained_v2/f0G{32k,40k,48k}.pth
License
MIT License - same as the original RVC project.
Citation
If you use these weights, please cite the original RVC project:
@software{rvc2023,
author = {RVC-Project},
title = {Retrieval-based-Voice-Conversion-WebUI},
year = {2023},
url = {https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI}
}