Commit c0bd6fb (verified) by AJ50
1 parent: 62d7b3c

Add vocoder model with config and documentation

Files changed (3)
  1. Readme.md +27 -0
  2. config.json +26 -0
  3. vocoder.pt +3 -0
Readme.md ADDED
@@ -0,0 +1,27 @@
+ # Vocoder Model
+
+ This directory contains the pre-trained vocoder model for converting mel-spectrograms to audio waveforms.
+
+ ## Model Details
+ - **File**: `vocoder.pt`
+ - **Input**: Mel-spectrograms
+ - **Output**: Audio waveform
+
+ ## Usage
+ ```python
+ # Load the vocoder model
+ vocoder = torch.load('vocoder.pt')
+ vocoder.eval()
+
+ # Generate audio from mel-spectrogram
+ with torch.no_grad():
+     audio = vocoder(mel_spectrogram)
+ ```
+
+ ## Dependencies
+ - PyTorch
+ - NumPy
+ - Audio processing libraries (for waveform handling)
+
+ ## Model Configuration
+ See `config.json` for model architecture and training parameters.
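The Readme lists "audio processing libraries" for waveform handling without naming one. As a minimal sketch, assuming the vocoder returns mono float samples in the range [-1, 1] (common for HiFi-GAN generators, but not confirmed by this commit), Python's standard-library `wave` module is enough to write the result as a 16-bit PCM file at the 22050 Hz rate given in `config.json`. The helper name `write_wav` is hypothetical.

```python
import struct
import wave


def write_wav(path, samples, sample_rate=22050):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, float(s)))  # clip anything outside the valid range
        frames += struct.pack("<h", int(round(s * 32767)))  # little-endian int16
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)  # mono
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
```

With a PyTorch tensor this might be called as `write_wav("out.wav", audio.squeeze().cpu().tolist())`, assuming `audio` squeezes down to a single dimension.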
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "model_type": "hifigan",
+   "sample_rate": 22050,
+   "num_mels": 80,
+   "num_freq": 1025,
+   "n_fft": 1024,
+   "hop_size": 256,
+   "win_size": 1024,
+   "fmin": 0.0,
+   "fmax": 8000.0,
+   "upsample_rates": [8, 8, 2, 2],
+   "upsample_kernel_sizes": [16, 16, 4, 4],
+   "upsample_initial_channel": 512,
+   "resblock_kernel_sizes": [3, 7, 11],
+   "resblock_dilation_sizes": [
+     [1, 3, 5],
+     [1, 3, 5],
+     [1, 3, 5]
+   ],
+   "resblock_type": "1",
+   "use_spectral_norm": false,
+   "version": "1.0",
+   "authors": ["Arjit"],
+   "description": "HiFi-GAN vocoder for high-fidelity audio waveform generation from mel-spectrograms"
+ }
+
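A HiFi-GAN generator typically stretches each mel frame by the product of its `upsample_rates`, so that product is expected to equal `hop_size` for the audio length to line up with the spectrogram. The sketch below checks that relationship and a few related values using numbers copied from the config above. The `summarize` helper is hypothetical and the checks are a reviewer's sanity test, not part of the commit.

```python
import math

# Values copied from the config.json added in this commit.
config = {
    "sample_rate": 22050,
    "num_freq": 1025,
    "n_fft": 1024,
    "hop_size": 256,
    "fmax": 8000.0,
    "upsample_rates": [8, 8, 2, 2],
}


def summarize(cfg):
    """Derive a few quantities that should be mutually consistent."""
    upsample_factor = math.prod(cfg["upsample_rates"])
    return {
        "upsample_factor": upsample_factor,
        "matches_hop_size": upsample_factor == cfg["hop_size"],
        "stft_bins": cfg["n_fft"] // 2 + 1,  # one-sided STFT bin count
        "fmax_below_nyquist": cfg["fmax"] <= cfg["sample_rate"] / 2,
        "frames_per_second": cfg["sample_rate"] / cfg["hop_size"],
    }


summary = summarize(config)
```

Here the upsampling factor is 8 × 8 × 2 × 2 = 256, which matches `hop_size`, and `fmax` sits below the 11025 Hz Nyquist limit. One value does not line up: `n_fft // 2 + 1` is 513, while `num_freq` is 1025, which would correspond to an FFT size of 2048. The commit does not say whether `num_freq` is actually used, so this may be a leftover field rather than a bug.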
vocoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1d7a6861589e927e0fbdaa5849ca022258fe2b58a20cc7bfb8fb598ccf936169
+ size 53845290
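The three lines added for `vocoder.pt` are a Git LFS pointer, not the weights themselves: the real file is about 53.8 MB and is identified by its SHA-256 digest. As a minimal sketch, a downloaded copy could be checked against the pointer as follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the code assumes the pointer uses a hash algorithm that `hashlib` supports.

```python
import hashlib
import os

# Pointer text copied from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1d7a6861589e927e0fbdaa5849ca022258fe2b58a20cc7bfb8fb598ccf936169
size 53845290
"""


def parse_lfs_pointer(text):
    """Split 'key value' lines into the algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer):
    """Return True if the file at path has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    h = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so the whole file is never held in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]
```

A call such as `matches_pointer("vocoder.pt", parse_lfs_pointer(POINTER))` returning `False` usually means the pointer file itself was downloaded instead of the LFS object.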