BUT-FIT/diarizen-wavlm-large-s80-md

@@ -1,5 +1,5 @@
 ---
-license: mit
+license: cc-by-nc-4.0
 library_name: transformers
 pipeline_tag: voice-activity-detection
 tags:
@@ -16,7 +16,7 @@ tags:
 ## Overview
 This hub features the pre-trained model by [DiariZen](https://github.com/BUTSpeechFIT/DiariZen). The EEND component is built upon WavLM Large and Conformer layers. The model was trained on far-field, single-channel audio from a diverse set of public datasets, including AMI, AISHELL-4, AliMeeting, NOTSOFAR-1, MSDWild, DIHARD3, RAMC, and VoxConverse.
-Then structured pruning at 80% sparsity is applied. After pruning, the number of parameters in WavLM Large is reduced from **316.6M to 63.3M**, and the computational cost (MACs) decreases from **17.8G to 3.8G** per second.
+Then structured pruning at 80% sparsity is applied. After pruning, the number of parameters in WavLM Large is reduced from **316.6M to 63.3M**, and the computational cost (MACs) decreases from **17.8G to 3.8G** per second. When loading this model, please ensure **non-commercial** usage, in accordance with the CC BY-NC 4.0 license.
@@ -79,3 +79,7 @@ If you found this work helpful, please consider citing:
 }
 ```
+## License
+- **Source code**: MIT (see the [project’s GitHub repository](https://github.com/BUTSpeechFIT/DiariZen)).
+- **Model weights**: CC BY-NC 4.0 (non-commercial).
+- Rationale: some training datasets are research-only or non-commercial, so the released weights cannot be used commercially.
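
The pruning figures quoted in the Overview hunk can be sanity-checked with a little arithmetic. The following is a minimal Python sketch, not part of DiariZen: the constant names and the `reduction_percent` helper are illustrative assumptions, and the numbers are simply copied from the README text above.

```python
# Figures copied from the Overview text of the model card.
PARAMS_BEFORE_M = 316.6  # WavLM Large parameters before pruning (millions)
PARAMS_AFTER_M = 63.3    # parameters after pruning (millions)
MACS_BEFORE_G = 17.8     # MACs per second of audio before pruning (billions)
MACS_AFTER_G = 3.8       # MACs per second of audio after pruning (billions)


def reduction_percent(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return (1.0 - after / before) * 100.0


param_reduction = reduction_percent(PARAMS_BEFORE_M, PARAMS_AFTER_M)
mac_reduction = reduction_percent(MACS_BEFORE_G, MACS_AFTER_G)

print(f"parameters: {param_reduction:.1f}% fewer")  # about 80.0%
print(f"MACs:       {mac_reduction:.1f}% fewer")    # about 78.7%
```

The parameter reduction matches the stated 80% sparsity almost exactly, while the MACs drop slightly less (about 78.7%), so the compute saving is close to, but not identical with, the parameter saving.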