---
title: README
emoji: ๐
colorFrom: purple
colorTo: blue
sdk: static
pinned: false
short_description: 'NeLF Project: Next Level Flemish Speech Processing'
---
# NeLF Project
Welcome to the official HuggingFace page of the **NeLF Project: Next Level Flemish Speech Processing**.

On this page, you can find all the state-of-the-art Flemish Dutch speech models created by researchers of KU Leuven and UGent as part of the NeLF project.

For more information about NeLF and the research, visit [**our website**](https://nelfproject.be).
## Models

We host several models, specifically tailored to the processing of Flemish Dutch speech. Further details and usage instructions for the models can be found in their respective repositories.

### Automatic Speech Recognition (ASR)
- [**NeLF_S2T_Pytorch**](https://huggingface.co/nelfproject/NeLF_S2T_Pytorch) (Recommended): The third version of our Automatic Speech Recognition and Subtitle Generation model. It is a fine-tuned version of ASR_subtitles_v2 without the Kaldi dependency (pure PyTorch), trained on refined data that leverages contextualisation techniques for pseudo-labeling.
- [**ASR_subtitles_v2**](https://huggingface.co/nelfproject/ASR_subtitles_v2): The second version of our Automatic Speech Recognition and Subtitle Generation model, with an improved architecture and trained on 14,000 hours of Flemish broadcast subtitled speech data. It can generate both an exact verbatim transcription with annotation tags and a fully formatted, cleaned-up subtitle transcription.
- [**ASR_subtitles_v2_small**](https://huggingface.co/nelfproject/ASR_subtitles_v2_small): A smaller variant of ASR_subtitles_v2 with nearly equivalent performance.
- [**ASR_subtitles_v1**](https://huggingface.co/nelfproject/ASR_subtitles_v1): The first version of the ASR and Subtitling model, trained on 1000 hours of Flemish data.
- [**ASR_verbatim_v1**](https://huggingface.co/nelfproject/ASR_verbatim_v1): The first version of the ASR and Subtitling model, trained on 1000 hours of Flemish data and converted to a verbatim-only ASR model.
- **Whisper**: A Whisper Large model fine-tuned on Flemish data can be found [here](https://huggingface.co/kul-speech-lab/whisper_large_CGN). Usage instructions can be found in the Whisper documentation.
**USAGE**: To use our ASR models and transcribe speech yourself, use [our GitHub repository](https://github.com/nelfproject/NeLF_Speech2Text_Pytorch) for NeLF_S2T_Pytorch, or [our codebase](https://github.com/nelfproject/NeLF_Transcription_ASR) for previous versions.
### Speaker Diarization and Identification

- **ecapa2_diarization**: Will be added shortly.
## Leaderboard

Word Error Rates on different test sets.
|Model Tag|Number of Parameters|Test CGN|Test Media|
|:---|:---:|:---:|:---:|
|NeLF_S2T_Pytorch|180M|6.65|8.23|
|ASR_subtitles_v2|180M|6.49|8.63|
|ASR_subtitles_v2_small|70M|6.93|9.30|
|Whisper large finetuned|1550M|7.83|10.64|
|Whisper large v3|1550M|11.54|13.76|
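For reference, the Word Error Rate reported above is the word-level edit distance (substitutions, insertions, and deletions) between a model's hypothesis and the reference transcript, divided by the number of reference words. A minimal self-contained sketch (the example sentences are hypothetical, not taken from the test sets):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical example: one substitution + one deletion over 6 reference words.
print(wer("dag allemaal welkom bij het nieuws",
          "dag alleman welkom bij nieuws"))  # 2 / 6 ≈ 0.333
```

In practice, scoring toolkits also normalise punctuation and casing before computing WER; the numbers in the table come from the evaluation setup described in the paper below.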
## Research

Details on the models, training data, and experiments can be found in the following research paper. If you use our ASR models, please consider citing it.
```bibtex
@article{poncelet2024,
  author  = {Poncelet, Jakob and Van hamme, Hugo},
  title   = {Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling},
  journal = {arXiv preprint arXiv:2502.03212},
  year    = {2024},
  url     = {https://arxiv.org/abs/2502.03212}
}
```