---
title: README
emoji: ๐
colorFrom: purple
colorTo: blue
sdk: static
pinned: false
short_description: 'NeLF Project: Next Level Flemish Speech Processing'
---
# NeLF Project
Welcome to the official HuggingFace page of the **NeLF Project: Next Level Flemish Speech Processing**.

On this page, you can find all the state-of-the-art Flemish Dutch speech models created by researchers of KU Leuven and UGent as part of the NeLF project.

For more information about NeLF and the research, visit [**our website**](https://nelfproject.be).
## Models

We host several models, specifically tailored to the processing of Flemish Dutch speech. Further details and usage instructions for the models can be found in their respective repositories.

### Automatic Speech Recognition (ASR)
- [**NeLF_S2T_Pytorch**](https://huggingface.co/nelfproject/NeLF_S2T_Pytorch) (Recommended): The third version of our Automatic Speech Recognition and Subtitle Generation model. It is a fine-tuned version of ASR_subtitles_v2 without the Kaldi dependency (pure PyTorch), trained on refined data that leverages contextualisation techniques for pseudo-labeling.
- [**ASR_subtitles_v2**](https://huggingface.co/nelfproject/ASR_subtitles_v2): The second version of our Automatic Speech Recognition and Subtitle Generation model, with an improved architecture and trained on 14,000 hours of Flemish broadcast subtitled speech data. It can generate both an exact verbatim transcription with annotation tags and a fully formatted, cleaned-up subtitle transcription.
- [**ASR_subtitles_v2_small**](https://huggingface.co/nelfproject/ASR_subtitles_v2_small): A smaller variant of ASR_subtitles_v2 with nearly equivalent performance.
- [**ASR_subtitles_v1**](https://huggingface.co/nelfproject/ASR_subtitles_v1): The first version of the ASR and Subtitling model, trained on 1000 hours of Flemish data.
- [**ASR_verbatim_v1**](https://huggingface.co/nelfproject/ASR_verbatim_v1): The first version of the ASR and Subtitling model, trained on 1000 hours of Flemish data and converted to a verbatim-only ASR model.
- **Whisper**: A Whisper Large model fine-tuned on Flemish data can be found [here](https://huggingface.co/kul-speech-lab/whisper_large_CGN). Usage instructions can be found in the Whisper documentation.
**USAGE**: To use our ASR models and transcribe speech yourself, use [our GitHub repository](https://github.com/nelfproject/NeLF_Speech2Text_Pytorch) for NeLF_S2T_Pytorch, or [our codebase](https://github.com/nelfproject/NeLF_Transcription_ASR) for previous versions.
### Speaker Diarization and Identification

- **ecapa2_diarization**: Will be added shortly.
## Leaderboard

Word Error Rates on different test sets.
|Model Tag|Number of Parameters|Test CGN|Test Media|
|:---|:---:|:---:|:---:|
|NeLF_S2T_Pytorch|180M|6.65|8.23|
|ASR_subtitles_v2|180M|6.49|8.63|
|ASR_subtitles_v2_small|70M|6.93|9.30|
|Whisper large finetuned|1550M|7.83|10.64|
|Whisper large v3|1550M|11.54|13.76|
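For reference, the Word Error Rate reported above is the word-level edit distance (substitutions, insertions, and deletions) between a model's hypothesis and the reference transcript, divided by the number of reference words. A minimal self-contained sketch (the example sentences are hypothetical, not taken from the test sets):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

# Hypothetical example: one substitution + one deletion over 6 reference words.
print(wer("dag allemaal welkom bij het nieuws",
          "dag alleman welkom bij nieuws"))  # 2 / 6 ≈ 0.333
```

In practice, scoring toolkits also normalise punctuation and casing before computing WER; the numbers in the table come from the evaluation setup described in the paper below.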
## Research

Details on the models, training data, and experiments can be found in the following research paper. If you use our ASR models, please consider citing it.
```bibtex
@article{poncelet2024,
  author  = {Poncelet, Jakob and Van hamme, Hugo},
  title   = {Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling},
  journal = {arXiv preprint arXiv:2502.03212},
  year    = {2024},
  url     = {https://arxiv.org/abs/2502.03212}
}
```