---
title: Open Asr Leaderboard CL
emoji: 🥇
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: Open ASR Leaderboard for Chilean Spanish
sdk_version: 4.44.0
tags:
  - leaderboard
---

# Chilean Spanish ASR Leaderboard

Simple Gradio-based leaderboard displaying ASR evaluation results for Chilean Spanish models.

## Quick Start

This is a simplified version that displays results from a CSV file with two tabs (a minimal layout sketch follows the list):

  • πŸ… Chilean Spanish ASR Leaderboard: Shows model rankings based on WER and RTFx metrics
  • πŸ“ About: Detailed information about the evaluation methodology and datasets

### Running the Leaderboard

```bash
# Clone the repository
git clone https://github.com/aastroza/open_asr_leaderboard_cl.git
cd open_asr_leaderboard_cl

# Install dependencies
pip install gradio pandas

# Run the application
python app.py
```

The application will load results from `results.csv` and display them in a simple, clean interface.

## Results Format

The `results.csv` file should contain the following columns (an illustrative example follows the list):

- `model_id`: The model identifier (e.g., "openai/whisper-large-v3")
- `wer`: Word Error Rate (lower is better)
- `rtfx`: Inverse real-time factor (higher is better)
- Additional metadata columns (`dataset`, `num_samples`, etc.)
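
For illustration, a tiny `results.csv` with these columns could be read like this (the values and dataset name below are invented, not real evaluation numbers):

```python
# Illustrative only: column names follow the format above, values are made up.
import io

import pandas as pd

example_csv = """model_id,wer,rtfx,dataset,num_samples
openai/whisper-large-v3,0.12,30.5,common_voice_es_cl,500
openai/whisper-small,0.24,55.0,common_voice_es_cl,500
"""

df = pd.read_csv(io.StringIO(example_csv))
print(df[["model_id", "wer", "rtfx"]])
```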

## Configuration

- Title and Content: Edit `src/about.py` to modify the title, introduction text, and about section
- Styling: Customize appearance in `src/display/css_html_js.py`
- Data Processing: Modify the `load_results()` function in `app.py` to change how results are aggregated and displayed (a sketch of such a function follows this list)
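
A minimal sketch of a `load_results()`-style helper, assuming `results.csv` holds one row per (model, dataset) pair; the real function in `app.py` may aggregate differently:

```python
# Sketch only: assumes results.csv has one row per (model_id, dataset) pair.
import pandas as pd


def load_results(path: str = "results.csv") -> pd.DataFrame:
    df = pd.read_csv(path)
    # Average WER and RTFx per model across datasets, then rank by ascending WER.
    leaderboard = (
        df.groupby("model_id", as_index=False)[["wer", "rtfx"]]
        .mean()
        .sort_values("wer")
        .reset_index(drop=True)
    )
    return leaderboard
```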

## About the Evaluation

This leaderboard evaluates ASR models on Chilean Spanish using three datasets:

- Common Voice (Chilean Spanish subset)
- Google Chilean Spanish
- Datarisas

Models are ranked by average Word Error Rate (WER) across all datasets, with inverse real-time factor (RTFx) as a secondary metric for inference speed.
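
Written out, and assuming an unweighted mean over datasets (per-sample pooling would differ slightly), the ranking metric for model $m$ is:

$$\overline{\mathrm{WER}}_m = \frac{1}{D}\sum_{d=1}^{D}\mathrm{WER}_{m,d}, \qquad D = 3.$$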

## Models Evaluated

- openai/whisper-large-v3
- openai/whisper-large-v3-turbo
- openai/whisper-small
- rcastrovexler/whisper-small-es-cl (Chilean Spanish fine-tuned)
- nvidia/canary-1b-v2
- nvidia/parakeet-tdt-0.6b-v3
- microsoft/Phi-4-multimodal-instruct
- mistralai/Voxtral-Mini-3B-2507
- elevenlabs/scribe_v1

For detailed methodology and the complete evaluation framework, see the Modal-based evaluation code in the original repository.

## Citation

```bibtex
@misc{astroza2024chilean,
  title={Chilean Spanish ASR Test Dataset},
  author={Alonso Astroza},
  year={2025},
  howpublished={\url{https://huggingface.co/datasets/astroza/es-cl-asr-test-only}}
}
```