---
title: MUSEval Leaderboard
emoji: 🏆
colorFrom: pink
colorTo: indigo
sdk: gradio
sdk_version: 5.49.0
app_file: app.py
pinned: false
short_description: Leaderboard for MUSEval Dataset
license: mit
---

# 🏆 MUSEval Leaderboard

Welcome to the MUSEval (Multivariate Time Series Dataset) Leaderboard! This leaderboard tracks and compares the performance of different models on multivariate time series forecasting tasks.

## 📊 About MUSEval

MUSEval is a comprehensive multivariate time series dataset designed for forecasting tasks. It contains multiple time series with varied characteristics and complexities, making it a useful benchmark for evaluating time series forecasting models.

## 🎯 Evaluation Metrics

The leaderboard uses the following metrics to evaluate model performance:

- **MAE (Mean Absolute Error)**: Average absolute difference between predicted and actual values
- **Uni-MAE (Univariate MAE)**: Univariate mean absolute error, for comparison
- **RMSE (Root Mean Square Error)**: Square root of the average squared differences
- **MAPE (Mean Absolute Percentage Error)**: Average percentage error
- **R² (Coefficient of Determination)**: Proportion of variance explained by the model
- **SMAPE (Symmetric Mean Absolute Percentage Error)**: Symmetric version of MAPE
- **Uni-Multi (Univariate-Multivariate)**: Comparison metric between univariate and multivariate approaches

## 🚀 How to Use

### Viewing the Leaderboard

1. Navigate to the "📈 Overall Leaderboard" tab to see current model rankings
2. Use the domain and dataset filters to focus on specific categories
3. Models are ranked by MAE (lower is better)
4. Click "🔄 Refresh Leaderboard" to update the display

### Category-Based Evaluation

- **🏢 By Domain**: View performance across different domains (finance, energy, healthcare, general)
- **📊 By Dataset**: Compare models on specific dataset variants
- Each category shows specialized rankings and metrics

### Submitting a Model

1. Go to the "📝 Submit Model" tab
2.
Fill in your model details and all performance metrics
3. Select the appropriate domain and dataset
4. Provide links to papers and code (optional but recommended)
5. Click "🚀 Submit Model" to add your results

### Dataset Information

- Visit the "📋 Dataset Info" tab for detailed information about MUSEval
- Check the submission guidelines and evaluation protocols

### Statistics

- The "📊 Statistics" tab provides summary statistics about all submissions

## 🔧 Technical Details

### Requirements

- Python 3.8+
- Gradio 5.49.0
- Pandas 1.5.0+
- NumPy 1.21.0+

### Local Development

```bash
pip install -r requirements.txt
python app.py
```

## 📝 Submission Guidelines

1. **Accuracy**: Report performance metrics exactly as measured in your evaluation
2. **Reproducibility**: Include links to code and papers when available
3. **Ethics**: Follow ethical AI practices and cite relevant work
4. **Validation**: Ensure your results are reproducible

## 🤝 Contributing

Contributions to improve the leaderboard are welcome! Please:

- Report issues or bugs
- Suggest new features
- Improve documentation
- Add new evaluation metrics where relevant

## 📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

## 📞 Contact

For questions about the dataset, leaderboard, or submissions, please contact the maintainers.

---

**Note**: This leaderboard is for research purposes. Please evaluate and address potential concerns related to accuracy, safety, and fairness before deploying models in production.
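## 🧮 Computing the Metrics Locally

This repository does not ship the official evaluation script, so the following is only an illustrative sketch of the standard definitions of the metrics used by the leaderboard (the function name `forecast_metrics` is hypothetical, not part of the app):

```python
import numpy as np


def forecast_metrics(y_true, y_pred):
    """Standard forecasting error metrics (illustrative, not the official script)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # MAPE is undefined where y_true == 0; mask those points here for simplicity
    nonzero = y_true != 0
    mape = np.mean(np.abs(err[nonzero] / y_true[nonzero])) * 100
    smape = np.mean(2 * np.abs(err) / (np.abs(y_true) + np.abs(y_pred))) * 100
    # R²: 1 minus the ratio of residual to total sum of squares
    r2 = 1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "SMAPE": smape, "R2": r2}
```

For a multivariate series, one common convention is to flatten all variables into a single array (or average per-variable scores) before applying these formulas; check the dataset's evaluation protocol for the exact aggregation used here.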