---
title: TraceMind AI
emoji: 🔍
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
short_description: AI agent evaluation with MCP-powered intelligence
pinned: false
tags:
- mcp-in-action-track-enterprise
- agent-evaluation
- mcp-client
- leaderboard
- gradio
---

# 🔍 TraceMind-AI: Agent Evaluation Platform with MCP-Powered Intelligence

## Overview

TraceMind-AI is a comprehensive platform for evaluating AI agent performance across different models, providers, and configurations. It provides real-time insights, cost analysis, and detailed trace visualization powered by the Model Context Protocol (MCP).

## Features

- **📊 Real-time Leaderboard**: Live evaluation data from HuggingFace datasets
- **🤖 MCP Integration**: AI-powered analysis using remote MCP servers
- **💰 Cost Estimation**: Calculate evaluation costs for different models and configurations
- **🔍 Trace Visualization**: Detailed OpenTelemetry trace analysis
- **📈 Performance Metrics**: GPU utilization, CO2 emissions, and token usage tracking

## MCP Integration

TraceMind-AI demonstrates enterprise MCP client usage by connecting to [TraceMind-mcp-server](https://huggingface.co/spaces/kshitijthakkar/TraceMind-mcp-server) via the Model Context Protocol.

**MCP Tools Used:**

- `analyze_leaderboard` - AI-generated insights about evaluation trends
- `estimate_cost` - Cost estimation with hardware recommendations
- `debug_trace` - Interactive trace analysis and debugging
- `compare_runs` - Side-by-side run comparison
- `analyze_results` - Test case analysis with optimization recommendations

## Quick Start

### Prerequisites

- Python 3.10+
- HuggingFace account (for authentication)
- HuggingFace token (optional, for private datasets)

### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/Mandark-droid/TraceMind-AI.git
   cd TraceMind-AI
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Configure environment:

   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```
4. Run the application:

   ```bash
   python app.py
   ```

   Visit http://localhost:7860

## Configuration

Create a `.env` file with the following variables:

```env
# HuggingFace Configuration
HF_TOKEN=your_token_here

# MCP Server URL
MCP_SERVER_URL=https://kshitijthakkar-tracemind-mcp-server.hf.space/gradio_api/mcp/

# Dataset Configuration
LEADERBOARD_REPO=kshitijthakkar/smoltrace-leaderboard

# Development Mode (optional - disables OAuth for local testing)
DISABLE_OAUTH=true
```

## Data Sources

TraceMind-AI loads evaluation data from HuggingFace datasets:

- **Leaderboard**: Aggregate statistics for all evaluation runs
- **Results**: Individual test case results
- **Traces**: OpenTelemetry trace data
- **Metrics**: GPU metrics and performance data

## Architecture

### Project Structure

```
TraceMind-AI/
├── app.py                # Main Gradio application
├── data_loader.py        # HuggingFace dataset integration
├── mcp_client/           # MCP client implementation
│   ├── client.py         # Async MCP client
│   └── sync_wrapper.py   # Synchronous wrapper
├── utils/                # Utilities
│   ├── auth.py           # HuggingFace OAuth
│   └── navigation.py     # Screen navigation
├── screens/              # UI screens
├── components/           # Reusable components
└── styles/               # Custom CSS
```

### MCP Client Integration

TraceMind-AI uses the MCP Python SDK to connect to remote MCP servers:

```python
from mcp_client.sync_wrapper import get_sync_mcp_client

# Initialize MCP client
mcp_client = get_sync_mcp_client()
mcp_client.initialize()

# Call MCP tools
insights = mcp_client.analyze_leaderboard(
    metric_focus="overall",
    time_range="last_week",
    top_n=5
)
```

## Usage

### Viewing the Leaderboard

1. Log in with your HuggingFace account
2. Navigate to the "Leaderboard" tab
3. Click "Load Leaderboard" to fetch the latest data
4. View AI-powered insights generated by the MCP server

### Estimating Costs

1. Navigate to the "Cost Estimator" tab
2. Enter the model name (e.g., `openai/gpt-4`)
3. Select the agent type and number of tests
4. Click "Estimate Cost" for AI-powered analysis

### Viewing Trace Details

1. Select an evaluation run from the leaderboard
2. Click on a specific test case
3. View the detailed OpenTelemetry trace visualization
4. Ask questions about the trace using MCP-powered analysis

## Technology Stack

- **UI Framework**: Gradio 5.49.1
- **MCP**: Model Context Protocol client integration via Gradio
- **Data**: HuggingFace Datasets API
- **Authentication**: HuggingFace OAuth
- **AI**: Google Gemini 2.5 Flash (via MCP server)

## Development

### Running Locally

```bash
# Install dependencies
pip install -r requirements.txt

# Set development mode (optional - disables OAuth)
export DISABLE_OAUTH=true

# Run the app
python app.py
```
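### Smoke-Testing the MCP Connection (optional)

As a quick local check that the remote MCP server is reachable, the sketch below reuses the documented sync wrapper and the `analyze_leaderboard` call from the Architecture section. The added `estimate_cost` keyword arguments are illustrative guesses based on the Cost Estimator UI fields, not a confirmed signature.

```python
# smoke_test_mcp.py - minimal check that the remote MCP server responds.
# The analyze_leaderboard call mirrors the documented example above; the
# estimate_cost keyword arguments are illustrative, not a confirmed signature.
from mcp_client.sync_wrapper import get_sync_mcp_client


def main():
    client = get_sync_mcp_client()
    client.initialize()  # connects to MCP_SERVER_URL from your .env

    # Documented call: AI-generated insights about recent evaluation runs
    insights = client.analyze_leaderboard(
        metric_focus="overall",
        time_range="last_week",
        top_n=5,
    )
    print("Leaderboard insights:\n", insights)

    # Illustrative call: field names mirror the Cost Estimator inputs
    estimate = client.estimate_cost(
        model="openai/gpt-4",
        agent_type="tool-calling",
        num_tests=100,
    )
    print("Cost estimate:\n", estimate)


if __name__ == "__main__":
    main()
```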
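### Calling the MCP Server Directly

For debugging outside the app, you can also talk to the server with the MCP Python SDK itself. This is a sketch, not the repo's `mcp_client/client.py`: it assumes the Gradio MCP server exposes an SSE endpoint at `.../gradio_api/mcp/sse` (the `.env` above points at `.../gradio_api/mcp/`), and it passes the documented `analyze_leaderboard` parameters.

```python
# mcp_direct.py - sketch of a raw MCP call, bypassing the app's sync wrapper.
# Assumption: the Gradio MCP server exposes SSE at .../gradio_api/mcp/sse.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

SERVER_URL = "https://kshitijthakkar-tracemind-mcp-server.hf.space/gradio_api/mcp/sse"


async def main():
    async with sse_client(SERVER_URL) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()

            # List the tools the server advertises (should include the five above)
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # Call one tool with the documented parameters
            result = await session.call_tool(
                "analyze_leaderboard",
                arguments={"metric_focus": "overall", "time_range": "last_week", "top_n": 5},
            )
            for item in result.content:
                if getattr(item, "text", None):
                    print(item.text)


if __name__ == "__main__":
    asyncio.run(main())
```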
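### Inspecting the Leaderboard Dataset

To look at the raw evaluation data the leaderboard screen consumes (see Data Sources above), you can load the dataset directly with the `datasets` library. This is a minimal sketch; the `train` split name is an assumption about how the leaderboard repo is published.

```python
# inspect_leaderboard.py - peek at the raw leaderboard data outside the app.
# Assumes the repo exposes a default "train" split; adjust if it does not.
from datasets import load_dataset

ds = load_dataset("kshitijthakkar/smoltrace-leaderboard", split="train")
df = ds.to_pandas()

print(df.columns.tolist())  # available fields (metrics, costs, model names, ...)
print(df.head())            # a few evaluation runs
```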
### Running on HuggingFace Spaces

This application is configured for deployment on HuggingFace Spaces using the Gradio SDK. The `app.py` file serves as the entry point.

## Documentation

For detailed implementation documentation, see:

- [Data Loader API](data_loader.py) - Dataset loading and caching
- [MCP Client API](mcp_client/client.py) - MCP protocol integration
- [Authentication](utils/auth.py) - HuggingFace OAuth integration

## Demo Video

[Link to demo video showing the application in action]

## Social Media

[Link to social media post about this project]

## License

MIT License - see the LICENSE file for details.

## Contributing

Contributions are welcome! Please open an issue or submit a pull request.

## Acknowledgments

- **MCP Team** - For the Model Context Protocol specification
- **Gradio Team** - For Gradio and its built-in MCP integration
- **HuggingFace** - For Spaces hosting and dataset infrastructure
- **Google** - For Gemini API access

## Links

- **Live Demo**: https://huggingface.co/spaces/kshitijthakkar/TraceMind-AI
- **MCP Server**: https://huggingface.co/spaces/kshitijthakkar/TraceMind-mcp-server
- **GitHub**: https://github.com/Mandark-droid/TraceMind-AI
- **MCP Specification**: https://modelcontextprotocol.io

---

**MCP's 1st Birthday Hackathon Submission**

*Track: MCP in Action - Enterprise*