---
title: TraceMind MCP Server
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
license: agpl-3.0
short_description: MCP server for agent evaluation with Gemini 2.5 Flash
tags:
- building-mcp-track-enterprise
- mcp
- gradio
- gemini
- agent-evaluation
- leaderboard
---
# TraceMind MCP Server
<p align="center">
<img src="https://raw.githubusercontent.com/Mandark-droid/TraceMind-mcp-server/assets/Logo.png" alt="TraceMind MCP Server Logo" width="200"/>
</p>
**AI-Powered Analysis Tools for Agent Evaluation**
[Model Context Protocol](https://github.com/modelcontextprotocol) | [MCP Hackathon](https://github.com/modelcontextprotocol/hackathon) | [Google AI](https://ai.google.dev/)
> **🎯 Track 1 Submission**: Building MCP (Enterprise)
> **📅 MCP's 1st Birthday Hackathon**: November 14-30, 2025
---
## Why This MCP Server?
**Problem**: Agent evaluation generates mountains of data (leaderboards, traces, metrics), but developers struggle to extract actionable insights.
**Solution**: This MCP server provides **11 AI-powered tools** that transform raw evaluation data into clear answers:
- *"Which model is best for my use case?"*
- *"Why did this agent execution fail?"*
- *"How much will this evaluation cost?"*
**Powered by Google Gemini 2.5 Flash** for intelligent, context-aware analysis of agent performance data.
---
## Quick Links
- **Live Demo**: [TraceMind-mcp-server Space](https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind-mcp-server)
- **⚡ Auto-Config**: Add `MCP-1st-Birthday/TraceMind-mcp-server` at https://huggingface.co/settings/mcp
- **Full Docs**: See [DOCUMENTATION.md](DOCUMENTATION.md) for complete technical reference
- **🎬 Quick Demo (5 min)**: [Watch on Loom](https://www.loom.com/share/d4d0003f06fa4327b46ba5c081bdf835)
- **📺 Full Demo (20 min)**: [Watch on Loom](https://www.loom.com/share/de559bb0aef749559c79117b7f951250)
### Social Media
Read the announcement and join the discussion:
- **Blog Post**: [Building TraceMind Ecosystem](https://huggingface.co/blog/kshitijthakkar/tracemind-ecosystem) - Complete technical deep-dive into the TraceVerse ecosystem
- **Twitter/X post**: [View on X](https://x.com/Mandark12921244/status/1993279134156607594?s=20)
- **LinkedIn post**: [View on LinkedIn](https://www.linkedin.com/posts/kshitij-thakkar-2061b924_mcp-modelcontextprotocol-aiagents-activity-7399052013524647936-wgkA)
- **HuggingFace Discord announcement**: [Read on Discord](https://discord.com/channels/879548962464493619/1439001549492719726/1442838638307180656)
**MCP Endpoints**:
- SSE (Recommended): `https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse`
- Streamable HTTP: `https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/`
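
Both endpoint URLs appear to follow one pattern: the Space ID `MCP-1st-Birthday/TraceMind-mcp-server` is lowercased, its `/` becomes `-`, and the result is used as the `hf.space` subdomain. The Python sketch below encodes that mapping. It is inferred only from the two URLs listed above, the helper name `mcp_endpoints` is hypothetical, and Space IDs containing other characters may be mapped differently.

```python
def mcp_endpoints(space_id: str) -> dict[str, str]:
    """Derive the Gradio MCP endpoint URLs for a Hugging Face Space ID.

    Assumption: the subdomain is the lowercased Space ID with "/" replaced
    by "-", as seen in the URLs listed in this README.
    """
    subdomain = space_id.lower().replace("/", "-")
    base = f"https://{subdomain}.hf.space/gradio_api/mcp/"
    return {"sse": base + "sse", "streamable-http": base}


endpoints = mcp_endpoints("MCP-1st-Birthday/TraceMind-mcp-server")
print(endpoints["sse"])
print(endpoints["streamable-http"])
```

If you duplicate the Space under your own account, passing your own Space ID should give the matching URLs, provided the same subdomain rule applies.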
---
## The TraceMind Ecosystem
This MCP server is part of a **complete agent evaluation platform** built from four interconnected projects:
<p align="center">
<img src="https://raw.githubusercontent.com/Mandark-droid/TraceMind-AI/assets/TraceVerse_Logo.png" alt="TraceVerse Ecosystem" width="400"/>
</p>
```
   TraceVerde                      SMOLTRACE
(genai_otel_instrument)       (Evaluation Engine)
        ↓                              ↓
   Instruments                     Evaluates
    LLM calls                       agents
        ↓                              ↓
        └───────────┬──────────────────┘
                    ↓
           Generates Datasets
     (leaderboard, traces, metrics)
                    ↓
        ┌───────────┴──────────────────┐
        ↓                              ↓
🛠️ TraceMind MCP Server       🧠 TraceMind-AI
(This Project - Track 1)      (UI Platform - Track 2)
Analyzes with AI              Visualizes & Interacts
```
### The Foundation
**TraceVerde** - Zero-code OpenTelemetry instrumentation for LLM frameworks
→ [GitHub](https://github.com/Mandark-droid/genai_otel_instrument) | [PyPI](https://pypi.org/project/genai-otel-instrument)
**SMOLTRACE** - Lightweight evaluation engine that generates structured datasets
→ [GitHub](https://github.com/Mandark-droid/SMOLTRACE) | [PyPI](https://pypi.org/project/smoltrace/)
### The Platform
**🛠️ TraceMind MCP Server** (This Project) - Provides MCP tools for AI-powered analysis
→ **Track 1**: Building MCP (Enterprise)
→ [Live Demo](https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind-mcp-server) | [GitHub](https://github.com/Mandark-droid/TraceMind-mcp-server)
**🧠 TraceMind-AI** - Gradio UI that consumes MCP tools for interactive evaluation
→ [Live Demo](https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind) | [GitHub](https://github.com/Mandark-droid/TraceMind-AI)
→ **Track 2**: MCP in Action (Enterprise)
---
## Why This Matters for Hugging Face
This ecosystem is built **around** Hugging Face, not just "using it":
- Every SMOLTRACE evaluation creates **4 structured `datasets` on the Hub** (leaderboard, results, traces, metrics)
- TraceMind MCP Server and TraceMind-AI run as **Hugging Face Spaces**, using **Gradio's MCP integration**
- The stack is designed for **`smolagents`**: agents are evaluated, traced, and analyzed using HF's own agent framework
- Evaluations can be executed via **HF Jobs**, turning evaluations into real compute usage, not just local scripts
So TraceMind isn't just another MCP server demo.
**It's an opinionated blueprint for:**
> **"How Hugging Face models + Datasets + Spaces + Jobs + smolagents + MCP can work together as a complete agent evaluation and observability platform."**
---
## What's Included
### 11 AI-Powered Tools
**Core Analysis** (AI-Powered by Gemini 2.5 Flash):
1. **analyze_leaderboard** - Generate insights from evaluation data
2. **debug_trace** - Debug agent execution traces with AI assistance
3. **💰 estimate_cost** - Predict costs before running evaluations
4. **⚖️ compare_runs** - Compare two evaluation runs with AI analysis
5. **analyze_results** - Analyze detailed test results with optimization recommendations
**Token-Optimized Tools**:
6. **get_top_performers** - Get top N models (90% token reduction vs. full dataset)
7. **get_leaderboard_summary** - High-level statistics (99% token reduction)
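
To see why returning only the top N rows shrinks the payload so much, here is a rough Python sketch. It is not the server's implementation: the rows are synthetic, the field names (`model`, `success_rate`, `total_cost_usd`) are hypothetical, and serialized character count is used only as a crude stand-in for token count.

```python
import json


def top_performers(rows: list[dict], n: int, metric: str = "success_rate") -> list[dict]:
    """Return the n rows with the highest value of `metric`."""
    return sorted(rows, key=lambda row: row[metric], reverse=True)[:n]


def size_reduction(full: list[dict], subset: list[dict]) -> float:
    """Fraction of serialized characters saved by sending `subset` instead of `full`."""
    return 1 - len(json.dumps(subset)) / len(json.dumps(full))


# Synthetic leaderboard: 100 made-up rows, not real evaluation results.
rows = [
    {
        "model": f"model-{i:03d}",
        "success_rate": round(i / 100, 2),
        "total_cost_usd": round(0.01 * i, 2),
    }
    for i in range(100)
]

top5 = top_performers(rows, n=5)
print(f"{size_reduction(rows, top5):.0%} fewer characters than the full table")
```

The saving scales with how small N is relative to the full leaderboard, so the actual reduction on a real dataset depends on its row count and column widths.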
**Data Management**:
8. **📦 get_dataset** - Load SMOLTRACE datasets as JSON
9. **🧪 generate_synthetic_dataset** - Create domain-specific test datasets with AI (up to 100 tasks)
10. **📤 push_dataset_to_hub** - Upload datasets to HuggingFace
11. **generate_prompt_template** - Generate customized smolagents prompt templates
### 3 Data Resources
Direct JSON access without AI analysis:
- **leaderboard://{repo}** - Raw evaluation results
- **trace://{trace_id}/{repo}** - OpenTelemetry spans
- **cost://model/{model}** - Pricing information
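
The three URI templates can be filled in with plain string formatting. The Python sketch below is illustrative only: the helper names are hypothetical, `trace_abc123` and `example-model` are made-up values, and it assumes the repo is embedded as-is (including its `/`) with no URL encoding, which this README does not specify.

```python
def leaderboard_uri(repo: str) -> str:
    """Fill in the leaderboard://{repo} template."""
    return f"leaderboard://{repo}"


def trace_uri(trace_id: str, repo: str) -> str:
    """Fill in the trace://{trace_id}/{repo} template."""
    return f"trace://{trace_id}/{repo}"


def cost_uri(model: str) -> str:
    """Fill in the cost://model/{model} template."""
    return f"cost://model/{model}"


def parse_trace_uri(uri: str) -> tuple[str, str]:
    """Split a trace URI into (trace_id, repo); the repo may itself contain "/"."""
    prefix = "trace://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a trace URI: {uri!r}")
    trace_id, _, repo = uri[len(prefix):].partition("/")
    if not trace_id or not repo:
        raise ValueError(f"trace URI needs both a trace id and a repo: {uri!r}")
    return trace_id, repo


repo = "kshitijthakkar/smoltrace-leaderboard"  # repo name used elsewhere in this README
print(leaderboard_uri(repo))
print(trace_uri("trace_abc123", repo))  # hypothetical trace id
print(cost_uri("example-model"))        # hypothetical model name
```

Because the trace ID comes first, splitting on the first `/` after the scheme is enough to recover both parts even when the repo has an `owner/name` form.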
### 3 Prompt Templates
Standardized templates for consistent analysis:
- **analysis_prompt** - Different analysis types (leaderboard, cost, performance)
- **debug_prompt** - Debugging scenarios
- **optimization_prompt** - Optimization goals
**Total: 17 MCP Components** (11 + 3 + 3)
---
## Quick Start
### 1. Connect to the Live Server
**Easiest Method** (Recommended):
1. Visit https://huggingface.co/settings/mcp (while logged in)
2. Add Space: `MCP-1st-Birthday/TraceMind-mcp-server`
3. Select your MCP client (Claude Desktop, VSCode, Cursor, etc.)
4. Copy the auto-generated config and paste into your client
**Manual Configuration** (Advanced):
For Claude Desktop (`claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "tracemind": {
      "url": "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse",
      "transport": "sse"
    }
  }
}
```
For VSCode/Cursor (`settings.json`):
```json
{
  "mcp.servers": {
    "tracemind": {
      "url": "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/",
      "transport": "streamable-http"
    }
  }
}
```
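
If your client config already lists other MCP servers, pasting the snippet over the file would drop them. A minimal Python sketch of merging the `tracemind` entry into an existing Claude Desktop style config follows; the helper name `add_tracemind_server` and the sample `other-server` entry are hypothetical, and reading or writing the real config file is left out because its location varies by platform.

```python
import json

SSE_URL = "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse"


def add_tracemind_server(config: dict) -> dict:
    """Return a copy of the config with the "tracemind" entry added.

    Other entries under "mcpServers" are kept unchanged, and the input
    dictionary is not modified.
    """
    servers = dict(config.get("mcpServers", {}))
    servers["tracemind"] = {"url": SSE_URL, "transport": "sse"}
    return {**config, "mcpServers": servers}


# Hypothetical existing config with one unrelated server already present.
existing = {"mcpServers": {"other-server": {"command": "example-command"}}}
updated = add_tracemind_server(existing)
print(json.dumps(updated, indent=2))
```

The same approach should work for the VSCode/Cursor snippet by swapping the top-level key, URL, and transport for the values shown above.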
### 2. Try It Out
Open your MCP client and try:
```
"Analyze the leaderboard at kshitijthakkar/smoltrace-leaderboard and show me the top 5 models"
```
You should see AI-powered insights generated by Gemini 2.5 Flash!
### 3. Using Your Own API Keys (Recommended)
To avoid rate limits during evaluation:
1. Visit the [MCP Server Space](https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind-mcp-server)
2. Go to the **⚙️ Settings** tab
3. Enter your **Gemini API Key** and **HuggingFace Token**
4. Click **"Save & Override Keys"**
**Get Free API Keys**:
- **Gemini**: https://ai.google.dev/ (1,500 requests/day free)
- **HuggingFace**: https://huggingface.co/settings/tokens (unlimited for public datasets)
---
## For Hackathon Judges
### ✅ Track 1 Compliance
- **Complete MCP Implementation**: 11 Tools + 3 Resources + 3 Prompts (17 total)
- **MCP Standard Compliant**: Built with Gradio's native `@gr.mcp.*` decorators
- **Production-Ready**: Deployed to HuggingFace Spaces with SSE transport
- **Enterprise Focus**: Cost optimization, debugging, decision support
- **Google Gemini Powered**: All AI analysis uses Gemini 2.5 Flash
- **Interactive Testing**: Beautiful Gradio UI for testing all components
### 🎯 Key Innovations
1. **Token Optimization**: `get_top_performers` and `get_leaderboard_summary` reduce token usage by 90-99%
2. **AI-Powered Synthetic Data**: Generate domain-specific test datasets + matching prompt templates
3. **Complete Ecosystem**: Part of a 4-project platform: TraceVerde → SMOLTRACE → MCP Server → TraceMind-AI
4. **Real Data Integration**: Works with live HuggingFace datasets from SMOLTRACE evaluations
5. **Test Results Analysis**: Deep-dive into individual test cases with `analyze_results` tool
### 📹 Demo Materials
1. **🎬 Quick Demo (5 min)**: [Watch on Loom](https://www.loom.com/share/d4d0003f06fa4327b46ba5c081bdf835)
2. **📺 Full Demo (20 min)**: [Watch on Loom](https://www.loom.com/share/de559bb0aef749559c79117b7f951250)
- **Twitter/X post**: [View on X](https://x.com/Mandark12921244/status/1993279134156607594?s=20)
- **LinkedIn post**: [View on LinkedIn](https://www.linkedin.com/posts/kshitij-thakkar-2061b924_mcp-modelcontextprotocol-aiagents-activity-7399052013524647936-wgkA)
- **HuggingFace Discord announcement**: [Read on Discord](https://discord.com/channels/879548962464493619/1439001549492719726/1442838638307180656)
---
## Documentation
**For quick evaluation**:
- Read this README for overview
- Visit the [Live Demo](https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind-mcp-server) to test tools
- Use the Auto-Config link to connect your MCP client
**For deep dives**:
- [DOCUMENTATION.md](DOCUMENTATION.md) - Complete API reference
  - Tool descriptions and parameters
  - Resource URIs and schemas
  - Prompt template details
  - Example use cases
- [ARCHITECTURE.md](ARCHITECTURE.md) - Technical architecture
  - Project structure
  - MCP protocol implementation
  - Gemini integration details
  - Deployment guide
---
## Technology Stack
- **AI Model**: Google Gemini 2.5 Flash (via Google AI SDK)
- **MCP Framework**: Gradio 6 with native MCP support (`@gr.mcp.*` decorators)
- **Data Source**: HuggingFace Datasets API
- **Transport**: SSE (recommended) + Streamable HTTP
- **Deployment**: HuggingFace Spaces (Docker SDK)
---
## Run Locally (Optional)
```bash
# Clone and setup
git clone https://github.com/Mandark-droid/TraceMind-mcp-server.git
cd TraceMind-mcp-server
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Configure API keys
cp .env.example .env
# Edit .env with your GEMINI_API_KEY and HF_TOKEN
# Run the server
python app.py
```
Visit http://localhost:7860 to test the tools via Gradio UI.
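
Before starting the server, it can help to confirm that both keys are visible to the process. The Python sketch below checks only the process environment for `GEMINI_API_KEY` and `HF_TOKEN`. Whether `app.py` loads `.env` by itself is not stated here, so treat this as an optional pre-flight check; the helper name `missing_keys` is hypothetical.

```python
import os

REQUIRED_KEYS = ("GEMINI_API_KEY", "HF_TOKEN")


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the required variable names that are absent or blank in `env`."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


missing = missing_keys(dict(os.environ))
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required keys are set.")
```

If the variables live only in `.env`, export them into your shell first, or the check will report them as missing.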
---
## Related Projects
**🧠 TraceMind-AI** (Track 2 - MCP in Action):
- Live Demo: https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind
- Consumes this MCP server for AI-powered agent evaluation UI
- Features autonomous agent chat, trace visualization, job submission
**Foundation Libraries**:
- TraceVerde: https://github.com/Mandark-droid/genai_otel_instrument
- SMOLTRACE: https://github.com/Mandark-droid/SMOLTRACE
---
## Credits
**Built for**: MCP's 1st Birthday Hackathon (Nov 14-30, 2025)
**Track**: Building MCP (Enterprise)
**Author**: Kshitij Thakkar
**Powered by**: Google Gemini 2.5 Flash
**Built with**: Gradio (native MCP support)
**Sponsors**: HuggingFace • Google Gemini • Modal • Anthropic • Gradio • OpenAI • Nebius • Hyperbolic • ElevenLabs • SambaNova • Blaxel
---
## License
AGPL-3.0 - See [LICENSE](LICENSE) for details
---
## Support
- 📧 GitHub Issues: [TraceMind-mcp-server/issues](https://github.com/Mandark-droid/TraceMind-mcp-server/issues)
- 💬 HF Discord: `#mcp-1st-birthday-official`
- 🏷️ Tag: `building-mcp-track-enterprise`