---
title: TraceMind MCP Server
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: true
license: agpl-3.0
short_description: MCP server for agent evaluation with Gemini 2.5 Flash
tags:
  - building-mcp-track-enterprise
  - mcp
  - gradio
  - gemini
  - agent-evaluation
  - leaderboard
---

TraceMind MCP Server

TraceMind MCP Server Logo

AI-Powered Analysis Tools for Agent Evaluation

MCP's 1st Birthday Hackathon Track 1: Building MCP Powered by Google Gemini

🎯 Track 1 Submission: Building MCP (Enterprise)
πŸ“… MCP's 1st Birthday Hackathon: November 14-30, 2025


Why This MCP Server?

Problem: Agent evaluation generates mountains of dataβ€”leaderboards, traces, metricsβ€”but developers struggle to extract actionable insights.

Solution: This MCP server provides 11 AI-powered tools that transform raw evaluation data into clear answers:

  • "Which model is best for my use case?"
  • "Why did this agent execution fail?"
  • "How much will this evaluation cost?"

Powered by Google Gemini 2.5 Flash for intelligent, context-aware analysis of agent performance data.


πŸ”— Quick Links

Social Media

Read the announcement and join the discussion:

MCP Endpoints:

  • SSE (Recommended): https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse
  • Streamable HTTP: https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/

The TraceMind Ecosystem

This MCP server is part of a complete agent evaluation platform built from four interconnected projects:

TraceVerse Ecosystem

πŸ”­ TraceVerde                    πŸ“Š SMOLTRACE
(genai_otel_instrument)         (Evaluation Engine)
        ↓                               ↓
    Instruments                    Evaluates
    LLM calls                      agents
        ↓                               ↓
        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    ↓
            Generates Datasets
        (leaderboard, traces, metrics)
                    ↓
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        ↓                               ↓
πŸ› οΈ TraceMind MCP Server         🧠 TraceMind-AI
(This Project - Track 1)        (UI Platform - Track 2)
Analyzes with AI                Visualizes & Interacts

The Foundation

πŸ”­ TraceVerde - Zero-code OpenTelemetry instrumentation for LLM frameworks β†’ GitHub | PyPI

πŸ“Š SMOLTRACE - Lightweight evaluation engine that generates structured datasets β†’ GitHub | PyPI

The Platform

πŸ› οΈ TraceMind MCP Server (This Project) - Provides MCP tools for AI-powered analysis β†’ Track 1: Building MCP (Enterprise) β†’ Live Demo | GitHub

🧠 TraceMind-AI - Gradio UI that consumes MCP tools for interactive evaluation β†’ Live Demo | GitHub β†’ Track 2: MCP in Action (Enterprise)


Why This Matters for Hugging Face

This ecosystem is built around Hugging Face, not just "using it":

  • Every SMOLTRACE evaluation creates 4 structured datasets on the Hub (leaderboard, results, traces, metrics)
  • TraceMind MCP Server and TraceMind-AI run as Hugging Face Spaces, using Gradio's MCP integration
  • The stack is designed for smolagents – agents are evaluated, traced, and analyzed using HF's own agent framework
  • Evaluations can be executed via HF Jobs, turning them into real compute usage on the Hub rather than just local scripts

So TraceMind isn't just another MCP server demo. It's an opinionated blueprint for:

"How Hugging Face models + Datasets + Spaces + Jobs + smolagents + MCP can work together as a complete agent evaluation and observability platform."


What's Included

11 AI-Powered Tools

Core Analysis (AI-Powered by Gemini 2.5 Flash):

  1. πŸ“Š analyze_leaderboard - Generate insights from evaluation data
  2. πŸ› debug_trace - Debug agent execution traces with AI assistance
  3. πŸ’° estimate_cost - Predict costs before running evaluations
  4. βš–οΈ compare_runs - Compare two evaluation runs with AI analysis
  5. πŸ“‹ analyze_results - Analyze detailed test results with optimization recommendations

Token-Optimized Tools:

  6. πŸ† get_top_performers - Get top N models (90% token reduction vs. full dataset)
  7. πŸ“ˆ get_leaderboard_summary - High-level statistics (99% token reduction)

Data Management:

  8. πŸ“¦ get_dataset - Load SMOLTRACE datasets as JSON
  9. πŸ§ͺ generate_synthetic_dataset - Create domain-specific test datasets with AI (up to 100 tasks)
  10. πŸ“€ push_dataset_to_hub - Upload datasets to HuggingFace
  11. πŸ“ generate_prompt_template - Generate customized smolagents prompt templates
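To give an intuition for the token-optimized tools above: instead of returning every leaderboard row to the client, a summary tool can collapse the dataset into a handful of aggregate fields. This is a minimal sketch of the idea, not the server's actual implementation; the field names (model, score) are assumptions:

```python
from statistics import mean

def leaderboard_summary(rows):
    """Condense full leaderboard rows into a few high-level stats.

    Illustrative sketch of a token-optimized tool such as
    get_leaderboard_summary; real field names and stats may differ.
    """
    scores = [r["score"] for r in rows]
    return {
        "models": len(rows),
        "best_model": max(rows, key=lambda r: r["score"])["model"],
        "mean_score": round(mean(scores), 2),
        "score_range": (min(scores), max(scores)),
    }

# A tiny fake leaderboard: three rows instead of hundreds.
rows = [
    {"model": "gemini-2.5-flash", "score": 0.91},
    {"model": "gpt-4o-mini", "score": 0.84},
    {"model": "llama-3-8b", "score": 0.72},
]
print(leaderboard_summary(rows))
```

Whatever the row count, the summary stays a constant handful of tokens, which is where the quoted 90-99% reduction comes from.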

3 Data Resources

Direct JSON access without AI analysis:

  • leaderboard://{repo} - Raw evaluation results
  • trace://{trace_id}/{repo} - OpenTelemetry spans
  • cost://model/{model} - Pricing information
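The resource URIs above follow a scheme://segment/... shape, so a client can dispatch on the scheme and path segments. A minimal sketch of such parsing (the server's actual routing is internal to Gradio's MCP layer; this is only illustrative):

```python
from urllib.parse import urlsplit

def parse_resource_uri(uri):
    """Split an MCP resource URI like trace://{trace_id}/{repo}
    into (scheme, segments). Illustrative only."""
    parts = urlsplit(uri)
    # urlsplit treats the text after "//" as netloc for any scheme;
    # fold it back in as the first path segment.
    segments = [parts.netloc, *filter(None, parts.path.split("/"))]
    return parts.scheme, segments

scheme, segs = parse_resource_uri("trace://abc123/user/smoltrace-traces")
print(scheme, segs)
```

For example, leaderboard://user/repo yields scheme "leaderboard" with segments ["user", "repo"], which maps cleanly onto a Hub dataset repo id.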

3 Prompt Templates

Standardized templates for consistent analysis:

  • analysis_prompt - Different analysis types (leaderboard, cost, performance)
  • debug_prompt - Debugging scenarios
  • optimization_prompt - Optimization goals
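A prompt template like the ones above is essentially a parameterized string keyed by analysis type. The sketch below shows the pattern; the template wording and parameter names are hypothetical, not the server's shipped text:

```python
# Hypothetical template texts; the real analysis_prompt templates differ.
ANALYSIS_TEMPLATES = {
    "leaderboard": (
        "Analyze the leaderboard at {repo}. "
        "Highlight the top {n} models and any outliers."
    ),
    "cost": "Estimate the cost of evaluating {repo} and suggest cheaper options.",
}

def analysis_prompt(kind, **params):
    """Render one of the standardized analysis templates."""
    return ANALYSIS_TEMPLATES[kind].format(**params)

prompt = analysis_prompt(
    "leaderboard", repo="kshitijthakkar/smoltrace-leaderboard", n=5
)
print(prompt)
```

Keeping the templates in one table is what makes the analyses "standardized": every client fills the same slots, so Gemini sees consistently shaped requests.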

Total: 17 MCP Components (11 + 3 + 3)


Quick Start

1. Connect to the Live Server

Easiest Method (Recommended):

  1. Visit https://huggingface.co/settings/mcp (while logged in)
  2. Add Space: MCP-1st-Birthday/TraceMind-mcp-server
  3. Select your MCP client (Claude Desktop, VSCode, Cursor, etc.)
  4. Copy the auto-generated config and paste into your client

Manual Configuration (Advanced):

For Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "tracemind": {
      "url": "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse",
      "transport": "sse"
    }
  }
}

For VSCode/Cursor (settings.json):

{
  "mcp.servers": {
    "tracemind": {
      "url": "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/",
      "transport": "streamable-http"
    }
  }
}

2. Try It Out

Open your MCP client and try:

"Analyze the leaderboard at kshitijthakkar/smoltrace-leaderboard and show me the top 5 models"

You should see AI-powered insights generated by Gemini 2.5 Flash!

3. Using Your Own API Keys (Recommended)

To avoid rate limits during evaluation:

  1. Visit the MCP Server Space
  2. Go to βš™οΈ Settings tab
  3. Enter your Gemini API Key and HuggingFace Token
  4. Click "Save & Override Keys"

Get Free API Keys:


For Hackathon Judges

βœ… Track 1 Compliance

  • Complete MCP Implementation: 11 Tools + 3 Resources + 3 Prompts (17 total)
  • MCP Standard Compliant: Built with Gradio's native @gr.mcp.* decorators
  • Production-Ready: Deployed to HuggingFace Spaces with SSE transport
  • Enterprise Focus: Cost optimization, debugging, decision support
  • Google Gemini Powered: All AI analysis uses Gemini 2.5 Flash
  • Interactive Testing: Beautiful Gradio UI for testing all components

🎯 Key Innovations

  1. Token Optimization: get_top_performers and get_leaderboard_summary reduce token usage by 90-99%
  2. AI-Powered Synthetic Data: Generate domain-specific test datasets + matching prompt templates
  3. Complete Ecosystem: Part of 4-project platform with TraceVerde β†’ SMOLTRACE β†’ MCP Server β†’ TraceMind-AI
  4. Real Data Integration: Works with live HuggingFace datasets from SMOLTRACE evaluations
  5. Test Results Analysis: Deep-dive into individual test cases with analyze_results tool

πŸ“Ή Demo Materials

  1. 🎬 Quick Demo (5 min): Watch on Loom
  2. πŸ“Ί Full Demo (20 min): Watch on Loom

Documentation

For quick evaluation:

  • Read this README for overview
  • Visit the Live Demo to test tools
  • Use the Auto-Config link to connect your MCP client

For deep dives:

  • DOCUMENTATION.md - Complete API reference
    • Tool descriptions and parameters
    • Resource URIs and schemas
    • Prompt template details
    • Example use cases
  • ARCHITECTURE.md - Technical architecture
    • Project structure
    • MCP protocol implementation
    • Gemini integration details
    • Deployment guide

Technology Stack

  • AI Model: Google Gemini 2.5 Flash (via Google AI SDK)
  • MCP Framework: Gradio 6 with native MCP support (@gr.mcp.* decorators)
  • Data Source: HuggingFace Datasets API
  • Transport: SSE (recommended) + Streamable HTTP
  • Deployment: HuggingFace Spaces (Docker SDK)

Run Locally (Optional)

# Clone and setup
git clone https://github.com/Mandark-droid/TraceMind-mcp-server.git
cd TraceMind-mcp-server
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Configure API keys
cp .env.example .env
# Edit .env with your GEMINI_API_KEY and HF_TOKEN

# Run the server
python app.py

Visit http://localhost:7860 to test the tools via Gradio UI.


Related Projects

🧠 TraceMind-AI (Track 2 - MCP in Action):

πŸ“Š Foundation Libraries:


Credits

Built for: MCP's 1st Birthday Hackathon (Nov 14-30, 2025)
Track: Building MCP (Enterprise)
Author: Kshitij Thakkar
Powered by: Google Gemini 2.5 Flash
Built with: Gradio (native MCP support)

Sponsors: HuggingFace β€’ Google Gemini β€’ Modal β€’ Anthropic β€’ Gradio β€’ OpenAI β€’ Nebius β€’ Hyperbolic β€’ ElevenLabs β€’ SambaNova β€’ Blaxel


License

AGPL-3.0 - See LICENSE for details


Support

  • πŸ“§ GitHub Issues: TraceMind-mcp-server/issues
  • πŸ’¬ HF Discord: #mcp-1st-birthday-officialπŸ†
  • 🏷️ Tag: building-mcp-track-enterprise