# TraceMind MCP Server - Technical Architecture

This document provides a deep technical dive into the TraceMind MCP Server architecture, implementation details, and deployment configuration.

## Table of Contents

- [System Overview](#system-overview)
- [Project Structure](#project-structure)
- [Core Components](#core-components)
- [MCP Protocol Implementation](#mcp-protocol-implementation)
- [Gemini Integration](#gemini-integration)
- [Data Flow](#data-flow)
- [Deployment Architecture](#deployment-architecture)
- [Development Workflow](#development-workflow)
- [Performance Considerations](#performance-considerations)
- [Security](#security)

---

## System Overview

TraceMind MCP Server is a Gradio-based MCP (Model Context Protocol) server that provides AI-powered analysis tools for agent evaluation data. It serves as the backend intelligence layer for the TraceMind ecosystem.

### Technology Stack

| Component | Technology | Version | Purpose |
|-----------|-----------|---------|---------|
| **Framework** | Gradio | 6.x | Native MCP support with `@gr.mcp.*` decorators |
| **AI Model** | Google Gemini | 2.5 Flash Lite | AI-powered analysis and insights |
| **Data Source** | HuggingFace Datasets | Latest | Load evaluation datasets |
| **Protocol** | MCP | 1.0 | Model Context Protocol for tool exposure |
| **Transport** | SSE | - | Server-Sent Events for real-time communication |
| **Deployment** | Docker | - | HuggingFace Spaces containerized deployment |
| **Language** | Python | 3.10+ | Core implementation |

### Architecture Diagram

```
┌──────────────────────────────────────────────────────────────┐
│ MCP Clients (External)                                        │
│  - Claude Desktop                                             │
│  - VS Code (Continue, Cursor, Cline)                         │
│  - TraceMind-AI (Track 2)                                    │
└────────────────┬─────────────────────────────────────────────┘
                 │
                 │ MCP Protocol
                 │ (SSE Transport)
                 ↓
┌──────────────────────────────────────────────────────────────┐
│ TraceMind MCP Server (HuggingFace Spaces)                    │
│                                                               │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Gradio App (app.py)                                   │   │
│  │  - MCP Server Endpoint (mcp_server=True)             │   │
│  │  - Testing UI (Gradio Blocks)                        │   │
│  │  - Configuration Management                           │   │
│  └─────────────┬────────────────────────────────────────┘   │
│                │                                              │
│                ↓                                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ MCP Tools (mcp_tools.py)                             │   │
│  │  - 11 Tools (@gr.mcp.tool())                         │   │
│  │  - 3 Resources (@gr.mcp.resource())                  │   │
│  │  - 3 Prompts (@gr.mcp.prompt())                      │   │
│  └─────────────┬────────────────────────────────────────┘   │
│                │                                              │
│                ↓                                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Gemini Client (gemini_client.py)                     │   │
│  │  - API Authentication                                 │   │
│  │  - Prompt Engineering                                 │   │
│  │  - Response Parsing                                   │   │
│  └─────────────┬────────────────────────────────────────┘   │
│                │                                              │
└────────────────┼──────────────────────────────────────────────┘
                 │
                 ↓
        ┌────────────────┐
        │ External APIs  │
        │  - Gemini API  │
        │  - HF Datasets │
        └────────────────┘
```

---

## Project Structure

```
TraceMind-mcp-server/
├── app.py                      # Main entry point, Gradio UI
├── mcp_tools.py                # MCP tool implementations (11 tools + 3 resources + 3 prompts)
├── gemini_client.py            # Google Gemini API client
├── requirements.txt            # Python dependencies
├── Dockerfile                  # Container configuration
├── .env.example                # Environment variable template
├── .gitignore                  # Git ignore rules
├── README.md                   # Project documentation
└── DOCUMENTATION.md            # Complete API reference

Total: 8 files (excluding docs)
Lines of Code: ~3,500 lines (breakdown below)
```

### File Sizes

| File | Lines | Purpose |
|------|-------|---------|
| `app.py` | ~1,200 | Gradio UI + MCP server setup + testing interface |
| `mcp_tools.py` | ~2,100 | All 17 MCP components (tools, resources, prompts) |
| `gemini_client.py` | ~200 | Gemini API integration |
| `requirements.txt` | ~20 | Dependencies |
| `Dockerfile` | ~30 | Deployment configuration |

---

## Core Components

### 1. app.py - Main Application

**Purpose**: Entry point for the HuggingFace Spaces deployment. It provides both the MCP server and a testing UI.

**Key Responsibilities**:
- Initialize Gradio app with `mcp_server=True`
- Create testing interface for all MCP tools
- Handle configuration (API keys, settings)
- Manage client connections

**Architecture**:

```python
# app.py structure
import gradio as gr
from gemini_client import GeminiClient
from mcp_tools import *  # All tool implementations

# 1. Initialize Gemini client (with fallback)
default_gemini_client = GeminiClient()

# 2. Create Gradio UI for testing
def create_gradio_ui():
    with gr.Blocks() as demo:
        # Settings tab for API key configuration
        # Tab for each MCP tool (11 tabs)
        # Tab for testing resources
        # Tab for testing prompts
        # API documentation tab
        pass  # placeholder: tab definitions omitted in this outline
    return demo

# 3. Launch with MCP server enabled
if __name__ == "__main__":
    demo = create_gradio_ui()
    demo.launch(
        mcp_server=True,  # ← Enables MCP endpoint
        share=False,
        server_name="0.0.0.0",
        server_port=7860
    )
```

**MCP Enablement**:
- `mcp_server=True` in `demo.launch()` automatically:
  - Exposes `/gradio_api/mcp/sse` endpoint
  - Discovers all `@gr.mcp.tool()`, `@gr.mcp.resource()`, `@gr.mcp.prompt()` decorated functions
  - Generates MCP tool schemas from function signatures and docstrings
  - Handles MCP protocol communication (SSE transport)

**Testing Interface**:
- **Settings Tab**: Configure Gemini API key and HF token
- **Tool Tabs** (11): One tab per tool for manual testing
  - Input fields for all parameters
  - Submit button
  - Output display (Markdown or JSON)
- **Resources Tab**: Test resource URIs
- **Prompts Tab**: Test prompt templates
- **API Documentation Tab**: Generated from tool docstrings

---

### 2. mcp_tools.py - MCP Components

**Purpose**: Implements all 17 MCP components (11 tools + 3 resources + 3 prompts).

**Structure**:

```python
# mcp_tools.py structure
import gradio as gr
from gemini_client import GeminiClient
from datasets import load_dataset

# ============ TOOLS (11) ============

@gr.mcp.tool()
async def analyze_leaderboard(...) -> str:
    """Tool docstring (becomes MCP description)"""
    # 1. Load data from HuggingFace
    # 2. Process/filter data
    # 3. Call Gemini for AI analysis
    # 4. Return formatted response
    pass

@gr.mcp.tool()
async def debug_trace(...) -> str:
    """Debug traces with AI assistance"""
    pass

# ... (9 more tools)

# ============ RESOURCES (3) ============

@gr.mcp.resource()
def get_leaderboard_data(uri: str) -> str:
    """URI: leaderboard://{repo}"""
    # Parse URI
    # Load dataset
    # Return raw JSON
    pass

@gr.mcp.resource()
def get_trace_data(uri: str) -> str:
    """URI: trace://{trace_id}/{repo}"""
    pass

@gr.mcp.resource()
def get_cost_data(uri: str) -> str:
    """URI: cost://model/{model_name}"""
    pass

# ============ PROMPTS (3) ============

@gr.mcp.prompt()
def analysis_prompt(analysis_type: str, ...) -> str:
    """Generate analysis prompt templates"""
    pass

@gr.mcp.prompt()
def debug_prompt(debug_type: str, ...) -> str:
    """Generate debug prompt templates"""
    pass

@gr.mcp.prompt()
def optimization_prompt(optimization_goal: str, ...) -> str:
    """Generate optimization prompt templates"""
    pass
```

**Design Patterns**:

1. **Decorator-Based Registration**:
   ```python
   @gr.mcp.tool()  # Gradio automatically registers as MCP tool
   async def tool_name(...) -> str:
       """Docstring becomes tool description in MCP schema"""
       pass
   ```

2. **Structured Docstrings**:
   ```python
   """
   Brief one-line description.

   Longer detailed description explaining purpose and behavior.

   Args:
       param1 (type): Description of param1
       param2 (type): Description of param2. Default: value

   Returns:
       type: Description of return value
   """
   ```
   Gradio parses this docstring to generate the MCP tool schema automatically.

3. **Error Handling**:
   ```python
   try:
       # Tool implementation
       return result
   except Exception as e:
       return f"❌ **Error**: {str(e)}"
   ```
   All errors are returned as user-friendly strings.

4. **Async/Await**:
   All tools are `async` for efficient I/O operations (API calls, dataset loading).
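
Because the error-handling pattern above is repeated in every tool, it can be factored into a decorator. The following is a minimal sketch under that assumption, not code from `mcp_tools.py`; `returns_error_string` and `risky_tool` are hypothetical names used only for illustration.

```python
import asyncio
import functools


def returns_error_string(func):
    """Wrap an async tool so exceptions become user-facing strings."""

    @functools.wraps(func)  # keep the name, docstring, and signature metadata
    async def wrapper(*args, **kwargs):
        try:
            return await func(*args, **kwargs)
        except Exception as e:
            return f"❌ **Error**: {e}"

    return wrapper


@returns_error_string
async def risky_tool(value: str) -> str:
    """Hypothetical tool: fails on empty input."""
    if not value:
        raise ValueError("value must not be empty")
    return f"Processed: {value}"


ok = asyncio.run(risky_tool("abc"))
failed = asyncio.run(risky_tool(""))
```

`functools.wraps` matters here because the schema is derived from the function's signature and docstring, so the wrapper should keep exposing the original metadata.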

---

### 3. gemini_client.py - AI Integration

**Purpose**: Handles all interactions with Google Gemini 2.5 Flash Lite API.

**Key Features**:
- API authentication
- Prompt engineering for different analysis types
- Response parsing and formatting
- Error handling and retries
- Token optimization

**Class Structure**:

```python
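import json
import os
from typing import Dict, Optional

import google.generativeai as genai


class GeminiClient:
    def __init__(
        self,
        api_key: Optional[str] = None,
        model_name: str = "gemini-2.5-flash-lite",
    ):
        """Initialize with API key (falls back to GEMINI_API_KEY) and model"""
        self.api_key = api_key or os.getenv("GEMINI_API_KEY")
        genai.configure(api_key=self.api_key)
        self.model = genai.GenerativeModel(model_name)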
        self.generation_config = {
            "temperature": 0.7,
            "top_p": 0.95,
            "max_output_tokens": 4096,  # Optimized for HF Spaces
        }
        self.request_timeout = 30  # 30s timeout

    async def analyze_with_context(
        self,
        data: Dict,
        analysis_type: str,
        specific_question: Optional[str] = None
    ) -> str:
        """
        Core analysis method used by all AI-powered tools

        Args:
            data: Data to analyze (dict or JSON)
            analysis_type: "leaderboard", "trace", "cost_estimate", "comparison", "results"
            specific_question: Optional specific question

        Returns:
            Markdown-formatted analysis
        """
        # 1. Build system prompt based on analysis_type
        system_prompt = self._get_system_prompt(analysis_type)

        # 2. Format data for context
        data_str = json.dumps(data, indent=2)

        # 3. Build user prompt
        user_prompt = f"{system_prompt}\n\nData:\n{data_str}"
        if specific_question:
            user_prompt += f"\n\nSpecific Question: {specific_question}"

        # 4. Call Gemini API
        response = await self.model.generate_content_async(
            user_prompt,
            generation_config=self.generation_config,
            request_options={"timeout": self.request_timeout}
        )

        # 5. Extract and return text
        return response.text

    def _get_system_prompt(self, analysis_type: str) -> str:
        """Get specialized system prompt for each analysis type"""
        prompts = {
            "leaderboard": """You are an expert AI agent performance analyst.
                Analyze evaluation leaderboard data and provide:
                - Top performers by key metrics
                - Trade-off analysis (cost vs accuracy)
                - Trend identification
                - Actionable recommendations
                Format: Markdown with clear sections.""",

            "trace": """You are an expert at debugging AI agent executions.
                Analyze OpenTelemetry trace data and:
                - Answer specific questions about execution
                - Identify performance bottlenecks
                - Explain reasoning chain
                - Provide optimization suggestions
                Format: Clear, concise explanation.""",

            "cost_estimate": """You are a cost optimization expert.
                Analyze cost estimation data and provide:
                - Detailed cost breakdown
                - Hardware recommendations
                - Cost optimization opportunities
                - ROI analysis
                Format: Structured breakdown with recommendations.""",

            # ... more prompts for other analysis types
        }
        return prompts.get(analysis_type, prompts["leaderboard"])
```

**Optimization Strategies**:
- **Token Reduction**: `max_output_tokens: 4096` (reduced from 8192) for faster responses
- **Request Timeout**: 30s timeout for HF Spaces compatibility
- **Temperature**: 0.7 for balanced creativity and consistency
- **Model Selection**: `gemini-2.5-flash-lite` for speed (can switch to `gemini-2.5-flash` for quality)

---

## MCP Protocol Implementation

### How Gradio's Native MCP Support Works

Gradio 6+ provides native MCP server capabilities through decorators and automatic schema generation.

**1. Tool Registration**:
```python
@gr.mcp.tool()  # ← This decorator tells Gradio to expose this as an MCP tool
async def my_tool(param1: str, param2: int = 10) -> str:
    """
    Brief description (used in MCP tool schema).

    Args:
        param1 (str): Description of param1
        param2 (int): Description of param2. Default: 10

    Returns:
        str: Description of return value
    """
    return f"Result: {param1}, {param2}"
```

**What Gradio does automatically**:
- Parses function signature to extract parameter names and types
- Parses docstring to extract descriptions
- Generates MCP tool schema:
  ```json
  {
    "name": "my_tool",
    "description": "Brief description (used in MCP tool schema).",
    "inputSchema": {
      "type": "object",
      "properties": {
        "param1": {
          "type": "string",
          "description": "Description of param1"
        },
        "param2": {
          "type": "integer",
          "default": 10,
          "description": "Description of param2. Default: 10"
        }
      },
      "required": ["param1"]
    }
  }
  ```
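
To make the signature-to-schema step concrete, here is a small sketch of how an input schema could be derived with the standard `inspect` module. It is an illustration only and not Gradio's implementation; `sketch_input_schema` and `JSON_TYPES` are hypothetical names, and docstring parsing for descriptions is left out.

```python
import inspect

# Python annotation -> JSON Schema type name (only the basic types)
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_input_schema(func) -> dict:
    """Build a JSON-Schema-like dict from a function signature."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        prop = {"type": JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default, so the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


async def my_tool(param1: str, param2: int = 10) -> str:
    return f"Result: {param1}, {param2}"


schema = sketch_input_schema(my_tool)
```

Parameters without a default end up in `required`, which mirrors the `"required": ["param1"]` entry in the schema above.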

**2. Resource Registration**:
```python
@gr.mcp.resource()
def get_resource(uri: str) -> str:
    """
    Resource description.

    Args:
        uri (str): Resource URI (e.g., "leaderboard://repo/name")

    Returns:
        str: JSON data
    """
    # Parse URI
    # Load data
    # Return JSON string
    pass
```

**3. Prompt Registration**:
```python
@gr.mcp.prompt()
def generate_prompt(prompt_type: str, context: str) -> str:
    """
    Generate reusable prompt templates.

    Args:
        prompt_type (str): Type of prompt
        context (str): Context for prompt generation

    Returns:
        str: Generated prompt text
    """
    return f"Prompt template for {prompt_type} with {context}"
```

### MCP Endpoint URLs

When `demo.launch(mcp_server=True)` is called:

**SSE Endpoint** (Primary):
```
https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse
```

**Streamable HTTP Endpoint** (Alternative):
```
https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/
```

### Client Configuration

**Claude Desktop** (`claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "tracemind": {
      "url": "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse",
      "transport": "sse"
    }
  }
}
```

**Python MCP Client**:
```python
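import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

MCP_URL = "https://mcp-1st-birthday-tracemind-mcp-server.hf.space/gradio_api/mcp/sse"

async def main():
    # Open the SSE transport, then run an MCP session over it
    async with sse_client(MCP_URL) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()

            # List tools
            tools = await session.list_tools()

            # Call tool
            result = await session.call_tool("analyze_leaderboard", arguments={
                "metric_focus": "cost",
                "top_n": 5
            })
            print(result.content[0].text)

asyncio.run(main())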
```

---

## Gemini Integration

### API Configuration

**Environment Variable**:
```bash
GEMINI_API_KEY=your_api_key_here
```

**Initialization**:
```python
import os

import google.generativeai as genai

genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
model = genai.GenerativeModel("gemini-2.5-flash-lite")
```

### Prompt Engineering Strategy

**1. System Prompts by Analysis Type**:
Each analysis type (`leaderboard`, `trace`, `cost_estimate`, `comparison`, `results`) has a specialized system prompt that:
- Defines the AI's role and expertise
- Specifies output format (markdown, structured sections)
- Lists key insights to include
- Sets tone (professional, concise, actionable)

**2. Context Injection**:
```python
user_prompt = f"""
{system_prompt}

Data to Analyze:
{json.dumps(data, indent=2)}

Specific Question: {question}
"""
```

**3. Output Formatting**:
- All responses in Markdown
- Clear sections: Top Performers, Key Insights, Trade-offs, Recommendations
- Bullet points for readability
- Code blocks for technical details

### Rate Limiting & Error Handling

**Rate Limits** (Gemini 2.5 Flash Lite free tier):
- 1,500 requests per day
- 1 request per second

**Error Handling Strategy**:
```python
import google.api_core.exceptions

try:
    response = await model.generate_content_async(...)
    return response.text
except google.api_core.exceptions.ResourceExhausted:
    return "❌ **Rate limit exceeded**. Please try again in a few seconds."
except google.api_core.exceptions.DeadlineExceeded:
    return "❌ **Request timeout**. The analysis is taking too long. Try with less data."
except Exception as e:
    return f"❌ **Error**: {str(e)}"
```
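
Since rate-limit errors are usually transient, one option is to retry with exponential backoff before giving up and returning the error message. The sketch below shows the idea under stated assumptions: `call_with_retry` is a hypothetical helper, and `FakeRateLimit` stands in for `google.api_core.exceptions.ResourceExhausted` so the example runs without the Gemini SDK.

```python
import asyncio


async def call_with_retry(make_call, retry_on, attempts=3, base_delay=1.0):
    """Await make_call(); on a retry_on error, back off and try again."""
    for attempt in range(attempts):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller format the error
            await asyncio.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


class FakeRateLimit(Exception):
    """Stand-in for google.api_core.exceptions.ResourceExhausted."""


calls = {"count": 0}


async def flaky_call():
    """Fail twice with a rate-limit error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeRateLimit()
    return "analysis text"


result = asyncio.run(call_with_retry(flaky_call, FakeRateLimit, base_delay=0.01))
```

In a tool, the final `raise` would be caught by the existing `except` clauses, so the user still sees the friendly rate-limit message when every attempt fails.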

---

## Data Flow

### Tool Execution Flow

```
1. MCP Client                    (e.g., Claude Desktop, TraceMind-AI)
   └─→ Calls: analyze_leaderboard(metric_focus="cost", top_n=5)

2. Gradio MCP Server             (app.py)
   └─→ Routes to: analyze_leaderboard() in mcp_tools.py

3. MCP Tool Function             (mcp_tools.py)
   ├─→ Load data from HuggingFace Datasets
   │   └─→ ds = load_dataset("kshitijthakkar/smoltrace-leaderboard")
   │
   ├─→ Process/filter data
   │   └─→ Filter by time range, sort by metric
   │
   ├─→ Call Gemini Client
   │   └─→ gemini_client.analyze_with_context(data, "leaderboard")
   │
   └─→ Return formatted response

4. Gemini Client                 (gemini_client.py)
   ├─→ Build system prompt
   ├─→ Format data as JSON
   ├─→ Call Gemini API
   │   └─→ model.generate_content_async(prompt)
   └─→ Return AI-generated analysis

5. Response Path                 (back through stack)
   └─→ Gemini → gemini_client → mcp_tool → Gradio → MCP Client

6. MCP Client                    (displays result to user)
   └─→ Shows markdown-formatted analysis
```

### Resource Access Flow

```
1. MCP Client
   └─→ Accesses: leaderboard://kshitijthakkar/smoltrace-leaderboard

2. Gradio MCP Server
   └─→ Routes to: get_leaderboard_data(uri)

3. Resource Function
   ├─→ Parse URI to extract repo name
   ├─→ Load dataset from HuggingFace
   ├─→ Convert to JSON
   └─→ Return raw JSON string

4. MCP Client
   └─→ Receives raw JSON data (no AI processing)
```
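
The "Parse URI" step can be sketched with plain string operations. This is an assumption about how parsing might work rather than the code in `mcp_tools.py`; both function names are hypothetical, and the trace ID is assumed to contain no `/`.

```python
def parse_resource_uri(uri: str) -> tuple[str, str]:
    """Split 'scheme://rest' into (scheme, rest); raise on malformed input."""
    scheme, sep, rest = uri.partition("://")
    if not sep or not scheme or not rest:
        raise ValueError(f"Malformed resource URI: {uri!r}")
    return scheme, rest


def parse_trace_uri(uri: str) -> tuple[str, str]:
    """Split 'trace://{trace_id}/{repo}' into (trace_id, repo)."""
    scheme, rest = parse_resource_uri(uri)
    # Split on the first "/" only, because the repo itself is "owner/name"
    trace_id, sep, repo = rest.partition("/")
    if scheme != "trace" or not sep or not repo:
        raise ValueError(f"Malformed trace URI: {uri!r}")
    return trace_id, repo


scheme, repo = parse_resource_uri(
    "leaderboard://kshitijthakkar/smoltrace-leaderboard"
)
```

Here `scheme` selects the resource function and `repo` is the dataset ID passed on to the loader.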

---

## Deployment Architecture

### HuggingFace Spaces Deployment

- **Platform**: HuggingFace Spaces
- **SDK**: Docker (for custom dependencies)
- **Hardware**: CPU Basic (free tier), sufficient for API calls and dataset loading
- **URL**: https://huggingface.co/spaces/MCP-1st-Birthday/TraceMind-mcp-server

### Dockerfile

```dockerfile
# Base image
FROM python:3.10-slim

# Set working directory
WORKDIR /app

# Copy requirements
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy application files
COPY app.py .
COPY mcp_tools.py .
COPY gemini_client.py .

# Expose port
EXPOSE 7860

# Set environment variables
ENV GRADIO_SERVER_NAME="0.0.0.0"
ENV GRADIO_SERVER_PORT="7860"

# Run application
CMD ["python", "app.py"]
```

### Environment Variables (HF Spaces Secrets)

```bash
# Required
GEMINI_API_KEY=your_gemini_api_key_here

# Optional (for testing)
HF_TOKEN=your_huggingface_token_here
```

### Scaling Considerations

**Current Setup** (Free Tier):
- Hardware: CPU Basic
- Concurrent Users: ~10-20
- Request Latency: 2-5 seconds (AI analysis)
- Rate Limit: Gemini API (1,500 req/day)

**If Scaling Needed**:
1. **Upgrade Hardware**: CPU Basic → CPU Upgrade (2x performance)
2. **Caching**: Add Redis for caching frequent queries
3. **API Key Pool**: Rotate multiple Gemini API keys to bypass rate limits
4. **Load Balancing**: Deploy multiple Spaces instances with load balancer

---

## Development Workflow

### Local Development Setup

```bash
# 1. Clone repository
git clone https://github.com/Mandark-droid/TraceMind-mcp-server.git
cd TraceMind-mcp-server

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure environment
cp .env.example .env
# Edit .env with your API keys

# 5. Run locally
python app.py

# 6. Access
# - Gradio UI: http://localhost:7860
# - MCP Endpoint: http://localhost:7860/gradio_api/mcp/sse
```

### Testing MCP Tools

**Option 1: Gradio UI** (Easiest):
```
1. Run app.py
2. Open http://localhost:7860
3. Navigate to tool tab (e.g., "📊 Analyze Leaderboard")
4. Fill in parameters
5. Click submit button
6. View results
```

**Option 2: Python MCP Client**:
```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def test_tool():
    # Connect over SSE, then open and initialize an MCP session
    async with sse_client("http://localhost:7860/gradio_api/mcp/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            result = await session.call_tool("analyze_leaderboard", {
                "metric_focus": "cost",
                "top_n": 3
            })

            print(result.content[0].text)

asyncio.run(test_tool())
```

### Adding New MCP Tools

**Step 1: Add function to mcp_tools.py**:
```python
@gr.mcp.tool()
async def new_tool_name(
    param1: str,
    param2: int = 10
) -> str:
    """
    Brief description of what this tool does.

    Detailed explanation of the tool's purpose and behavior.

    Args:
        param1 (str): Description of param1 with examples
        param2 (int): Description of param2. Default: 10

    Returns:
        str: Description of what the function returns
    """
    try:
        # Implementation
        result = f"Processed: {param1} with {param2}"
        return result
    except Exception as e:
        return f"❌ **Error**: {str(e)}"
```

**Step 2: Add testing UI to app.py** (optional):
```python
with gr.Tab("🆕 New Tool"):
    gr.Markdown("## New Tool Name")
    param1_input = gr.Textbox(label="Param 1")
    param2_input = gr.Number(label="Param 2", value=10)
    submit_btn = gr.Button("Execute")
    output = gr.Markdown()

    submit_btn.click(
        fn=new_tool_name,
        inputs=[param1_input, param2_input],
        outputs=output
    )
```

**Step 3: Test**:
```bash
python app.py
# Visit http://localhost:7860
# Test in new tab
```

**Step 4: Deploy**:
```bash
git add mcp_tools.py app.py
git commit -m "feat: Add new_tool_name MCP tool"
git push origin main
# HF Spaces auto-deploys
```

---

## Performance Considerations

### 1. Token Optimization

**Problem**: Loading full datasets consumes excessive tokens in AI analysis.

**Solutions**:
- **get_top_performers**: Returns only top N models (90% token reduction)
- **get_leaderboard_summary**: Returns aggregated stats (99% token reduction)
- **Data sampling**: Limit rows when loading datasets (max_rows parameter)

**Example**:
```python
# ❌ BAD: Loads 51 rows, ~50K tokens
full_data = load_dataset("kshitijthakkar/smoltrace-leaderboard")

# ✅ GOOD: Returns top 5, ~5K tokens (90% reduction)
top_5 = await get_top_performers(top_n=5)

# ✅ BETTER: Returns summary, ~500 tokens (99% reduction)
summary = await get_leaderboard_summary()
```
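
The effect of the two reduction strategies can be illustrated on synthetic data. The sketch below uses made-up rows and column names (the real leaderboard schema may differ), hypothetical helpers `top_performers` and `summarize`, and serialized JSON length as a rough proxy for token count.

```python
import json

# Made-up rows for illustration; the real leaderboard schema may differ.
rows = [
    {"model": f"model-{i}", "success_rate": 50 + i, "total_cost_usd": 0.10 * (i + 1)}
    for i in range(50)
]


def top_performers(rows, metric, top_n):
    """Keep only the top_n rows, ranked by metric (highest first)."""
    return sorted(rows, key=lambda r: r[metric], reverse=True)[:top_n]


def summarize(rows, metric):
    """Collapse all rows into a few aggregate statistics."""
    values = [r[metric] for r in rows]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


# Character counts of the JSON payloads that would be sent to the model
full_size = len(json.dumps(rows))
top_size = len(json.dumps(top_performers(rows, "success_rate", 5)))
summary_size = len(json.dumps(summarize(rows, "success_rate")))
```

Top-N selection scales the payload with `top_n`, while the summary stays a fixed size regardless of how many rows the dataset has.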

### 2. Async Operations

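All tools are declared `async` so that awaited calls, such as the Gemini request, do not block the event loop. Note that `load_dataset` is synchronous and blocks until it returns:
```python
@gr.mcp.tool()
async def tool_name(...):  # ← async
    ds = load_dataset(...)  # ← Synchronous: blocks the event loop while loading
    result = await gemini_client.analyze_with_context(...)  # ← async API call
    return result
```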

Benefits:
- Non-blocking API calls
- Multiple concurrent requests
- Better resource utilization
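
Because `load_dataset` is synchronous, calling it directly inside an `async` tool holds up the event loop for the duration of the load. One way to avoid that, sketched here as an assumption rather than as the project's current behavior, is to run the blocking call in a worker thread with `asyncio.to_thread`. `blocking_load` is a stand-in for `load_dataset` so the example runs without the `datasets` package.

```python
import asyncio
import threading
import time


def blocking_load(name: str) -> dict:
    """Stand-in for a synchronous call such as load_dataset()."""
    time.sleep(0.05)  # simulate blocking I/O
    return {"name": name, "thread": threading.get_ident()}


async def load_without_blocking(name: str) -> dict:
    # Run the blocking function in a worker thread; the event loop stays free
    return await asyncio.to_thread(blocking_load, name)


async def main():
    # Both loads run in worker threads and can overlap in time
    return await asyncio.gather(
        load_without_blocking("first"),
        load_without_blocking("second"),
    )


loop_thread = threading.get_ident()  # thread that will run the event loop
results = asyncio.run(main())
```

`asyncio.gather` returns results in the order the awaitables were passed, so the first entry always corresponds to `"first"`.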

### 3. Caching (Future Enhancement)

- **Current**: No caching (stateless)
- **Future**: Add Redis for caching frequent queries

```python
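import hashlib
import json
from functools import wraps

import redis

redis_client = redis.Redis(...)

def cache_result(ttl=300):
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Generate a cache key that is stable across processes
            # (built-in hash() is salted per process for strings)
            payload = json.dumps([args, kwargs], sort_keys=True, default=str)
            digest = hashlib.sha256(payload.encode()).hexdigest()
            cache_key = f"{func.__name__}:{digest}"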

            # Check cache
            cached = redis_client.get(cache_key)
            if cached:
                return cached.decode()

            # Execute function
            result = await func(*args, **kwargs)

            # Store in cache
            redis_client.setex(cache_key, ttl, result)

            return result
        return wrapper
    return decorator

@gr.mcp.tool()
@cache_result(ttl=300)  # 5-minute cache
async def analyze_leaderboard(...):
    pass
```

---

## Security

### API Key Management

**Storage**:
- Development: `.env` file (gitignored)
- Production: HuggingFace Spaces Secrets (encrypted)

**Access**:
```python
# gemini_client.py
api_key = os.getenv("GEMINI_API_KEY")
if not api_key:
    raise ValueError("GEMINI_API_KEY not set")
```

**Never**:
- ❌ Hardcode API keys in source code
- ❌ Commit `.env` to git
- ❌ Expose keys in client-side JavaScript
- ❌ Log API keys in console/files

### Input Validation

**Dataset Repository Validation**:
```python
# Only allow datasets whose repo ID contains "smoltrace-" (substring check)
if "smoltrace-" not in dataset_repo:
    return "❌ Error: Dataset must contain 'smoltrace-' prefix for security"
```
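
A substring check also accepts repo IDs where `smoltrace-` appears anywhere, for example in the owner name or in the middle of the dataset name. If a true prefix rule is wanted, a stricter check could look like the sketch below. `is_allowed_repo` and `REPO_PATTERN` are hypothetical names, and the allowed character set is an assumption, not the Hub's official naming rule.

```python
import re

# "owner/name", where name starts with "smoltrace-". The character set
# allowed here is an assumption, not the Hub's official naming rule.
REPO_PATTERN = re.compile(r"[A-Za-z0-9][\w.-]*/smoltrace-[\w.-]+")


def is_allowed_repo(dataset_repo: str) -> bool:
    """Accept only 'owner/smoltrace-...' repo IDs."""
    # fullmatch rejects extra path segments and trailing characters
    return REPO_PATTERN.fullmatch(dataset_repo) is not None
```

Using `fullmatch` rather than `search` keeps the rule anchored at both ends, so an ID with additional path segments is rejected as well.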

**Parameter Validation**:
```python
# Constrain ranges
top_n = max(1, min(20, top_n))  # Clamp between 1-20
max_rows = max(10, min(500, max_rows))  # Clamp between 10-500
```

### Rate Limiting

**Gemini API**:
- Free tier: 1,500 requests/day
- Handled by Google (automatic)
- Errors returned as user-friendly messages

**HuggingFace Datasets**:
- No rate limits for public datasets
- Private datasets require HF token

---

## Related Documentation

- [README.md](PROPOSED_README_MCP_SERVER.md) - Overview and quick start
- [DOCUMENTATION.md](DOCUMENTATION_MCP_SERVER.md) - Complete API reference
- [TraceMind-AI Architecture](ARCHITECTURE_TRACEMIND_AI.md) - Client-side architecture

---

- **Last Updated**: November 21, 2025
- **Version**: 1.0.0
- **Track**: Building MCP (Enterprise)