Commit 3fac7d8 (parent 7602502) committed by Lindow

add mcp and gradio app
- .agent/docs/common_core_spec.md +0 -136
- .agent/specs/002_mcp/spec.md +574 -0
- .agent/specs/002_mcp/tasks.md +69 -0
- .agent/specs/003_gradio/spec.md +1167 -0
- .agent/specs/003_gradio/tasks.md +76 -0
- .cursor/commands/{spec_draft.md → draft_spec.md} +56 -19
- .cursor/commands/finalize_spec.md +0 -21
- .cursor/rules/standards/code_style/readme.mdc +99 -0
- .env.example +10 -4
- app.py +419 -0
- pyproject.toml +2 -2
- requirements.txt +11 -0
- src/lookup.py +81 -0
- src/mcp_config.py +33 -0
- {tools → src}/pinecone_client.py +101 -2
- src/search.py +86 -0
- tools/cli.py +2 -2
- tools/pinecone_models.py +2 -2
.agent/docs/common_core_spec.md
DELETED
@@ -1,136 +0,0 @@
This is the authoritative **Common Core Data Specification**. It contains the exact source locations, data schemas, field definitions, and the specific processing logic required to interpret the hierarchy correctly.

**Use this document as the source of truth for `tools/build_data.py`.**

---

# Data Specification: Common Core Standards

**Authority:** Common Standards Project (GitHub)
**License:** Creative Commons Attribution 4.0 (CC BY 4.0)
**Format:** JSON (Flat List of Objects)

## 1. Source Locations

We are using the "Clean Data" export from the Common Standards Project. These files are static JSON dumps where each file represents a full Subject.

| Subject | Direct Download URL |
| :--- | :--- |
| **Mathematics** | `https://raw.githubusercontent.com/commoncurriculum/common-standards-project/master/data/clean-data/CCSSI/Mathematics.json` |
| **ELA / Literacy** | `https://raw.githubusercontent.com/commoncurriculum/common-standards-project/master/data/clean-data/CCSSI/ELA-Literacy.json` |

---

## 2. The Data Structure (Glossary)

The JSON file contains a root object. The actual standards are located in the `standards` dictionary, keyed by their internal GUID.

### **Root Object**

```json
{
  "subject": "Mathematics",
  "standards": {
    "6051566A...": { ... }, // Standard Object
    "5E367098...": { ... }  // Standard Object
  }
}
```

### **Standard Object (The Item)**

Each item represents a node in the curriculum tree. It could be a broad **Domain**, a grouping **Cluster**, or a specific **Standard**.

| Field Name | Type | Definition & Usage |
| :--- | :--- | :--- |
| **`id`** | `String (GUID)` | The internal unique identifier. Used for lookups in `ancestorIds`. |
| **`statementNotation`** | `String` | **The Display Code** (e.g., `CCSS.Math.Content.1.OA.A.1`). This is what teachers recognize. Use this for the UI. |
| **`description`** | `String` | The text content. **Warning:** For standards, this text is often incomplete without its parent context (see Hierarchy below). |
| **`statementLabel`** | `String` | The hierarchy type. Critical values: <br>• `Domain` (highest) <br>• `Cluster` (grouping) <br>• `Standard` (the actionable item) |
| **`gradeLevels`** | `Array[String]` | Scope of the standard. Format: `["01", "02"]` (Grades 1 & 2), `["K"]` (Kindergarten), `["09", "10", "11", "12"]` (High School). |
| **`ancestorIds`** | `Array[GUID]` | **CRITICAL.** An ordered list of parent IDs (from root to immediate parent). You must resolve these to build the full context. |

---

## 3. Hierarchy & Context (The "Interpretation" Problem)

**The Problem:**
A standard's description often relies on its parent "Cluster" for meaning.

- _Cluster Text:_ "Understand the place value system."
- _Standard Text:_ "Recognize that in a multi-digit number, a digit in one place represents 10 times as much..."

If you only embed the _Standard Text_, the vector will miss the concept of "Place Value."

**The Solution (Processing Logic):**
To generate the **Search String** for embedding, you must concatenate the hierarchy.

1. **Domain:** The broad category (e.g., "Number and Operations in Base Ten").
2. **Cluster:** The specific topic (e.g., "Generalize place value understanding").
3. **Standard:** The task.

**Formula:**

```text
"{Subject} {Grade}: {Domain Text} - {Cluster Text} - {Standard Text}"
```
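
The formula above can be sketched as a small helper. This is a minimal illustration, assuming the subject, grade, and the three hierarchy texts have already been resolved; the function name is ours, not part of the spec:

```python
def build_search_string(subject: str, grade: str, domain: str,
                        cluster: str, standard: str) -> str:
    """Concatenate the hierarchy into the Search String used for embedding."""
    return f"{subject} {grade}: {domain} - {cluster} - {standard}"
```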

---

## 4. Build Pipeline Specification (`tools/build_data.py`)

This specific logic ensures we extract meaningful vectors.

### **Step A: Ingestion**

1. Download both JSON files.
2. Merge the `standards` dictionaries into a single **Lookup Map** (Memory: `Map<GUID, Object>`).

### **Step B: Iteration & Filtering**

Iterate through the Lookup Map.

**Filter Rule:**

- **KEEP** if `statementLabel` equals `"Standard"`.
- **DISCARD** if `statementLabel` is `"Domain"`, `"Cluster"`, or `"Component"`. (We only index the actionable leaves.)

### **Step C: Context Resolution (The "Breadcrumb" Loop)**

For every kept Standard:

1. Initialize `context_text = ""`.
2. Iterate through `ancestorIds`:
   - Use the ID to look up the Parent Object in the **Lookup Map**.
   - Append `Parent.description` to `context_text`.
3. Construct the final string:
   - `full_text = f"{context_text} {current_standard.description}"`
4. **Vectorize `full_text`**.
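
The breadcrumb loop above can be sketched as follows. This is a minimal sketch, assuming the Lookup Map is a plain dict of raw standard objects; `resolve_full_text` is an illustrative name, not a function the spec defines:

```python
def resolve_full_text(standard: dict, lookup_map: dict[str, dict]) -> str:
    """Walk ancestorIds (root -> immediate parent), appending each
    ancestor's description, then add the standard's own description."""
    context_parts = []
    for ancestor_id in standard.get("ancestorIds", []):
        parent = lookup_map.get(ancestor_id)
        if parent:  # skip dangling IDs defensively
            context_parts.append(parent["description"])
    context_text = " ".join(context_parts)
    return f"{context_text} {standard['description']}".strip()
```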

### **Step D: Output Schema (`data/standards.json`)**

The clean, flat JSON file you save for the App to load must look like this:

```json
[
  {
    "id": "CCSS.Math.Content.1.OA.A.1", // From 'statementNotation'
    "guid": "6051566A...", // From 'id'
    "grade": "01", // From 'gradeLevels[0]'
    "subject": "Mathematics", // From 'subject'
    "description": "Use addition and subtraction within 20 to solve word problems...", // From 'description'
    "full_context": "Operations and Algebraic Thinking - Represent and solve problems... - Use addition and..." // The text we used for embedding
  }
]
```

---

## 5. Summary of Valid `gradeLevels`

When processing, normalize these strings if necessary, but typically they appear as:

- `K` (Kindergarten)
- `01`-`08` (Grades 1-8)
- `09-12` (High School generic)

_Note: If `gradeLevels` is an array `["09", "10", "11", "12"]`, you can display it as "High School" or "Grades 9-12"._
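
One possible display helper following the note above. A sketch only; the function name and the exact label strings are our choices, not mandated by the spec:

```python
def display_grades(grade_levels: list[str]) -> str:
    """Map a gradeLevels array to a human-readable label."""
    if grade_levels == ["K"]:
        return "Kindergarten"
    if grade_levels == ["09", "10", "11", "12"]:
        return "High School"
    if len(grade_levels) == 1:
        return f"Grade {int(grade_levels[0])}"  # "01" -> "Grade 1"
    return f"Grades {int(grade_levels[0])}-{int(grade_levels[-1])}"
```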
.agent/specs/002_mcp/spec.md
ADDED
@@ -0,0 +1,574 @@
# MCP Server Sprint Specification

## Overview

This sprint builds the MCP (Model Context Protocol) server that exposes the Pinecone database of educational standards to MCP clients (e.g., Claude Desktop). The server provides two methods of interacting with the Pinecone database:

1. **Semantic Search**: Vector similarity search using natural language queries to find relevant standards
2. **Direct ID Lookup**: Retrieve a specific standard by its GUID or identifier

---

## MCP Server Architecture

### Entry Point

The MCP server will be implemented as `server.py` in the project root. This file is minimal and focused on:

- Setting up the FastMCP server instance
- Defining tool decorators that delegate to logic in `src/`
- Running the server

The bulk of the logic lives in the `src/` directory, which serves as the authoritative location for shared code.

### Framework

Use `mcp.server.fastmcp.FastMCP` for the MCP server implementation. This provides a simple decorator-based API for defining tools.

### Code Organization

**Separation of Concerns:**

- `server.py` (root): Server setup, tool definitions, and delegation to `src/` modules
- `src/`: Authoritative location for all business logic, including the Pinecone client and tool implementations
- `tools/`: CLI tools that import shared logic from `src/`

### Pinecone Client Migration

Move `tools/pinecone_client.py` to `src/pinecone_client.py`. Update all imports in `tools/` to import from `src.pinecone_client` instead. This establishes `src/` as the definitive source for Pinecone interaction logic.

### Configuration

Create a configuration module `src/mcp_config.py` that wraps or duplicates Pinecone configuration settings. This isolates the MCP server from the CLI tools configuration.

**Configuration Requirements:**

- `PINECONE_API_KEY`: Pinecone API key (required)
- `PINECONE_INDEX_NAME`: Name of the Pinecone index (default: `common-core-standards`)
- `PINECONE_NAMESPACE`: Namespace for records (default: `standards`)

---

## MCP Tools

The server exposes two tools matching the MVP spec interface:

### Tool 1: `find_relevant_standards`

Performs semantic search over educational standards using vector similarity.

**Parameters:**

- `activity` (str, required): Natural language description of the learning activity
- `max_results` (int, optional, default: 5): Maximum number of standards to return
- `grade` (str, optional): Grade level filter (e.g., "K", "01", "05", "09")
- `subject` (str, optional): Subject filter (e.g., "Mathematics", "ELA-Literacy")

**Implementation:**

- Use Pinecone's `search()` method with integrated embeddings (`llama-text-embed-v2`)
- Pass the user's query text via `query={"inputs": {"text": activity}, "top_k": max_results}`
- Embeddings are generated automatically by Pinecone
- Apply metadata filters inside the query dict: `"filter": {...}` (only include the key if filters exist)
- Always rerank results using `rerank={"model": "bge-reranker-v2-m3", "top_n": max_results, "rank_fields": ["content"]}`
- Access results via `results['result']['hits']`, extracting `_id`, `_score`, and `fields` from each hit
- Return top N matches sorted by reranked relevance score

**Response Format:**

Returns a JSON string in the following structured format:

```json
{
  "success": true,
  "results": [
    {
      "_id": "EA60C8D165F6481B90BFF782CE193F93",
      "content": "...",
      "subject": "Mathematics",
      "education_levels": ["01"],
      "statement_notation": "CCSS.Math.Content.1.OA.A.1",
      "standard_set_title": "...",
      "score": 0.85
    }
  ],
  "message": "Found 5 matching standards"
}
```

### Tool 2: `get_standard_details`

Retrieves a specific standard by its GUID or identifier.

**Parameters:**

- `standard_id` (str, required): The standard's GUID (`_id` field) or identifier

**Implementation:**

- Use Pinecone's `fetch()` method with the standard's GUID (`_id` field)
- If the identifier is not a GUID, it may be necessary to query by metadata filter on the `statement_notation` or `asn_identifier` fields

**Response Format:**

Returns a JSON string in the following structured format:

```json
{
  "success": true,
  "results": [
    {
      "_id": "EA60C8D165F6481B90BFF782CE193F93",
      "content": "...",
      "subject": "Mathematics",
      "education_levels": ["01"],
      "statement_notation": "CCSS.Math.Content.1.OA.A.1",
      "standard_set_title": "...",
      "all_metadata_fields": "..."
    }
  ],
  "message": "Retrieved standard details"
}
```

---

## Error Handling

All tools catch exceptions and return structured error responses with `error_type` fields:

**Error Response Format:**

```json
{
  "success": false,
  "results": [],
  "message": "Error description",
  "error_type": "error_category"
}
```

**Valid `error_type` values:**

- `no_results`: Query returned no matches
- `invalid_input`: Malformed input (empty string, invalid ID format, etc.)
- `api_error`: Pinecone API failure or connection error
- `not_found`: Standard ID doesn't exist (for `get_standard_details`)

For `get_standard_details` with an invalid ID, include a helpful suggestion:

```json
{
  "success": false,
  "results": [],
  "message": "Standard 'XYZ.123' not found. Try using find_relevant_standards with a keyword search instead.",
  "error_type": "not_found"
}
```

---

## Pinecone Integration

### Implementation Approach

Use the shared `PineconeClient` class from `src/pinecone_client.py`. The MCP server and CLI tools both import and use this shared client, ensuring consistency and avoiding code duplication. The `src/` directory is the authoritative location for Pinecone interaction logic.

### Extending PineconeClient

Add search and fetch methods to `src/pinecone_client.py`:

- `search_standards()`: Semantic search with filters
- `fetch_standard()`: Direct ID lookup

These methods encapsulate the Pinecone query logic and can be used by both the MCP server and CLI tools.

### Semantic Search Implementation

- Use Pinecone's `search()` method with integrated embeddings
- The index is configured with the `llama-text-embed-v2` model and `field_map text=content`
- Pass query text to Pinecone's `search()` method using the `inputs` parameter; embeddings are generated automatically
- Always rerank results using the `bge-reranker-v2-m3` model for improved relevance
- Build the filter dictionary dynamically (only include filters with values):
  - If `grade` provided: `{"education_levels": {"$in": [grade]}}`
  - If `subject` provided: `{"subject": {"$eq": subject}}`
  - Combine with `$and` if both: `{"$and": [grade_filter, subject_filter]}`
- Add the filter to the query dict only if it exists: `query_dict["filter"] = filter_dict`
- **Important**: Do not set `filter` to `None`; omit the key entirely when there are no filters

### Direct ID Lookup Implementation

- Use Pinecone's `fetch()` method with the standard's GUID (`_id` field)
- The `_id` field corresponds to the standard's GUID from the source data

### Pinecone Record Structure

Records in Pinecone follow the `PineconeRecord` model structure:

- `_id`: Standard GUID (string)
- `content`: Text content for embedding (string)
- `subject`: Subject name (string)
- `education_levels`: List of grade levels (list[str])
- `statement_notation`: CCSS notation if available (str | None)
- `standard_set_id`, `standard_set_title`, `jurisdiction_id`, etc.: Additional metadata fields

---

## File Structure

```
common_core_mcp/
├── server.py              # MCP server entry point - minimal setup only (NEW)
├── src/
│   ├── pinecone_client.py # Pinecone client (MOVED from tools/) (MODIFIED)
│   ├── mcp_config.py      # MCP-specific configuration (NEW)
│   ├── search.py          # Semantic search logic (NEW)
│   └── lookup.py          # Direct ID lookup logic (NEW)
├── tools/                 # CLI tools (MODIFIED - imports from src/)
│   └── pinecone_client.py # (REMOVED - moved to src/)
├── data/                  # Existing data directory (unchanged)
└── pyproject.toml         # Dependencies (mcp already included)
```

### Files to Create

- **`server.py`** (project root): Minimal MCP server setup and tool definitions
- **`src/mcp_config.py`**: MCP-specific configuration module
- **`src/search.py`**: Semantic search implementation logic
- **`src/lookup.py`**: Direct ID lookup implementation logic

### Files to Move

- **`tools/pinecone_client.py`** → **`src/pinecone_client.py`**: Move the Pinecone client to the authoritative `src/` location

### Files to Modify

- **`tools/cli.py`**: Update imports to use `from src.pinecone_client import PineconeClient` (replacing `from tools.pinecone_client`)
- **`tools/pinecone_processor.py`**: Update imports to use `from src.pinecone_client import PineconeClient` (replacing `from tools.pinecone_client`)
- **`src/pinecone_client.py`**: Add `search_standards()` and `fetch_standard()` methods for MCP server use
- Any other files in `tools/` that import `pinecone_client`: Update to import from `src.pinecone_client`

### Files to Reference (Existing)

- **`tools/pinecone_models.py`**: Contains the `PineconeRecord` model structure for reference (may also move to `src/` in the future)
- **`tools/config.py`**: Contains `ToolsSettings` for reference (the MCP server uses a separate config)

---

## Implementation Details

### Server Entry Point (`server.py`)

The `server.py` file should be minimal and focused on setup:

1. Import FastMCP and create the server instance:

```python
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("CommonCore")
```

2. Import tool logic functions from `src/` modules
3. Define thin wrapper functions with the `@mcp.tool()` decorator that delegate to `src/` logic
4. Run the server with `mcp.run()` when executed directly

**Example Structure:**

```python
from mcp.server.fastmcp import FastMCP
from src.search import find_relevant_standards_impl
from src.lookup import get_standard_details_impl

mcp = FastMCP("CommonCore")

@mcp.tool()
def find_relevant_standards(activity: str, max_results: int = 5, grade: str | None = None, subject: str | None = None) -> str:
    """Returns educational standards relevant to the activity."""
    return find_relevant_standards_impl(activity, max_results, grade, subject)

@mcp.tool()
def get_standard_details(standard_id: str) -> str:
    """Returns full metadata for a standard by its GUID or identifier."""
    return get_standard_details_impl(standard_id)

if __name__ == "__main__":
    mcp.run()
```

**Execution:**

- Run with: `uv run server.py` or `python server.py`
- The server communicates via stdio (FastMCP handles the transport automatically)

### Configuration Module (`src/mcp_config.py`)

Create a configuration module that:

- Loads environment variables from the `.env` file
- Provides settings for the Pinecone connection (API key, index name, namespace)
- Can duplicate or wrap settings from `tools/config.py` but maintains isolation

**Function Signature:**

```python
def get_mcp_settings() -> McpSettings:
    """Get MCP server configuration settings."""
    # Returns settings object with pinecone_api_key, pinecone_index_name, pinecone_namespace
```

### Search Module (`src/search.py`)

Contains the implementation logic for semantic search.

**Function Signature:**

```python
def find_relevant_standards_impl(
    activity: str,
    max_results: int = 5,
    grade: str | None = None,
    subject: str | None = None
) -> str:
    """
    Implementation of semantic search over educational standards.

    Args:
        activity: Description of the learning activity
        max_results: Maximum number of standards to return (default: 5)
        grade: Optional grade level filter (e.g., "K", "01", "05", "09")
        subject: Optional subject filter (e.g., "Mathematics", "ELA-Literacy")

    Returns:
        JSON string with structured response containing matching standards
    """
    # Uses PineconeClient from src.pinecone_client
    # Handles error cases and returns JSON response
```

### Lookup Module (`src/lookup.py`)

Contains the implementation logic for direct ID lookup.

**Function Signature:**

```python
def get_standard_details_impl(standard_id: str) -> str:
    """
    Implementation of direct standard lookup by ID.

    Args:
        standard_id: The standard's GUID (_id field) or identifier

    Returns:
        JSON string with structured response containing standard details
    """
    # Uses PineconeClient from src.pinecone_client
    # Handles error cases and returns JSON response
```

### PineconeClient Extensions (`src/pinecone_client.py`)

Add methods to the `PineconeClient` class (moved from `tools/`):

**New Methods:**

```python
def search_standards(
    self,
    query_text: str,
    top_k: int = 5,
    grade: str | None = None,
    subject: str | None = None
) -> list[dict]:
    """
    Perform semantic search over standards.

    Args:
        query_text: Natural language query
        top_k: Maximum number of results
        grade: Optional grade filter
        subject: Optional subject filter

    Returns:
        List of result dictionaries with metadata and scores
    """

def fetch_standard(self, standard_id: str) -> dict | None:
    """
    Fetch a standard by its GUID.

    Args:
        standard_id: Standard GUID (_id field)

    Returns:
        Standard dictionary with metadata, or None if not found
    """
```

### Pinecone Query Implementation

**Semantic Search Workflow (`src/search.py`):**

1. Import `PineconeClient` from `src.pinecone_client`
2. Initialize a client instance (or use a singleton pattern)
3. Call `client.search_standards()` with the parameters
4. Format the results into the JSON response structure
5. Handle errors and return appropriate error responses
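
The workflow above can be sketched as follows. This is a minimal sketch, not the definitive module: the `client` parameter is added here purely so the control flow is testable, whereas the real `src/search.py` constructs `PineconeClient` itself, and the error shapes follow the Error Handling section:

```python
import json

def find_relevant_standards_impl(activity, max_results=5, grade=None,
                                 subject=None, client=None):
    """Orchestrate validate -> search -> format; never raises."""
    if not activity or not activity.strip():
        return json.dumps({"success": False, "results": [],
                           "message": "activity must be a non-empty string",
                           "error_type": "invalid_input"})
    try:
        hits = client.search_standards(activity, top_k=max_results,
                                       grade=grade, subject=subject)
    except Exception as exc:  # the real module catches PineconeException
        return json.dumps({"success": False, "results": [],
                           "message": str(exc), "error_type": "api_error"})
    if not hits:
        return json.dumps({"success": False, "results": [],
                           "message": "No matching standards found",
                           "error_type": "no_results"})
    return json.dumps({"success": True, "results": hits,
                       "message": f"Found {len(hits)} matching standards"})
```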
|
| 415 |
+
|
| 416 |
+
**Implementation in `PineconeClient.search_standards()` (`src/pinecone_client.py`):**
|
| 417 |
+
|
| 418 |
+
1. Build Pinecone filter dictionary from optional parameters:
|
| 419 |
+
- If `grade` provided: Add `{"education_levels": {"$in": [grade]}}`
|
| 420 |
+
- If `subject` provided: Add `{"subject": {"$eq": subject}}`
|
| 421 |
+
- Combine filters with `$and` if both provided
|
| 422 |
+
2. Build the query dictionary:
|
| 423 |
+
- `"inputs": {"text": query_text}` for text queries (embeddings generated automatically)
|
| 424 |
+
- `"top_k": top_k * 2` to get more candidates for reranking
|
| 425 |
+
- `"filter": filter_dict` only if filters are provided (omit key if no filters)
|
| 426 |
+
3. Call `index.search()` with:
|
| 427 |
+
- `namespace=namespace` (from config)
|
| 428 |
+
- `query=query_dict` (the constructed query dictionary)
|
| 429 |
+
- `rerank={"model": "bge-reranker-v2-m3", "top_n": top_k, "rank_fields": ["content"]}` (always enabled)
|
| 430 |
+
4. Access results via `results['result']['hits']`
|
| 431 |
+
5. Extract `_id`, `_score`, and `fields` from each hit and return list of result dictionaries
|
| 432 |
+
|
| 433 |
+
**Response Parsing:**
|
| 434 |
+
|
| 435 |
+
Access search results via `results['result']['hits']`. Each hit contains:
|
| 436 |
+
|
| 437 |
+
- `hit['_id']`: Record ID
|
| 438 |
+
- `hit['_score']`: Reranked relevance score
|
| 439 |
+
- `hit['fields']`: Dictionary of metadata fields (e.g., `hit['fields']['content']`, `hit['fields']['subject']`)
|
| 440 |
+
|
| 441 |
+
Example parsing:
|
| 442 |
+
|
| 443 |
+
```python
|
| 444 |
+
for hit in results['result']['hits']:
|
| 445 |
+
record = {
|
| 446 |
+
"_id": hit["_id"],
|
| 447 |
+
"score": hit["_score"],
|
| 448 |
+
**hit["fields"] # Spread all metadata fields
|
| 449 |
+
}
|
| 450 |
+
```

**Direct ID Lookup Workflow (`src/lookup.py`):**

1. Import `PineconeClient` from `src.pinecone_client`
2. Initialize a client instance (or use a singleton pattern)
3. Call `client.fetch_standard()` with the standard_id
4. If found, format into the JSON response structure
5. If not found, return an error response with `error_type: "not_found"`

**Implementation in `PineconeClient.fetch_standard()` (`src/pinecone_client.py`):**

1. Call `index.fetch()` with:
   - `ids=[standard_id]`
   - `namespace=namespace` (from config)
2. Extract the result from the returned dictionary
3. Return the standard dictionary with metadata, or `None` if not found

### Error Handling Implementation

Error handling is implemented in the `src/` modules (`src/search.py` and `src/lookup.py`):

**In `src/search.py` (`find_relevant_standards_impl`):**

1. Validate input parameters (e.g., empty strings, None values)
2. Wrap the `PineconeClient.search_standards()` call in try/except
3. Catch `PineconeException` and map it to the appropriate `error_type`
4. Handle the empty-results case
5. Return structured JSON error responses
6. Never raise exceptions; always return a JSON response

**In `src/lookup.py` (`get_standard_details_impl`):**

1. Validate input parameters (e.g., empty strings, None values)
2. Wrap the `PineconeClient.fetch_standard()` call in try/except
3. Catch `PineconeException` and map it to the appropriate `error_type`
4. Handle a `None` result (not found)
5. Return structured JSON error responses
6. Never raise exceptions; always return a JSON response

**Error Mapping:**

- `PineconeException` → `error_type: "api_error"`
- Empty `activity` or `standard_id` → `error_type: "invalid_input"`
- No results from query → `error_type: "no_results"`
- ID not found in fetch (returns `None`) → `error_type: "not_found"`
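
Both modules return the same JSON envelope instead of raising; a minimal sketch of that shared shape (the helper name is illustrative, not a function in the codebase):

```python
import json


def error_response(error_type: str, message: str) -> str:
    """Build the structured JSON payload returned in place of an exception."""
    return json.dumps({
        "success": False,
        "results": [],
        "message": message,
        "error_type": error_type,
    })


# e.g., the invalid-input case for an empty activity string
payload = json.loads(error_response("invalid_input", "Activity must not be empty"))
```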

---

## Dependencies

The `mcp` package is already included in `pyproject.toml`. Ensure `pinecone` is also available (already included). No additional dependencies are required for this sprint.

---

## Running the Server

### Local Development

1. Ensure environment variables are set in `.env`:

   ```
   PINECONE_API_KEY=your_api_key_here
   PINECONE_INDEX_NAME=common-core-standards
   PINECONE_NAMESPACE=standards
   ```

2. Run the server:

   ```bash
   uv run server.py
   ```

   Or:

   ```bash
   python server.py
   ```

3. The server communicates via stdio. FastMCP handles the MCP protocol transport automatically.

### Claude Desktop Integration

To connect Claude Desktop to the local MCP server, add the following configuration:

**macOS:** `~/Library/Application Support/Claude/claude_desktop_config.json`
**Windows:** `%APPDATA%\Claude\claude_desktop_config.json`

```json
{
  "mcpServers": {
    "common-core": {
      "command": "uv",
      "args": ["run", "server.py"],
      "cwd": "/absolute/path/to/common_core_mcp"
    }
  }
}
```

**Important:** Replace `/absolute/path/to/common_core_mcp` with the actual absolute path to your project directory.

---

## Testing

Skip tests for this sprint. Focus on getting the server working first. Tests can be added in a future sprint.

### Manual Validation

To validate that the server works:

1. Run `server.py` and verify it starts without errors
2. Connect Claude Desktop and verify the tools appear
3. Test `find_relevant_standards` with a sample activity
4. Test `get_standard_details` with a known GUID
5. Test error cases (invalid ID, empty query, etc.)

---

## Limitations and Future Work

- **Tools Only**: The MCP server only supports tools for now. Prompts and resources are not included in this sprint.
- **No Reasoning**: The server does not include LLM reasoning/explanations for why standards match activities. This matches the MVP spec's `ask_llama` functionality but is deferred for now.
- **Limited Filters**: Only `grade` and `subject` filters are supported initially. Additional filters (e.g., `is_leaf`, `standard_set_id`, `jurisdiction_id`) can be added in future sprints.
.agent/specs/002_mcp/tasks.md
ADDED
# Spec Tasks

## Tasks

- [x] 1. Create MCP Configuration Module

  - [x] 1.1 Create `src/mcp_config.py` with `McpSettings` class using Pydantic BaseSettings
  - [x] 1.2 Add `pinecone_api_key` (required), `pinecone_index_name` (default: "common-core-standards"), and `pinecone_namespace` (default: "standards") fields
  - [x] 1.3 Configure to load from `.env` file with `env_file=".env"` and `env_file_encoding="utf-8"`
  - [x] 1.4 Create `get_mcp_settings()` function that returns a singleton `McpSettings` instance
  - [x] 1.5 Add validation to ensure `pinecone_api_key` is not empty (raise ValueError if missing)

- [x] 2. Move PineconeClient to src/ Directory

  - [x] 2.1 Move `tools/pinecone_client.py` to `src/pinecone_client.py`
  - [x] 2.2 Update imports in `src/pinecone_client.py`: change `from tools.config import get_settings` to use `src.mcp_config.get_mcp_settings()` instead
  - [x] 2.3 Update `PineconeClient.__init__()` to use `get_mcp_settings()` from `src.mcp_config`
  - [x] 2.4 Verify `src/pinecone_client.py` imports `PineconeRecord` from `tools.pinecone_models` (keep this import for now)

- [x] 3. Update Tools Imports to Use src.pinecone_client

  - [x] 3.1 Update `tools/cli.py`: replace `from tools.pinecone_client import PineconeClient` with `from src.pinecone_client import PineconeClient` (2 occurrences)
  - [x] 3.2 Check `tools/pinecone_processor.py` for any `pinecone_client` imports and update if present
  - [x] 3.3 Verify all imports work correctly by checking for any remaining references to `tools.pinecone_client`

- [x] 4. Add search_standards() Method to PineconeClient

  - [x] 4.1 Add `search_standards()` method signature: `def search_standards(self, query_text: str, top_k: int = 5, grade: str | None = None, subject: str | None = None) -> list[dict]`
  - [x] 4.2 Build filter dictionary dynamically: create empty list, add `{"education_levels": {"$in": [grade]}}` if grade provided, add `{"subject": {"$eq": subject}}` if subject provided, combine with `$and` if both exist
  - [x] 4.3 Build query dictionary: `{"inputs": {"text": query_text}, "top_k": top_k * 2}` (double for reranking candidates), add `"filter": filter_dict` only if filter_dict exists
  - [x] 4.4 Call `index.search()` with `namespace=self.namespace`, `query=query_dict`, and `rerank={"model": "bge-reranker-v2-m3", "top_n": top_k, "rank_fields": ["content"]}`
  - [x] 4.5 Parse results: access `results['result']['hits']`, extract `_id`, `_score`, and `fields` from each hit, combine into dict with `{"_id": hit["_id"], "score": hit["_score"], **hit["fields"]}`
  - [x] 4.6 Return list of result dictionaries

- [x] 5. Add fetch_standard() Method to PineconeClient

  - [x] 5.1 Add `fetch_standard()` method signature: `def fetch_standard(self, standard_id: str) -> dict | None`
  - [x] 5.2 Call `index.fetch()` with `ids=[standard_id]` and `namespace=self.namespace`
  - [x] 5.3 Extract result from `result.records` dictionary using `standard_id` as key
  - [x] 5.4 If record found, extract `_id` and all fields from `record.fields`, combine into dict
  - [x] 5.5 Return dictionary with metadata or `None` if not found

- [x] 6. Create search.py Module with Semantic Search Implementation

  - [x] 6.1 Create `src/search.py` with imports: `json`, `PineconeClient` from `src.pinecone_client`, `PineconeException` from `pinecone.exceptions`
  - [x] 6.2 Implement `find_relevant_standards_impl()` function with signature matching spec (activity, max_results=5, grade=None, subject=None) -> str
  - [x] 6.3 Add input validation: check if `activity` is empty or None, return error JSON with `error_type: "invalid_input"` if invalid
  - [x] 6.4 Wrap `PineconeClient.search_standards()` call in try/except, catch `PineconeException` and return error JSON with `error_type: "api_error"`
  - [x] 6.5 Handle empty results: if results list is empty, return error JSON with `error_type: "no_results"` and message "No matching standards found"
  - [x] 6.6 Format successful results: create response dict with `success: True`, `results` list (each with `_id`, `score`, and all metadata fields), and `message` with count
  - [x] 6.7 Return JSON string using `json.dumps()` with proper formatting

- [x] 7. Create lookup.py Module with Direct ID Lookup Implementation

  - [x] 7.1 Create `src/lookup.py` with imports: `json`, `PineconeClient` from `src.pinecone_client`, `PineconeException` from `pinecone.exceptions`
  - [x] 7.2 Implement `get_standard_details_impl()` function with signature: `(standard_id: str) -> str`
  - [x] 7.3 Add input validation: check if `standard_id` is empty or None, return error JSON with `error_type: "invalid_input"` if invalid
  - [x] 7.4 Wrap `PineconeClient.fetch_standard()` call in try/except, catch `PineconeException` and return error JSON with `error_type: "api_error"`
  - [x] 7.5 Handle not found: if `fetch_standard()` returns `None`, return error JSON with `error_type: "not_found"` and a helpful message suggesting to use `find_relevant_standards`
  - [x] 7.6 Format successful result: create response dict with `success: True`, `results` list containing a single standard dict with all metadata, and `message: "Retrieved standard details"`
  - [x] 7.7 Return JSON string using `json.dumps()` with proper formatting

- [x] 8. Create server.py MCP Entry Point

  - [x] 8.1 Create `server.py` in project root with imports: `FastMCP` from `mcp.server.fastmcp`, `find_relevant_standards_impl` from `src.search`, `get_standard_details_impl` from `src.lookup`
  - [x] 8.2 Initialize FastMCP server: `mcp = FastMCP("CommonCore")`
  - [x] 8.3 Define `find_relevant_standards` tool with `@mcp.tool()` decorator, signature matching spec (activity, max_results=5, grade=None, subject=None) -> str, docstring "Returns educational standards relevant to the activity"
  - [x] 8.4 Define `get_standard_details` tool with `@mcp.tool()` decorator, signature `(standard_id: str) -> str`, docstring "Returns full metadata for a standard by its GUID or identifier"
  - [x] 8.5 Add `if __name__ == "__main__": mcp.run()` block to start server
  - [x] 8.6 Verify server starts without errors by running `uv run server.py`
.agent/specs/003_gradio/spec.md
ADDED
# Gradio MCP Server Sprint Specification

## Overview

This sprint replaces the existing FastMCP server implementation with a Gradio-based MCP server that can be hosted publicly on Hugging Face Spaces. The Gradio app will expose the Common Core Standards MCP tools and include a chat interface that demonstrates MCP tool calling capabilities. This enables public access to the MCP server and provides a demonstration interface for the hackathon submission.

## Key Changes

- Replace `server.py` (FastMCP) with `app.py` (Gradio MCP server)
- Update dependencies to use Gradio 6.0.0+ with MCP support
- Remove FastMCP dependency from `pyproject.toml`
- Create Hugging Face Space configuration files
- Implement chat interface with MCP tool calling support
- Update README to meet hackathon requirements

## User Stories

1. **As a developer**, I want to access the MCP server via a public Hugging Face Space URL so that I can use it from any MCP client without running it locally.

2. **As a user**, I want to interact with a chat interface that can answer questions about educational standards using the MCP tools, so that I can see how the MCP server works in practice.

3. **As a hackathon judge**, I want to see a working MCP server hosted on Hugging Face Spaces with proper documentation, so that I can evaluate the submission.

## Technical Architecture

### Gradio MCP Server Implementation

Gradio 6 introduces native MCP server support. When `mcp_server=True` is set in `demo.launch()`, Gradio automatically:

1. Converts each API endpoint (function) into an MCP tool
2. Uses function docstrings and type hints to generate tool descriptions and parameter schemas
3. Exposes the MCP server at `http://your-server:port/gradio_api/mcp/`
4. Provides an SSE (Server-Sent Events) endpoint for MCP clients

### Function to MCP Tool Conversion

Gradio automatically converts functions with proper docstrings and type hints into MCP tools:

- **Function name** → Tool name
- **Docstring** → Tool description
- **Type hints** → Parameter schema
- **Default values** → Default parameter values (from component initial values)
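
As a minimal illustration of that mapping (the function below is hypothetical, not part of this project), a plain typed function with a docstring is all Gradio needs to derive a tool name, description, and parameter schema:

```python
def count_words(text: str, min_length: int = 1) -> int:
    """
    Counts the words in a text that meet a minimum length.

    Args:
        text: The input text to analyze.
        min_length: Only words with at least this many characters are counted.

    Returns:
        The number of qualifying words.
    """
    return sum(1 for word in text.split() if len(word) >= min_length)
```

Exposed through `gr.Interface(fn=count_words, ...)` in an app launched with `mcp_server=True`, this would surface as an MCP tool named `count_words`, with the description and parameter schema generated from the docstring and type hints.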

### MCP Server Endpoints

When `mcp_server=True` is enabled:

- **MCP Schema**: `http://your-server:port/gradio_api/mcp/schema`
- **MCP SSE Endpoint**: `http://your-server:port/gradio_api/mcp/` (for MCP clients)
- **MCP Documentation**: Available via the "View API" link in the Gradio app footer
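
A client that supports SSE-based MCP servers could then be pointed at the endpoint with a config entry along these lines (the exact config keys vary by client, and the hostname is illustrative):

```json
{
  "mcpServers": {
    "common-core": {
      "url": "https://your-space.hf.space/gradio_api/mcp/"
    }
  }
}
```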

### MCP Server Activation

**We will use the `demo.launch(mcp_server=True)` parameter approach** (not the environment variable method). This provides explicit control and makes the MCP server activation clear in the code.

## Implementation Details

### Dependencies Update

**File: `pyproject.toml`**

**Explicit Requirement**: Update the Gradio dependency to version 6.0.0 or higher **with MCP extras**:

```toml
dependencies = [
    "gradio[mcp]>=6.0.0",
    "pinecone",
    "python-dotenv",
    "typer",
    "requests",
    "rich",
    "loguru",
    "pydantic>=2.0.0",
    "pydantic-settings>=2.0.0",
    "huggingface_hub",
]
```

**Important Notes:**

- The `[mcp]` extra ensures all MCP dependencies are installed
- Remove the standalone `mcp` package dependency if present (FastMCP is no longer used)
- Add `huggingface_hub` for Inference API access in the chat interface

### Gradio App Structure

**File: `app.py`** (new file, replaces `server.py`)

The Gradio app should:

1. **Expose MCP Tools**: Create functions that wrap the existing `src/search.py` and `src/lookup.py` implementations
2. **Enable MCP Server**: Set `mcp_server=True` in `demo.launch()`
3. **Include Chat Interface**: Use `gr.ChatInterface` with a function that supports MCP tool calling

**Function Requirements for MCP Tools:**

- Functions must have detailed docstrings in the format:

```python
def function_name(param1: type, param2: type) -> return_type:
    """
    Description of what the function does.

    Args:
        param1: Description of param1
        param2: Description of param2

    Returns:
        Description of return value
    """
```

- Type hints are required for all parameters
- Default values can be set via component initial values (e.g., `gr.Textbox("default value")`)

**Example Structure:**

```python
import gradio as gr
from src.search import find_relevant_standards_impl
from src.lookup import get_standard_details_impl


def find_relevant_standards(
    activity: str,
    max_results: int = 5,
    grade: str | None = None,
    subject: str | None = None,
) -> str:
    """
    Searches for educational standards relevant to a learning activity using semantic search.

    This function performs a vector similarity search over the Common Core Standards database
    to find standards that match the described learning activity. Results are ranked by relevance
    and can be filtered by grade level and subject area.

    Args:
        activity: A natural language description of the learning activity, lesson, or educational
            objective. Examples: "teaching fractions to third graders", "reading comprehension
            activities", "solving quadratic equations". This is the primary search query and should
            be descriptive and specific for best results.

        max_results: The maximum number of standards to return. Must be between 1 and 20.
            Default is 5. Higher values return more results but may include less relevant matches.

        grade: Optional grade level filter. Must be one of the following valid grade level codes:
            - "K" for Kindergarten
            - "01" for Grade 1
            - "02" for Grade 2
            - "03" for Grade 3
            - "04" for Grade 4
            - "05" for Grade 5
            - "06" for Grade 6
            - "07" for Grade 7
            - "08" for Grade 8
            - "09" for Grade 9
            - "10" for Grade 10
            - "11" for Grade 11
            - "12" for Grade 12
            - "09-12" for high school range (when standards span multiple high school grades)

            If None or empty string, no grade filtering is applied and standards from all grade
            levels may be returned. The grade filter uses exact matching against the education_levels
            metadata field in the database.

        subject: Optional subject area filter. Common values include:
            - "Mathematics" or "Math"
            - "ELA-Literacy" or "English Language Arts"
            - "Science"
            - "Social Studies"
            - Other subject names as they appear in the standards database

            If None or empty string, no subject filtering is applied. The subject filter uses
            case-insensitive matching against the subject metadata field.

    Returns:
        A JSON string containing a structured response with the following format:
        {
            "success": true|false,
            "results": [
                {
                    "_id": "standard_guid",
                    "content": "full standard text with hierarchy",
                    "subject": "Mathematics",
                    "education_levels": ["03"],
                    "statement_notation": "3.NF.A.1",
                    "standard_set_title": "Grade 3",
                    "score": 0.85
                },
                ...
            ],
            "message": "Found N matching standards" or error message,
            "error_type": null or error type if success is false
        }

        On success, the results array contains up to max_results standards, sorted by relevance
        score (highest first). Each result includes the full standard content, metadata, and
        relevance score. On error, success is false and an error message describes the issue.
    """
    # Handle empty string from dropdown (convert to None)
    if grade == "":
        grade = None
    if subject == "":
        subject = None

    # Ensure max_results is an integer (gr.Number returns float by default)
    max_results = int(max_results)

    return find_relevant_standards_impl(activity, max_results, grade, subject)


def get_standard_details(standard_id: str) -> str:
    """
    Retrieves complete metadata and content for a specific educational standard by its identifier.

    This function performs a direct lookup of a standard using its unique identifier. The identifier
    can be either the standard's GUID (a unique UUID-like string) or its statement notation
    (the human-readable code like "3.NF.A.1" or "CCSS.Math.Content.3.NF.A.1").

    Args:
        standard_id: The unique identifier for the standard. This can be:
            - A GUID (e.g., "EA60C8D165F6481B90BFF782CE193F93"): The internal database ID
            - A statement notation (e.g., "3.NF.A.1"): The standard's notation code
            - An ASN identifier (e.g., "S21238682"): If available in the standard's metadata

            The function will attempt to match the identifier against multiple fields in the
            database. GUIDs provide the fastest and most reliable lookup. Statement notations
            may match multiple standards if the notation format is ambiguous.

    Returns:
        A JSON string containing a structured response with the following format:
        {
            "success": true|false,
            "results": [
                {
                    "_id": "standard_guid",
                    "content": "full standard text with hierarchy",
                    "subject": "Mathematics",
                    "education_levels": ["03"],
                    "statement_notation": "3.NF.A.1",
                    "standard_set_title": "Grade 3",
                    "asn_identifier": "S21238682",
                    "depth": 3,
                    "is_leaf": true,
                    "parent_id": "parent_guid",
                    "ancestor_ids": [...],
                    "child_ids": [...],
                    ... (all available metadata fields)
                }
            ],
            "message": "Retrieved standard details" or error message,
            "error_type": null or error type if success is false
        }

        On success, the results array contains exactly one standard object with all available
        metadata fields including hierarchy relationships, content, and identifiers. On error
        (e.g., standard not found), success is false and the message provides guidance, such as
        suggesting to use find_relevant_standards for searching.

    Raises:
        This function does not raise exceptions. All errors are returned as JSON responses
        with success=false and appropriate error messages.
    """
    return get_standard_details_impl(standard_id)


# Chat interface function - see complete implementation in Chat Interface Implementation section below

# Create Gradio interface
demo = gr.TabbedInterface(
    [
        gr.Interface(
            fn=find_relevant_standards,
            inputs=[
                gr.Textbox(label="Activity Description", placeholder="Describe a learning activity..."),
                gr.Number(label="Max Results", value=5, minimum=1, maximum=20),
                gr.Dropdown(
                    label="Grade (optional)",
                    choices=["", "K", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "09-12"],
                    value=None,
                    info="Select a grade level to filter results",
                ),
                gr.Textbox(label="Subject (optional)", placeholder="e.g., Mathematics, ELA-Literacy"),
            ],
            outputs=gr.JSON(label="Results"),
            title="Find Relevant Standards",
            description="Search for educational standards relevant to a learning activity.",
            api_name="find_relevant_standards",
        ),
        gr.Interface(
            fn=get_standard_details,
            inputs=gr.Textbox(label="Standard ID", placeholder="Enter a standard GUID or identifier..."),
            outputs=gr.JSON(label="Standard Details"),
            title="Get Standard Details",
            description="Retrieve full metadata for a specific standard by its ID.",
            api_name="get_standard_details",
        ),
        gr.ChatInterface(
            fn=chat_with_standards,  # See complete implementation in Chat Interface Implementation section
            type="messages",  # Required in Gradio 6 - uses OpenAI-style message format
            title="Chat with Standards",
            description="Ask questions about educational standards. The AI will use MCP tools to find relevant information.",
            examples=["What standards apply to teaching fractions in 3rd grade?", "Find standards for reading comprehension"],
        ),
    ],
    ["Search", "Lookup", "Chat"],
)

if __name__ == "__main__":
    demo.launch(mcp_server=True)
```
|
| 309 |
+
|
| 310 |
+
### Chat Interface Implementation

**Priority: First Priority** - The chat interface is a required deliverable for this sprint.

**Minimum Viable Implementation:**

- Use Hugging Face Inference API with a free/open model that supports MCP tool calling
- Model should be able to call the MCP tools (`find_relevant_standards` and `get_standard_details`)
- Chat function should integrate with the MCP server to answer questions about educational requirements

**Model Selection (Researched and Verified):**

**Selected Model: `Qwen/Qwen2.5-7B-Instruct`**

This model has been verified to:

- Support tool/function calling via Hugging Face Inference API
- Be available through Inference Providers (Together AI, Featherless AI)
- Have good performance for chat applications
- Support the OpenAI-compatible function calling format used by InferenceClient
- Be actively maintained and widely used (57.9M+ downloads as of research date)

**Important:** The model requires specifying an inference provider (e.g., `provider="together"` or `provider="nebius"`) when using InferenceClient.

**Alternative (for more complex queries):** `Qwen/Qwen2.5-72B-Instruct` (larger, more capable, available via Nebius provider)

**Implementation Details:**

The chat function will use Hugging Face's `InferenceClient` with function calling. Since the MCP tools (`find_relevant_standards` and `get_standard_details`) are exposed by the same Gradio app, we can call them directly as Python functions rather than making HTTP requests to the MCP server endpoint. This is more efficient and simpler.

**Complete Chat Function Implementation:**

```python
import os
import json
from typing import Any

from huggingface_hub import InferenceClient

from src.search import find_relevant_standards_impl
from src.lookup import get_standard_details_impl

# Initialize the Hugging Face Inference Client
# Use HF_TOKEN from environment (automatically available in Hugging Face Spaces)
# Provider is required for models that need Inference Providers (e.g., Together AI, Nebius)
HF_TOKEN = os.environ.get("HF_TOKEN")
client = InferenceClient(
    provider="together",  # Required: specifies the inference provider for tool calling
    token=HF_TOKEN,
)

# Define the function schemas in OpenAI format for the model
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "find_relevant_standards",
            "description": "Searches for educational standards relevant to a learning activity using semantic search. Use this when the user asks about standards for a specific activity, lesson, or educational objective.",
            "parameters": {
                "type": "object",
                "properties": {
                    "activity": {
                        "type": "string",
                        "description": "A natural language description of the learning activity, lesson, or educational objective. Be specific and descriptive."
                    },
                    "max_results": {
                        "type": "integer",
                        "description": "Maximum number of standards to return (1-20). Default is 5.",
                        "default": 5,
                        "minimum": 1,
                        "maximum": 20
                    },
                    "grade": {
                        "type": "string",
                        "description": "Optional grade level filter. Valid values: K, 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, or 09-12 for high school range.",
                        "enum": ["K", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "09-12"]
                    },
                    "subject": {
                        "type": "string",
                        "description": "Optional subject area filter (e.g., 'Mathematics', 'ELA-Literacy', 'Science')."
                    }
                },
                "required": ["activity"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_standard_details",
            "description": "Retrieves complete metadata and content for a specific educational standard by its identifier (GUID or statement notation). Use this when the user asks about a specific standard or wants details about a standard mentioned in previous results.",
            "parameters": {
                "type": "object",
                "properties": {
                    "standard_id": {
                        "type": "string",
                        "description": "The unique identifier for the standard. Can be a GUID (UUID-like string) or statement notation (e.g., '3.NF.A.1')."
                    }
                },
                "required": ["standard_id"]
            }
        }
    }
]

# Function registry for executing tool calls
AVAILABLE_FUNCTIONS = {
    "find_relevant_standards": find_relevant_standards_impl,
    "get_standard_details": get_standard_details_impl,
}

def chat_with_standards(message: str, history: list) -> str:
    """
    Chat function that uses MCP tools via Hugging Face Inference API with tool calling.

    This function integrates with Qwen2.5-7B-Instruct to answer questions about
    educational standards. The model can call find_relevant_standards and
    get_standard_details tools to retrieve information and provide accurate responses.

    Args:
        message: The user's current message/query
        history: Chat history in Gradio 6 messages format. Each message is a dict with
            "role" and "content" keys. In Gradio 6, content uses structured format:
            [{"type": "text", "text": "..."}, ...] for text content.

    Returns:
        The assistant's response as a string, incorporating information from MCP tools
        when relevant.
    """
    # Convert Gradio 6 history format to OpenAI messages format
    # Gradio 6 uses structured content: {"role": "user", "content": [{"type": "text", "text": "..."}]}
    messages = []
    if history:
        for msg in history:
            if isinstance(msg, dict):
                role = msg.get("role", "user")
                content = msg.get("content", "")

                # Handle Gradio 6 structured content format
                if isinstance(content, list):
                    # Extract text from content blocks
                    text_parts = []
                    for block in content:
                        if isinstance(block, dict) and block.get("type") == "text":
                            text_parts.append(block.get("text", ""))
                    content = " ".join(text_parts)

                messages.append({
                    "role": role,
                    "content": content,
                })

    # Add system message to guide the model
    system_message = {
        "role": "system",
        "content": "You are a helpful assistant that answers questions about educational standards. You have access to tools that can search for standards and retrieve standard details. Use these tools when users ask about standards, learning activities, or educational requirements. Always provide clear, helpful responses based on the tool results."
    }

    # Add current user message
    messages.append({"role": "user", "content": message})

    # Prepare full message list with system message
    full_messages = [system_message] + messages

    try:
        # Initial API call with tools
        response = client.chat.completions.create(
            model="Qwen/Qwen2.5-7B-Instruct",
            messages=full_messages,
            tools=TOOLS,
            tool_choice="auto",  # Let the model decide when to call functions
            temperature=0.7,
            max_tokens=1000,
        )

        response_message = response.choices[0].message

        # Check if model wants to call functions
        if response_message.tool_calls:
            # Add assistant's tool call request to messages
            full_messages.append(response_message)

            # Process each tool call
            for tool_call in response_message.tool_calls:
                function_name = tool_call.function.name
                function_args = json.loads(tool_call.function.arguments)

                # Execute the function
                if function_name in AVAILABLE_FUNCTIONS:
                    if function_name == "find_relevant_standards":
                        result = AVAILABLE_FUNCTIONS[function_name](
                            activity=function_args.get("activity", ""),
                            max_results=function_args.get("max_results", 5),
                            grade=function_args.get("grade"),
                            subject=function_args.get("subject"),
                        )
                    elif function_name == "get_standard_details":
                        result = AVAILABLE_FUNCTIONS[function_name](
                            standard_id=function_args.get("standard_id", ""),
                        )
                    else:
                        result = json.dumps({"error": f"Unknown function: {function_name}"})
                else:
                    result = json.dumps({"error": f"Function {function_name} not available"})

                # Add function result to messages
                full_messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "name": function_name,
                    "content": result,
                })

            # Get final response with function results
            final_response = client.chat.completions.create(
                model="Qwen/Qwen2.5-7B-Instruct",
                messages=full_messages,
                temperature=0.7,
                max_tokens=1000,
            )

            return final_response.choices[0].message.content
        else:
            # No tool calls, return direct response
            return response_message.content

    except Exception as e:
        # Error handling
        return f"I apologize, but I encountered an error: {str(e)}. Please try again or rephrase your question."
```

**Key Implementation Points:**

1. **Direct Function Calls**: Since the MCP tools are in the same Python process, we call the underlying implementation functions (`find_relevant_standards_impl` and `get_standard_details_impl`) directly rather than making HTTP requests to the MCP server endpoint.

2. **Tool Schema Conversion**: The MCP tools are converted to OpenAI function calling format, which is what `InferenceClient` expects. The schemas match the function signatures and docstrings.

3. **Tool Calling Workflow**:

   - First API call includes tools and lets the model decide if/when to call them
   - If the model requests tool calls, execute them and add results to the conversation
   - Second API call generates the final response incorporating tool results

4. **Error Handling**: All errors are caught and returned as user-friendly messages.

5. **Model Configuration**: Uses `Qwen/Qwen2.5-7B-Instruct` via the Together AI provider with `tool_choice="auto"` to let the model decide when to use tools.

6. **Gradio 6 History Format**: The chat function handles Gradio 6's structured content format, where content is a list of typed blocks (e.g., `[{"type": "text", "text": "..."}]`) rather than a simple string.

### Hugging Face Space Configuration

**CRITICAL: Space must be created in the MCP-1st-Birthday organization**

**Required Files:**

1. **`app.py`**: Main Gradio application entry point (as described above)

2. **`requirements.txt`**: Python dependencies

   - Extract from `pyproject.toml` or manually specify
   - Must include: `gradio[mcp]>=6.0.0`, `pinecone`, `python-dotenv`, `pydantic>=2.0.0`, `pydantic-settings>=2.0.0`, `huggingface_hub`
   - The `[mcp]` extra ensures all MCP dependencies are included
   - `huggingface_hub` is required for Inference API access in the chat interface

3. **`README.md`**: Updated with hackathon requirements (see README Requirements section below)

4. **`.env.example`**: Template for environment variables

   ```
   PINECONE_API_KEY=your_api_key_here
   PINECONE_INDEX_NAME=common-core-standards
   PINECONE_NAMESPACE=standards
   HF_TOKEN=your_huggingface_token_here
   # Note: MCP_SERVER_URL is not needed since we call functions directly
   ```

5. **Space Configuration** (via Hugging Face UI):
   - **Organization**: Must be `MCP-1st-Birthday` (create the Space in this organization)
   - SDK: `gradio`
   - Python version: 3.12+
   - Environment variables: Set `PINECONE_API_KEY`, `HF_TOKEN`, and other required variables in Space settings
   - **Visibility**: Can be public or private (both work for MCP servers)

### Hackathon Registration and Submission Requirements

**Before Building:**

1. **Join the Organization** (REQUIRED):

   - Go to https://huggingface.co/MCP-1st-Birthday
   - Click "Request to join this org" (top right)
   - Wait for approval (usually automatic or quick)

2. **Complete Registration** (REQUIRED):

   - Complete the official registration form (linked on the hackathon page)

3. **Team Members** (if applicable):
   - If working in a team (2-5 people), **all** members must:
     - Join the MCP-1st-Birthday organization individually
     - Complete the registration form individually
     - Be listed in the README with their Hugging Face usernames

**Submission Requirements (All Must Be Completed by November 30, 2025, 11:59 PM UTC):**

1. **Hugging Face Space** (REQUIRED):

   - Space must be in the `MCP-1st-Birthday` organization
   - Space must be functional and accessible
   - Code must be pushed to the Space repository

2. **README.md** (REQUIRED):

   - Must include track tag: `building-mcp-track-consumer`
   - Must include team member usernames (if team)
   - Must include demo video link
   - Must include social media post link
   - Must include clear documentation (see README Requirements section)

3. **Demo Video** (REQUIRED):

   - **Length:** 1-5 minutes
   - **Content:** Must show the MCP server in action, specifically demonstrating:
     - Integration with an MCP client (Claude Desktop, Cursor, or similar)
     - The MCP tools being used through the client
     - The Gradio web interface
     - The chat interface using MCP tools
   - **Hosting:** YouTube, Vimeo, or a similar platform
   - **Link:** Must be included in the README

4. **Social Media Post** (REQUIRED):

   - Post about your project on X/Twitter, LinkedIn, or similar
   - Include information about the project and hackathon
   - **Link:** Must be included in the README (not just the submission form)

5. **Functionality Requirements**:
   - Working MCP server (exposed via Gradio)
   - Integration with an MCP client (demonstrated in the video)
   - Published as a Hugging Face Space

**Judging Criteria (To Guide Implementation):**

Projects will be evaluated on:

1. **Completeness**: Space, video, documentation, and social link all present
2. **Functionality**: Works effectively, uses Gradio 6 and MCP features
3. **Real-world Impact**: Useful tool with potential for real-world application
4. **Creativity**: Innovative or original idea and implementation
5. **Design/UI-UX**: Polished, intuitive, and easy to use
6. **Documentation**: Well-communicated in the README and/or demo video

**Additional Considerations for Judging:**

- **Community Choice Award**: Based on social media engagement, Space interactions (Discussions tab), and Discord community engagement
- **Gradio 6 Features**: Use of Gradio 6 capabilities (MCP server, ChatInterface, etc.)
- **MCP Integration**: Effective use of the MCP protocol and tool exposure

### README Requirements

**File: `README.md`**

The README is a critical component of the hackathon submission and must include all required elements. Follow this structure:

#### 1. **Hackathon Track Tag (REQUIRED)**

**Must be included in the README metadata or prominently at the top:**

Add the track tag `building-mcp-track-consumer` to classify this as a Consumer MCP Server entry. This tag is **mandatory** for submission eligibility.

**Placement options:**

- In the README frontmatter (if using YAML frontmatter)
- As a tag/badge at the top of the README
- In a "Hackathon" or "Submission" section

**Example:**

```markdown
---
tags:
  - building-mcp-track-consumer
  - mcp
  - gradio
  - education
---
```

Or as a badge:

```markdown

```

#### 2. **Project Title and Description**

Clear, compelling explanation of:

- What the MCP server does
- Its purpose and capabilities
- Why it's useful for consumers (teachers, students, parents, etc.)
- Key features and benefits

#### 3. **Team Information (If Applicable)**

**If working in a team (2-5 members):**

- Include Hugging Face usernames of **all** team members
- Format: "Built by @username1, @username2, @username3"
- All team members must be members of the MCP-1st-Birthday organization

**If working solo:**

- Optional: Include your Hugging Face username
- Format: "Built by @username"

#### 4. **Usage Instructions**

**A. Gradio Web Interface:**

- How to use the web interface
- What each tab/component does
- Example queries or use cases

**B. MCP Client Integration (REQUIRED for demo video):**

- How to connect an MCP client (Claude Desktop, Cursor, etc.) to the Space
- MCP server URL: `https://your-space-name.hf.space/gradio_api/mcp/`
- Step-by-step configuration instructions
- Example MCP client configuration:

  ```json
  {
    "mcpServers": {
      "common-core": {
        "url": "https://your-space-name.hf.space/gradio_api/mcp/"
      }
    }
  }
  ```

- Screenshots of the MCP client showing the tools available

#### 5. **Setup Instructions**

- Local development setup (if applicable)
- Environment variables needed
- Installation steps
- How to run locally

#### 6. **Visual Documentation**

- **Screenshots or GIFs** of the interface in action
- Show the Gradio web interface
- Show MCP client integration (if possible)
- Demonstrate key features

#### 7. **Demo Video Link (REQUIRED)**

**Must include a link to a demo video that:**

- **Length:** 1-5 minutes
- **Content Requirements:**
  - Shows the MCP server **in action**
  - **Specifically demonstrates integration with an MCP client** (Claude Desktop, Cursor, or similar)
  - Shows the MCP tools being used through the client
  - Demonstrates the Gradio web interface
  - Shows the chat interface using MCP tools
- **Platform:** YouTube, Vimeo, or another video hosting service
- **Format:** Include the video link prominently in the README

**Example section:**

```markdown
## 🎥 Demo Video

Watch the demo video showing the MCP server in action:

[](video-url)

The video demonstrates:

- MCP server integration with Claude Desktop
- Using the Gradio web interface
- Chat interface with tool calling
```

#### 8. **Social Media Post Link (REQUIRED)**

**Must include a link to a social media post about the project:**

- Platform: X/Twitter, LinkedIn, or similar
- Content: Post about your project, the hackathon, and what you built
- **This link must be included in the README** (not just in the submission form)
- Format: "Share on [Twitter](link) | [LinkedIn](link)"

**Example section:**

```markdown
## 📱 Social Media

Check out our project announcement:

- [Twitter/X Post](your-twitter-post-url)
- [LinkedIn Post](your-linkedin-post-url)
```

#### 9. **Technical Details**

- Architecture overview
- Technologies used (Gradio 6, MCP, etc.)
- How the MCP tools work
- API documentation (if applicable)

#### 10. **Acknowledgments**

- Hackathon organizers
- Libraries and tools used
- Any inspiration or references

**README Checklist for Hackathon Submission:**

- [ ] Track tag `building-mcp-track-consumer` included
- [ ] Team member usernames listed (if team)
- [ ] Clear project description
- [ ] Usage instructions for web interface
- [ ] MCP client integration instructions
- [ ] Screenshots/GIFs included
- [ ] Demo video link included (1-5 minutes, shows MCP client integration)
- [ ] Social media post link included
- [ ] Setup/installation instructions
- [ ] Technical details documented

### File Changes Summary

**Files to Create:**

- `app.py`: Main Gradio application with MCP server and chat interface
- `requirements.txt`: Python dependencies for Hugging Face Space
- `.env.example`: Environment variable template
- `README.md`: Updated with hackathon requirements (or update existing)

**Files to Delete:**

- `server.py`: Replaced by `app.py`

**Files to Modify:**

- `pyproject.toml`: Update Gradio to `gradio[mcp]>=6.0.0`, add `huggingface_hub`, remove the standalone `mcp` dependency if present

**Files to Reference (Existing):**

- `src/search.py`: Contains the `find_relevant_standards_impl()` function
- `src/lookup.py`: Contains the `get_standard_details_impl()` function
- `src/pinecone_client.py`: Pinecone client implementation
- `src/mcp_config.py`: Configuration settings

## Technical Specifications (Verified from Documentation)

### Gradio MCP Server Syntax

**Enabling MCP Server:**

```python
demo.launch(mcp_server=True)
```

Or via environment variable:

```bash
export GRADIO_MCP_SERVER=True
```

**MCP Server Endpoints:**

- Schema: `{base_url}/gradio_api/mcp/schema`
- SSE Endpoint: `{base_url}/gradio_api/mcp/` (for MCP clients)

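Assuming the endpoint layout above, both URLs can be derived from a Space or localhost base URL with a small helper (the function name is illustrative and not part of the codebase):

```python
def mcp_endpoints(base_url: str) -> dict[str, str]:
    """Build the MCP schema and SSE endpoint URLs for a Gradio app."""
    base = base_url.rstrip("/")
    return {
        "schema": f"{base}/gradio_api/mcp/schema",  # human-readable tool listing
        "sse": f"{base}/gradio_api/mcp/",  # endpoint MCP clients connect to
    }

endpoints = mcp_endpoints("http://localhost:7860")
```

The same helper works for a deployed Space, e.g. `mcp_endpoints("https://your-space-name.hf.space")`.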
### Function Signature Requirements

Functions exposed as MCP tools must:

1. Have type hints for all parameters
2. Have detailed docstrings with Args and Returns sections
3. Return a value (not None, unless explicitly typed as `str | None`)

**Example:**

```python
def find_relevant_standards(
    activity: str,
    max_results: int = 5,
    grade: str | None = None,
    subject: str | None = None,
) -> str:
    """
    Returns educational standards relevant to the activity.

    Args:
        activity: Natural language description of the learning activity
        max_results: Maximum number of standards to return (default: 5)
        grade: Optional grade level filter (e.g., "K", "01", "05", "09")
        subject: Optional subject filter (e.g., "Mathematics", "ELA-Literacy")

    Returns:
        JSON string with structured response containing matching standards
    """
    # Implementation
```

### Repository Structure for Hugging Face Spaces

**Required Files:**

- `app.py` (or `main.py`): Entry point for the Gradio app
- `requirements.txt`: Python dependencies
- `README.md`: Project documentation

**Optional but Recommended:**

- `.env.example`: Environment variable template
- `src/`: Source code directory (already exists)

**Space Configuration:**

- SDK: Set to `gradio` in Hugging Face Space settings
- Python version: 3.12+ (matches project requirement)
- Environment variables: Configure in the Space settings UI

### Exposing Functions as MCP Endpoints

**Automatic Conversion:**

- Any function passed to `gr.Interface()` or `gr.ChatInterface()` is automatically exposed as an MCP tool
- Function name becomes the tool name
- Docstring becomes the tool description
- Type hints define the parameter schema

**API Name Customization:**

```python
gr.Interface(
    fn=find_relevant_standards,
    # ... inputs and outputs ...
    api_name="find_relevant_standards",  # Custom API endpoint name
)
```

**API Visibility Control:**

```python
gr.Interface(
    fn=find_relevant_standards,
    # ... inputs and outputs ...
    api_visibility="public",  # "public", "private", or "undocumented"
)
```

**API Description Customization:**

```python
gr.Interface(
    fn=find_relevant_standards,
    # ... inputs and outputs ...
    api_description="Custom description for MCP tool",  # Overrides docstring
)
```

## Chat Interface MCP Integration

The chat interface must:

1. Use a Hugging Face model that supports tool calling (e.g., `Qwen/Qwen2.5-7B-Instruct`)
2. Specify an inference provider (e.g., `provider="together"`) for the model
3. Handle Gradio 6's structured content format for chat history
4. Handle tool calling: detect tool requests, execute functions directly, return results to the model

**Implementation Notes:**

- The chat interface and MCP tools are in the same Gradio app
- We call the underlying Python functions directly rather than making HTTP requests to the MCP server
- The model must be configured to call tools when answering questions about educational standards
- **Gradio 6 History Format**: Content is now structured as `[{"type": "text", "text": "..."}]` rather than simple strings. The chat function must extract text from these content blocks.

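The structured-content handling described in the last note can be factored into a small standalone helper; this is an illustrative sketch (the spec's chat function inlines equivalent logic):

```python
def extract_text(content) -> str:
    """Flatten Gradio 6 structured message content into a plain string.

    Gradio 6 represents message content as a list of typed blocks, e.g.
    [{"type": "text", "text": "..."}]; earlier versions used plain strings,
    so both shapes are accepted here.
    """
    if isinstance(content, list):
        parts = [
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        ]
        return " ".join(parts)
    return content
```

Non-text blocks (images, files) are simply skipped, which is sufficient for a text-only chat loop.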
## Testing and Validation

### Local Testing

1. Run `app.py` locally:

   ```bash
   python app.py
   ```

2. Verify the MCP server is running:

   - Check the console output for the MCP server URL
   - Visit `http://localhost:7860/gradio_api/mcp/schema` to view the tools

3. Test the MCP client connection:

   - Configure Claude Desktop or Cursor to use `http://localhost:7860/gradio_api/mcp/`
   - Verify the tools appear in the client

4. Test the chat interface:

   - Interact with the chat interface in the Gradio UI
   - Verify it can call MCP tools and return educational standards information
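The local checks above can be smoke-tested from a second terminal while `app.py` is running (a sketch; the port and paths assume the default Gradio configuration):

```bash
# Fetch the MCP tool schema exposed by the running Gradio app
curl -s http://localhost:7860/gradio_api/mcp/schema

# Confirm the Gradio UI itself responds at the root URL
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:7860/
```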
### Hugging Face Space Deployment

1. Push the code to the Hugging Face Space
2. Verify the Space builds and runs successfully
3. Check the MCP server endpoint: `https://your-space-name.hf.space/gradio_api/mcp/schema`
4. Test the MCP client connection using the Space URL
5. Test the chat interface in the Space UI
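For step 4, an MCP client can be pointed at the Space with a configuration along these lines (a sketch; the server name and Space URL are placeholders, and the exact keys vary by client):

```json
{
  "mcpServers": {
    "common-core-standards": {
      "url": "https://your-space-name.hf.space/gradio_api/mcp/"
    }
  }
}
```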
## Risks and Assumptions

### Risks

1. **Chat Interface Complexity**: Implementing MCP tool calling with the Hugging Face Inference API may be complex and require additional research or libraries.

2. **Model/Provider Availability**: The selected model (`Qwen/Qwen2.5-7B-Instruct`) requires an inference provider (Together AI or Featherless AI). Provider availability and rate limits may affect performance.

3. **MCP Client Configuration**: Users may need guidance on configuring MCP clients to connect to the Space.

4. **Gradio 6 Breaking Changes**: Gradio 6 introduces several breaking changes, including the structured content format for ChatInterface history. The implementation must handle these changes correctly.

### Assumptions

1. Gradio 6.0.0+ includes all necessary MCP server functionality without additional packages (with the `gradio[mcp]` extras).

2. The existing `src/search.py` and `src/lookup.py` implementations can be called directly from Gradio functions without modification.

3. Hugging Face Spaces automatically sets `GRADIO_MCP_SERVER=True` when Gradio 6+ is detected.

4. The MCP server URL format for Hugging Face Spaces is `https://space-name.hf.space/gradio_api/mcp/`.

5. The `Qwen/Qwen2.5-7B-Instruct` model is available via the Together AI or Featherless AI inference providers and supports tool calling.

6. Gradio 6 ChatInterface passes history in a structured content format that must be parsed to extract text content.
## Dependencies

- **Gradio 6.0.0+**: Required for MCP server support (install with the `gradio[mcp]` extras)
- **Hugging Face Hub / Inference API**: For chat interface model access
  - Requires `provider="together"` or similar for models that need inference providers
  - `Qwen/Qwen2.5-7B-Instruct` is available via the Together AI and Featherless AI providers
- **Existing dependencies**: Pinecone, pydantic, etc. (unchanged)
## Deliverables

1. ✅ `app.py`: Gradio application with MCP server and chat interface
2. ✅ `requirements.txt`: Dependencies for the Hugging Face Space
3. ✅ Updated `README.md`: Hackathon-compliant documentation with all required elements
4. ✅ `.env.example`: Environment variable template
5. ✅ Updated `pyproject.toml`: Gradio 6.0.0+ dependency with MCP extras
6. ✅ Deleted `server.py`: Old FastMCP implementation removed
7. ✅ Working Hugging Face Space: Deployed in the MCP-1st-Birthday organization
8. ✅ Chat interface: Functional with MCP tool calling
9. ✅ Demo video: 1-5 minutes showing MCP client integration
10. ✅ Social media post: Link included in the README
## Hackathon Submission Checklist

**Before the Submission Deadline (November 30, 2025, 11:59 PM UTC):**

### Registration (Complete Before Building)

- [ ] Joined the MCP-1st-Birthday organization on Hugging Face
- [ ] Completed the official registration form
- [ ] All team members joined the organization and registered (if a team)

### Technical Implementation

- [ ] `app.py` created with the Gradio MCP server
- [ ] `requirements.txt` includes all dependencies
- [ ] Space deployed and functional in the MCP-1st-Birthday organization
- [ ] MCP server accessible at the `/gradio_api/mcp/` endpoint
- [ ] Chat interface working with tool calling
- [ ] All MCP tools (`find_relevant_standards`, `get_standard_details`) functional

### Documentation (README.md)

- [ ] Track tag `building-mcp-track-consumer` included
- [ ] Team member usernames listed (if a team)
- [ ] Clear project description and purpose
- [ ] Usage instructions for the Gradio web interface
- [ ] MCP client integration instructions with a configuration example
- [ ] Setup/installation instructions
- [ ] Screenshots or GIFs included
- [ ] **Demo video link included** (1-5 minutes, shows MCP client integration)
- [ ] **Social media post link included**
- [ ] Technical details documented

### Demo Video Requirements

- [ ] Video length: 1-5 minutes
- [ ] Shows the MCP server in action
- [ ] **Demonstrates integration with an MCP client** (Claude Desktop, Cursor, etc.)
- [ ] Shows MCP tools being used through the client
- [ ] Shows the Gradio web interface
- [ ] Shows the chat interface using MCP tools
- [ ] Video hosted on YouTube, Vimeo, or similar
- [ ] Link included in the README

### Social Media Post

- [ ] Post created on X/Twitter, LinkedIn, or similar
- [ ] Post mentions the project and the hackathon
- [ ] Link included in the README (not just the submission form)

### Space Configuration

- [ ] Space created in the `MCP-1st-Birthday` organization
- [ ] SDK set to `gradio`
- [ ] Python version 3.12+
- [ ] Environment variables configured (PINECONE_API_KEY, HF_TOKEN, etc.)
- [ ] Space is accessible and functional

### Quality Checks

- [ ] Code follows best practices
- [ ] Error handling implemented
- [ ] UI is polished and intuitive
- [ ] Documentation is clear and complete
- [ ] All features work as expected
## Next Steps After Sprint

1. **Create Demo Video**:

   - Record a 1-5 minute video showing MCP client integration
   - Demonstrate all key features
   - Upload to YouTube or Vimeo
   - Add the link to the README

2. **Create Social Media Post**:

   - Post about the project on X/Twitter or LinkedIn
   - Include project highlights and hackathon information
   - Add the link to the README

3. **Final README Polish**:

   - Ensure all required elements are present
   - Add screenshots/GIFs
   - Verify all links work
   - Check formatting and clarity

4. **Submit to the Hackathon**:

   - Verify all checklist items are complete
   - Submit before November 30, 2025, 11:59 PM UTC
   - Engage with the community (Discord, Space discussions)

5. **Future Enhancements** (Post-Hackathon):

   - Fine-tune the chat interface model selection and configuration
   - Add error handling and user feedback improvements
   - Consider adding more MCP tools or resources
   - Optimize performance and user experience
.agent/specs/003_gradio/tasks.md ADDED
@@ -0,0 +1,76 @@
# Spec Tasks

## Tasks

- [x] 1. Update Dependencies in pyproject.toml

  - [x] 1.1 Update the Gradio dependency from `gradio>=5.0.0,<6.0.0` to `gradio[mcp]>=6.0.0` to enable MCP server support
  - [x] 1.2 Add `huggingface_hub` to the dependencies list for Inference API access in the chat interface
  - [x] 1.3 Remove the standalone `mcp` package dependency (FastMCP is no longer used; Gradio 6 includes MCP support)
  - [x] 1.4 Verify all other dependencies remain unchanged (pinecone, python-dotenv, typer, requests, rich, loguru, pydantic>=2.0.0, pydantic-settings>=2.0.0)

- [x] 2. Create app.py with MCP Tool Wrapper Functions

  - [x] 2.1 Create `app.py` in the project root with imports: `gradio as gr`, `find_relevant_standards_impl` from `src.search`, `get_standard_details_impl` from `src.lookup`
  - [x] 2.2 Implement the `find_relevant_standards()` function with signature: `(activity: str, max_results: int = 5, grade: str | None = None, subject: str | None = None) -> str`
  - [x] 2.3 Add a comprehensive docstring to `find_relevant_standards()` following the spec format with Args and Returns sections, including all grade level codes and subject examples
  - [x] 2.4 Add input handling: convert empty-string `grade` and `subject` to `None`, and convert the `max_results` float to an int (Gradio's Number component returns a float)
  - [x] 2.5 Implement the `get_standard_details()` function with signature: `(standard_id: str) -> str`
  - [x] 2.6 Add a comprehensive docstring to `get_standard_details()` following the spec format with Args, Returns, and Raises sections
  - [x] 2.7 Delegate both functions to their respective `_impl` functions from the `src/` modules

- [x] 3. Create the Gradio Interface Structure with TabbedInterface

  - [x] 3.1 Create a `gr.Interface` for `find_relevant_standards` with inputs: `gr.Textbox` (activity), `gr.Number` (max_results, min=1, max=20, value=5), `gr.Dropdown` (grade, with choices including an empty string), `gr.Textbox` (subject, optional)
  - [x] 3.2 Configure the `find_relevant_standards` interface with a `gr.JSON` output, the title "Find Relevant Standards", a description, and `api_name="find_relevant_standards"`
  - [x] 3.3 Create a `gr.Interface` for `get_standard_details` with a `gr.Textbox` input (standard_id) and a `gr.JSON` output
  - [x] 3.4 Configure the `get_standard_details` interface with the title "Get Standard Details", a description, and `api_name="get_standard_details"`
  - [x] 3.5 Create a `gr.ChatInterface` with `fn=chat_with_standards` (a placeholder for now), `type="messages"`, a title, a description, and example prompts
  - [x] 3.6 Combine all three interfaces into a `gr.TabbedInterface` with tab labels: ["Search", "Lookup", "Chat"]
  - [x] 3.7 Add `if __name__ == "__main__": demo.launch(mcp_server=True)` to enable the MCP server

- [x] 4. Implement the Chat Interface with the Hugging Face Inference API

  - [x] 4.1 Add imports to `app.py`: `os`, `json`, `InferenceClient` from `huggingface_hub`
  - [x] 4.2 Initialize `InferenceClient` with `provider="together"` and `token=os.environ.get("HF_TOKEN")` at module level
  - [x] 4.3 Define a `TOOLS` list with OpenAI function-calling-format schemas for `find_relevant_standards` and `get_standard_details` matching the function signatures
  - [x] 4.4 Create an `AVAILABLE_FUNCTIONS` dict mapping function names to their `_impl` implementations
  - [x] 4.5 Implement the `chat_with_standards(message: str, history: list) -> str` function signature
  - [x] 4.6 Add Gradio 6 history format conversion: extract text from structured content blocks in the `[{"type": "text", "text": "..."}]` format
  - [x] 4.7 Build the message list with the system message, the converted history, and the current user message
  - [x] 4.8 Implement the tool-calling workflow: make the initial API call with tools, detect tool_calls, execute the functions, add the results, and get the final response
  - [x] 4.9 Add error handling with try/except returning user-friendly error messages
  - [x] 4.10 Configure the API calls: model `"Qwen/Qwen2.5-7B-Instruct"`, `tool_choice="auto"`, `temperature=0.7`, `max_tokens=1000`

- [x] 5. Create requirements.txt for Hugging Face Space Deployment

  - [x] 5.1 Create `requirements.txt` in the project root
  - [x] 5.2 Extract dependencies from `pyproject.toml` or manually specify: `gradio[mcp]>=6.0.0`, `pinecone`, `python-dotenv`, `pydantic>=2.0.0`, `pydantic-settings>=2.0.0`, `huggingface_hub`
  - [x] 5.3 Ensure the `[mcp]` extra is included in the Gradio dependency specification
  - [x] 5.4 Verify all required dependencies for both the MCP server and the chat interface are included

- [x] 6. Create the .env.example Template File

  - [x] 6.1 Create `.env.example` in the project root
  - [x] 6.2 Add `PINECONE_API_KEY=your_api_key_here` with a comment explaining the Pinecone API key requirement
  - [x] 6.3 Add `PINECONE_INDEX_NAME=common-core-standards` with the default value
  - [x] 6.4 Add `PINECONE_NAMESPACE=standards` with the default value
  - [x] 6.5 Add `HF_TOKEN=your_huggingface_token_here` with a comment explaining the Hugging Face token requirement for the chat interface
  - [x] 6.6 Add a comment noting that `MCP_SERVER_URL` is not needed since the functions are called directly

- [x] 7. Create README.md with Code and Documentation Sections

  - [x] 7.1 Create `README.md` in the project root with the hackathon track tag `building-mcp-track-consumer` in frontmatter or badge format
  - [x] 7.2 Add the project title and a description explaining the MCP server's purpose, capabilities, and target users (teachers, students, parents)
  - [x] 7.3 Add a team information section (a placeholder for the username if solo, or a format for team members)
  - [x] 7.4 Add a "Usage Instructions" section with subsection A: Gradio Web Interface usage, tab descriptions, and example queries
  - [x] 7.5 Add subsection B: MCP Client Integration instructions with the MCP server URL format, step-by-step configuration, an example JSON config, and a note about screenshots
  - [x] 7.6 Add a "Setup Instructions" section with local development setup, environment variables, installation steps, and how to run locally
  - [x] 7.7 Add a "Technical Details" section with an architecture overview, technologies used (Gradio 6, MCP), how the MCP tools work, and API documentation references
  - [x] 7.8 Add a "Visual Documentation" section placeholder noting that screenshots/GIFs should be added (but do not create actual media files)
  - [x] 7.9 Add an "Acknowledgments" section with the hackathon organizers, libraries/tools used, and inspiration/references
  - [x] 7.10 Add placeholder sections for "Demo Video" and "Social Media" with a note that links will be added separately (excluded from code tasks)

- [x] 8. Delete the Old FastMCP server.py File

  - [x] 8.1 Delete `server.py` from the project root (replaced by `app.py`)
  - [x] 8.2 Verify no other files reference `server.py` in a way that would break
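The detect-and-dispatch step in task 4.8 can be sketched roughly as follows (a sketch, not the actual `app.py`; the `fake_tool_call` shape mirrors OpenAI-style tool calls, and the `_impl` stubs here are hypothetical stand-ins for the real functions in `src/search.py` and `src/lookup.py`):

```python
import json


# Hypothetical stand-ins for the real implementations in src/.
def find_relevant_standards_impl(activity, max_results=5, grade=None, subject=None):
    return json.dumps({"query": activity, "results": []})


def get_standard_details_impl(standard_id):
    return json.dumps({"id": standard_id})


# Maps tool names (as declared in the TOOLS schemas) to callables.
AVAILABLE_FUNCTIONS = {
    "find_relevant_standards": find_relevant_standards_impl,
    "get_standard_details": get_standard_details_impl,
}


def execute_tool_calls(tool_calls) -> list[dict]:
    """Run each requested tool and format the results as 'tool' role messages."""
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        # Arguments arrive as a JSON-encoded string in the tool call.
        args = json.loads(call["function"]["arguments"])
        output = AVAILABLE_FUNCTIONS[name](**args)
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": output}
        )
    return results
```

These tool-result messages are appended to the conversation before the second API call that produces the final user-facing answer.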
.cursor/commands/{spec_draft.md → draft_spec.md} RENAMED
@@ -1,37 +1,38 @@
 I am working on developing a comprehensive spec document for the next development sprint.
 
 <goal>
-Solidify the current
+Solidify the current spec document into a comprehensive specification for the next development sprint through iterative refinement.
 
 The spec draft represents the rough notes and ideas for the next sprint. These notes are likely incomplete and require additional details and decisions to obtain sufficient information to move forward with the sprint.
 
-READ:
+READ: `<finalized_spec_requirements>` to see the complete requirements for the finalized spec. The goal is to reach the level of specificity and clarity required to create this final spec.
 </goal>
 
 <process>
 <overview>
-
+Iteratively carry out the following steps to progressively refine the requirements for this sprint. Use `Requests for Input` only to gather information that cannot be inferred from the user's selection of a Recommendation; do not ask to confirm details already specified by a selected option. The initial spec draft may be a loose assortment of notes, ideas, and thoughts; treat it accordingly in the first round.
 
 First round: produce a response that includes Recommendations and Requests for Input. The user will reply by selecting exactly one option per Recommendation (or asking for refinement if none fit) and answering only those questions that cannot be inferred from selected options.
 
-After each user response: update the
+After each user response: update the spec draft to incorporate the selected options with minimal, focused edits. Remove any conflicting or superseded information made obsolete by the selection. Avoid unrelated formatting or editorial changes.
 
-Repeat this back-and-forth until ambiguity is removed and the draft aligns with the requirements in `@.cursor/commands/finalize_spec.md`.
+Repeat this back-and-forth until ambiguity is removed and the draft aligns with the requirements in `<finalized_spec_requirements>`.
 </overview>
 
 <steps>
-- READ the
+- READ the spec draft.
 - IDENTIFY anything in the spec draft that is confusing, conflicting, unclear, or missing. Identify important decisions that need to be made.
 - REVIEW the current state of the project to fully understand how these new requirements fit into what already exists.
+- RESEARCH any technical questions, library options, or implementation approaches that need to be resolved. Conduct this research during spec development so that specific, concrete guidance can be included in the final spec rather than leaving research tasks for the implementer.
 - RECOMMEND specific additions or updates to the draft spec to resolve confusion, add clarity, fill gaps, or add specificity. Recommendations may provide a single option when appropriate or multiple options when needed. Each Recommendation expects selection of one and only one option by the user.
 - ASK targeted questions to acquire details, decisions, or preferences from the user.
-- APPLY the user's selections: make minimal, localized edits to the
+- APPLY the user's selections: make minimal, localized edits to the spec draft to incorporate the chosen options and remove conflicting content. Incorporate all information contained in the selected options; do not omit details. Do not change unrelated text, structure, or formatting.
 - REFINE: if the user rejects the provided options, revise the Recommendations based on feedback and repeat selection and apply.
 </steps>
 
-<end_conditions>
-- Continue this process until the draft is unambiguous and conforms to `@.cursor/commands/finalize_spec.md`, or the user directs you to do otherwise.
-- Do not stop after a single round unless the draft already satisfies all requirements in `@.cursor/commands/finalize_spec.md`.
+<end_conditions> - Continue this process until the draft is unambiguous and conforms to `<finalized_spec_requirements>`, or the user directs you to do otherwise. - Do not stop after a single round unless the draft already satisfies all requirements in `<finalized_spec_requirements>`.
 </end_conditions>
 </process>
 
@@ -42,6 +43,7 @@ READ: @.cursor/commands/finalize_spec.md to see the complete requirements for th
 Using incrementing section numbers are essential for helping the user quickly reference specific options or questions in their responses.
 Responses must strictly follow the Format section. Include only the specified sections and no additional commentary or subsections.
 The agent is responsible for updating the spec draft after each user response.
+
 </overview>
 
 <guidelines>
@@ -52,7 +54,7 @@ READ: @.cursor/commands/finalize_spec.md to see the complete requirements for th
 - Do not ask confirmation questions about facts stated by options; assume the selected option is authoritative.
 - Use numbered sections that increment.
 - Use incrementing decimals for recommendation options and request for input questions.
-- After the user selects options, apply minimal, focused edits to the
+- After the user selects options, apply minimal, focused edits to the spec draft reflecting only those selections. Remove conflicting or superseded content. Avoid broad formatting or editorial changes to unrelated content.
 - Do not clutter options or questions with information already clear and unambiguous from the current draft.
 - Do not add subsections beyond those defined in the Format.
 </guidelines>
@@ -60,7 +62,9 @@ READ: @.cursor/commands/finalize_spec.md to see the complete requirements for th
 <format>
 
 # Recommendations
+
 ## 1: Section Title
+
 Short overview providing background on the section.
 
 **Option 1.1**
@@ -70,16 +74,20 @@ Specifics of the first option.
 Specifics of the second option.
 
 ## 2: Section Title
+
 Short overview providing background on the section.
 
 **Option 2.1**
 Specifics of the first option.
 
 # Request for Input
+
 ## 3: Section Title
+
 Short overview providing background on the section.
 
 **Questions**
+
 - 3.1 Some question.
 - 3.2 Another question.
 
@@ -99,14 +107,10 @@ Short overview providing background on the section.
 7 Directions that indicate the users preference in response to the question.
 8 Clear directive in response to the question.
 ```
+
 </user_selection_format>
 
-<selection_and_editing_rules>
-- One and only one option must be selected per Recommendation. If none fit, request refinement.
-- Apply edits narrowly: change only text directly impacted by the chosen option(s).
-- Incorporate all information from the selected options into the draft.
-- Remove or rewrite conflicting statements made obsolete by the selection.
-- Preserve unrelated content and overall formatting; do not perform wide editorial passes.
+<selection_and_editing_rules> - One and only one option must be selected per Recommendation. If none fit, request refinement. - Apply edits narrowly: change only text directly impacted by the chosen option(s). - Incorporate all information from the selected options into the draft. - Remove or rewrite conflicting statements made obsolete by the selection. - Preserve unrelated content and overall formatting; do not perform wide editorial passes.
 </selection_and_editing_rules>
 </response>
 
@@ -115,8 +119,41 @@ Short overview providing background on the section.
 </guardrails>
 
 <finalize_spec_compliance_checklist>
-
+- [ ] All information required by `<finalized_spec_requirements>` is present.
 - [ ] Requirements are testable and unambiguous.
+- [ ] All research completed and findings documented in the spec.
+- [ ] All decisions made and documented; no decision-making left for the implementer.
+- [ ] No research tasks or decision points left for the implementer (or explicitly documented as blockers).
 - [ ] Risks, dependencies, and assumptions captured.
 - [ ] Approval received.
-
+
+</finalize_spec_compliance_checklist>
+
+<finalized_spec_requirements>
+The spec acts as the comprehensive source of truth for this sprint and should include all the necessary context and technical details to implement this sprint. It should leave no ambiguity for important details necessary to properly implement the changes required.
+
+The spec.md will act as a reference for an LLM coding agent responsible for completing this sprint.
+
+The spec must not include any directions for the implementer to conduct research or make decisions. All research must be completed during spec development, and all decisions must be made and documented in the spec. If there are pending decisions or research that cannot be completed during spec development, these must be explicitly documented as blockers or prerequisites that prevent implementation from proceeding.
+
+The spec should include the following information if applicable:
+
+- An overview of the changes implemented in this sprint.
+- User stories for the new functionality, if applicable.
+- An outline of any new data models proposed.
+- Any other technical details determined in the spec_draft or related conversations.
+- Specific filepaths for any files that need to be added, edited, or deleted as part of this sprint.
+- Specific files or modules relevant to this sprint.
+- Details on how things should function, such as a function, workflow, or other process.
+- Descriptions of what any new functions, services, etc. are supposed to do.
+- Any reasoning or rationale behind decisions, preferences, or changes that provides context for the sprint and its changes.
+- Any other information required to properly understand this sprint, the desired changes, the expected deliverables, or important technical details.
+
+Strive to retain all the final decisions and implementation details provided in the spec draft and related conversations. Cleaning and organizing these raw notes is desirable, but do not exclude or leave out information provided in the spec draft if it is relevant to this sprint. If there is information in the spec draft that is outdated and negated or revised by further direction in the draft or related conversation, you should leave that stale information out of the final spec.
+
+The spec should have all the information a junior developer needs to complete this sprint. They should be able to independently find answers to any questions they have about this sprint and how to implement it in this document. The spec defines exactly what should be implemented and how; it does not require the implementer to make decisions or conduct research. All technical research, library selection, design decisions, and implementation approaches must be resolved and documented in the spec before implementation begins.
+
+**Code Examples in Specs:**
+Use code examples sparingly and only when they provide clarity that text cannot achieve. Keep examples small and focused on specific scenarios, usage patterns, or situations that are difficult to express concisely in prose. Prefer code examples when they are more explicit or concise than equivalent text descriptions. Avoid code examples for obvious implementations or concepts that can be clearly explained in bullet points or brief text. If explicitly directed, longer code examples are appropriate. The guiding principle is to maintain a balance of conciseness, precision, and comprehensiveness: choose the format (code or text) that best achieves this balance.
+</finalized_spec_requirements>
.cursor/commands/finalize_spec.md
DELETED
|
@@ -1,21 +0,0 @@
|
|
| 1 |
-
Convert the spec_draft document into a final draft in the spec.md file.
|
| 2 |
-
|
| 3 |
-
The spec acts as the comprehensive source of truth for this sprint and should include all the necessary context and technical details to implement this sprint. It should leave no ambiguity for important details necessary to properly implement the changes required.
|
| 4 |
-
|
| 5 |
-
The spec.md will act as a reference for an LLM coding agent responsible for completing this sprint.
|
| 6 |
-
|
| 7 |
-
The spec should include the following information if applicable:
|
| 8 |
-
- An overview of the changes implemented in this sprint.
|
| 9 |
-
- User stories for the new functionality, if applicable.
|
| 10 |
-
- An outline of any new data models proposed.
|
| 11 |
-
- Any other technical details determined in the spec_draft or related conversations.
|
| 12 |
-
- Specific filepaths for any files that need to be added, edited, or deleted as part of this sprint.
|
| 13 |
-
- Specific files or modules relevant to this sprint.
|
| 14 |
-
- Details on how things should function such as a function, workflow, or other process.
|
| 15 |
-
- Descriptions of what any new functions, services, etc. are supposed to do.
|
| 16 |
-
- Any reasoning or rationale behind decisions, preferences, or changes that provides context for the sprint and its changes.
|
| 17 |
-
- Any other information required to properly understand this sprint, the desired changes, the expected deliverables, or important technical details.
|
| 18 |
-
|
| 19 |
-
Strive to retain all the final decisions and implementation details provided in the spec_draft and related conversations. Cleaning and organizing these raw notes is desirable, but do not exclude or leave out information provided in the spec_draft if it is relevant to this sprint. If there is information in the spec_draft that is outdated and negated or revised by further direction in the draft or related conversation, you should leave that stale information out of the final spec.
|
| 20 |
-
|
| 21 |
-
The spec should have all the information a junior developer needs to complete this sprint. They should be able to independently find answers to any questions they have about this sprint and how to implement it in this document.
|
.cursor/rules/standards/code_style/readme.mdc
ADDED
|
@@ -0,0 +1,99 @@
|
| 1 |
+
---
|
| 2 |
+
description: Guidelines for writing README documents.
|
| 3 |
+
alwaysApply: false
|
| 4 |
+
---
|
| 5 |
+
|
| 6 |
+
# README Generation Rules
|
| 7 |
+
|
| 8 |
+
You are an expert technical writer and software engineer. When asked to write, update, or critique a README.md, adhere strictly to the following principles, style guide, and structure.
|
| 9 |
+
|
| 10 |
+
## 1. Core Principles
|
| 11 |
+
|
| 12 |
+
- **The 15-Minute Rule:** The primary goal is to minimize "Time-to-Hello-World." A user must be able to install and run the project (or a specific example) within 15 minutes.
|
| 13 |
+
- **Truth Over Fluff:** NEVER hallucinate features. If a feature is planned, put it in a "Roadmap" section. If code does not exist to support a claim, do not write it.
|
| 14 |
+
- **Inverted Pyramid:** Place the most critical information (What is it? How do I run it?) at the top.
|
| 15 |
+
|
| 16 |
+
## 2. Tone & Style Guidelines
|
| 17 |
+
|
| 18 |
+
- **No Marketing Fluff:** DELETE adjectives like "seamless," "easy," "robust," "blazing fast," "state-of-the-art," and "comprehensive." Let the code prove its value.
|
| 19 |
+
- **Active Voice:** Use the imperative mood.
|
| 20 |
+
- _Bad:_ "The script can be run by the user..."
|
| 21 |
+
- _Good:_ "Run the script..."
|
| 22 |
+
- **Present Tense:** Avoid "will."
|
| 23 |
+
- _Bad:_ "Clicking save will write the file."
|
| 24 |
+
- _Good:_ "Click save. The system writes the file."
|
| 25 |
+
- **Second Person:** Address the user as "you" (implied).
|
| 26 |
+
- **Concrete & Specific:**
|
| 27 |
+
- _Bad:_ "Requires a database."
|
| 28 |
+
- _Good:_ "Requires PostgreSQL v14+ running on port 5432."
|
| 29 |
+
|
| 30 |
+
## 3. Formatting Standards (Markdown)
|
| 31 |
+
|
| 32 |
+
- **Semantic Headers:** DO NOT skip header levels (e.g., jumping from H1 to H3).
|
| 33 |
+
- **Code Blocks:**
|
| 34 |
+
- ALWAYS use fenced code blocks (three backticks) with a language identifier (e.g., `bash`, `python`).
|
| 35 |
+
- NEVER use indentation (4 spaces) for code blocks.
|
| 36 |
+
- **Inline Code:** Use single backticks for: filenames, directories, variable names, methods, and boolean values (`true`/`false`). Do not bold these.
|
| 37 |
+
- **Lists:**
|
| 38 |
+
- Use hyphens `-` for bullet points.
|
| 39 |
+
- Use `1.` for ALL numbered list items (Markdown renders the order automatically).
|
| 40 |
+
- **Alerts:** Use GitHub-standard alert syntax:
|
| 41 |
+
```markdown
|
| 42 |
+
> [!NOTE]
|
| 43 |
+
> text
|
| 44 |
+
```
|
| 45 |
+
- **Links:** Use descriptive link text. Never use "click here."
|
| 46 |
+
|
| 47 |
+
## 4. Structural Template & Logic
|
| 48 |
+
|
| 49 |
+
Determine if the project is a **Library** (code used by other code) or an **Application** (standalone tool).
|
| 50 |
+
|
| 51 |
+
### Section 1: The Hook (Required)
|
| 52 |
+
|
| 53 |
+
- **H1:** Project Name.
|
| 54 |
+
- **Description:** ONE paragraph.
|
| 55 |
+
1. What is it? (Concrete noun).
|
| 56 |
+
2. What problem does it solve?
|
| 57 |
+
3. Why is it distinct? (Metrics, not adjectives).
|
| 58 |
+
- **Bad Description:** "A holistic solution for data."
|
| 59 |
+
- **Good Description:** "A Python library that converts CSV to JSON without loading the file into memory."
|
| 60 |
+
|
| 61 |
+
### Section 2: Visuals (Required)
|
| 62 |
+
|
| 63 |
+
- Include a placeholder for a screenshot, GIF, or terminal output.
|
| 64 |
+
- ``
|
| 65 |
+
|
| 66 |
+
### Section 3: Installation & Usage (Context Dependent)
|
| 67 |
+
|
| 68 |
+
**IF LIBRARY (e.g., npm, pip):**
|
| 69 |
+
|
| 70 |
+
1. **Install:** `pip install package-name`
|
| 71 |
+
2. **Quick Start:** Provide a "Copy-Paste" block.
|
| 72 |
+
- MUST be a self-contained code snippet that actually runs.
|
| 73 |
+
- MUST use real API method names found in the context.
|
| 74 |
+
- DO NOT use generic placeholders like `foo` or `bar` unless necessary.
|
| 75 |
+
|
| 76 |
+
**IF APPLICATION (e.g., Web App, CLI):**
|
| 77 |
+
|
| 78 |
+
1. **Prerequisites:** Strict list (e.g., "Node v18+", "Docker").
|
| 79 |
+
2. **Setup:**
|
| 80 |
+
```bash
|
| 81 |
+
git clone [repo]
|
| 82 |
+
npm install
|
| 83 |
+
cp .env.example .env
|
| 84 |
+
npm run start
|
| 85 |
+
```
|
| 86 |
+
|
| 87 |
+
### Section 4: Deep Dive (Optional but Recommended)
|
| 88 |
+
|
| 89 |
+
- **Configuration:** Env variables, flags.
|
| 90 |
+
- **Architecture:** High-level diagram explanation (if complex).
|
| 91 |
+
- **Roadmap:** Planned features (clearly marked as "Future").
|
| 92 |
+
|
| 93 |
+
## 5. "AI-Proofing" Verification Checklist
|
| 94 |
+
|
| 95 |
+
Before finalizing the output, perform these checks:
|
| 96 |
+
|
| 97 |
+
1. **Hallucination Check:** Do the installation commands actually exist in the codebase (e.g., is there a `requirements.txt` or `package.json` matching the install instructions)?
|
| 98 |
+
2. **API Check:** Do the methods used in the "Quick Start" match the actual function definitions in the provided files?
|
| 99 |
+
3. **Adverb Purge:** Remove all adverbs ending in "ly" (e.g., "automatically," "intuitively") unless essential for technical accuracy.
|
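Checklist item 3 (the adverb purge) can be approximated with an automated pass. A minimal sketch — the regex and the exception list are illustrative assumptions, not part of the rule file:

```python
import re

def flag_ly_adverbs(text: str) -> list[str]:
    """Flag words ending in 'ly' for manual review (crude heuristic)."""
    # Common 'ly' words that are not stylistic adverbs; extend as needed.
    allowed = {"only", "apply", "reply", "early", "assembly", "family"}
    words = re.findall(r"\b[A-Za-z]+ly\b", text)
    return [w for w in words if w.lower() not in allowed]

print(flag_ly_adverbs("The tool automatically and intuitively converts files early."))
# → ['automatically', 'intuitively']
```

A heuristic like this only surfaces candidates; the rule still calls for a human judgment on whether each adverb is essential for technical accuracy.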
.env.example
CHANGED
|
@@ -1,7 +1,13 @@
|
|
| 1 |
-
# Common Standards Project API Configuration
|
| 2 |
-
CSP_API_KEY=your_generated_api_key_here
|
| 3 |
-
|
| 4 |
# Pinecone Configuration
|
| 5 |
-
|
| 6 |
PINECONE_INDEX_NAME=common-core-standards
|
| 7 |
PINECONE_NAMESPACE=standards
|
| 1 |
# Pinecone Configuration
|
| 2 |
+
# Get your API key from https://app.pinecone.io/
|
| 3 |
+
PINECONE_API_KEY=your_api_key_here
|
| 4 |
PINECONE_INDEX_NAME=common-core-standards
|
| 5 |
PINECONE_NAMESPACE=standards
|
| 6 |
+
|
| 7 |
+
# Hugging Face Configuration
|
| 8 |
+
# Get your token from https://huggingface.co/settings/tokens
|
| 9 |
+
# Required for chat interface Inference API access
|
| 10 |
+
HF_TOKEN=your_huggingface_token_here
|
| 11 |
+
|
| 12 |
+
# Note: MCP_SERVER_URL is not needed since we call functions directly
|
| 13 |
+
# The MCP server is automatically exposed by Gradio when mcp_server=True
|
app.py
ADDED
|
@@ -0,0 +1,419 @@
|
| 1 |
+
"""Gradio MCP server for Common Core Standards search and lookup."""
|
| 2 |
+
|
| 3 |
+
import os
|
| 4 |
+
import json
|
| 5 |
+
from typing import Any
|
| 6 |
+
|
| 7 |
+
from dotenv import load_dotenv
|
| 8 |
+
import gradio as gr
|
| 9 |
+
from huggingface_hub import InferenceClient
|
| 10 |
+
|
| 11 |
+
# Load environment variables from .env file
|
| 12 |
+
load_dotenv()
|
| 13 |
+
|
| 14 |
+
from src.search import find_relevant_standards_impl
|
| 15 |
+
from src.lookup import get_standard_details_impl
|
| 16 |
+
|
| 17 |
+
# Initialize the Hugging Face Inference Client
|
| 18 |
+
# Use HF_TOKEN from environment (automatically available in Hugging Face Spaces)
|
| 19 |
+
# Provider is required for models that need Inference Providers (e.g., Together AI, Nebius)
|
| 20 |
+
HF_TOKEN = os.environ.get("HF_TOKEN")
|
| 21 |
+
client = InferenceClient(
|
| 22 |
+
provider="together", # Required: specifies the inference provider for tool calling
|
| 23 |
+
token=HF_TOKEN
|
| 24 |
+
)
|
| 25 |
+
|
| 26 |
+
# Define the function schemas in OpenAI format for the model
|
| 27 |
+
TOOLS = [
|
| 28 |
+
{
|
| 29 |
+
"type": "function",
|
| 30 |
+
"function": {
|
| 31 |
+
"name": "find_relevant_standards",
|
| 32 |
+
"description": "Searches for educational standards relevant to a learning activity using semantic search. Use this when the user asks about standards for a specific activity, lesson, or educational objective.",
|
| 33 |
+
"parameters": {
|
| 34 |
+
"type": "object",
|
| 35 |
+
"properties": {
|
| 36 |
+
"activity": {
|
| 37 |
+
"type": "string",
|
| 38 |
+
"description": "A natural language description of the learning activity, lesson, or educational objective. Be specific and descriptive."
|
| 39 |
+
},
|
| 40 |
+
"max_results": {
|
| 41 |
+
"type": "integer",
|
| 42 |
+
"description": "Maximum number of standards to return (1-20). Default is 5.",
|
| 43 |
+
"default": 5,
|
| 44 |
+
"minimum": 1,
|
| 45 |
+
"maximum": 20
|
| 46 |
+
},
|
| 47 |
+
"grade": {
|
| 48 |
+
"type": "string",
|
| 49 |
+
"description": "Optional grade level filter. Valid values: K, 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11, 12, or 09-12 for high school range.",
|
| 50 |
+
"enum": ["K", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "09-12"]
|
| 51 |
+
}
|
| 52 |
+
},
|
| 53 |
+
"required": ["activity"]
|
| 54 |
+
}
|
| 55 |
+
}
|
| 56 |
+
},
|
| 57 |
+
{
|
| 58 |
+
"type": "function",
|
| 59 |
+
"function": {
|
| 60 |
+
"name": "get_standard_details",
|
| 61 |
+
"description": "Retrieves complete metadata and content for a specific educational standard by its GUID (_id field). Use this when you have the exact GUID from a previous search result. This function ONLY accepts GUIDs, not statement notations or other identifiers. For searching by content or notation, use find_relevant_standards instead.",
|
| 62 |
+
"parameters": {
|
| 63 |
+
"type": "object",
|
| 64 |
+
"properties": {
|
| 65 |
+
"standard_id": {
|
| 66 |
+
"type": "string",
|
| 67 |
+
"description": "The standard's GUID (_id field) - must be a valid GUID format (e.g., 'EA60C8D165F6481B90BFF782CE193F93'). This function does NOT accept statement notations or other identifier formats."
|
| 68 |
+
}
|
| 69 |
+
},
|
| 70 |
+
"required": ["standard_id"]
|
| 71 |
+
}
|
| 72 |
+
}
|
| 73 |
+
}
|
| 74 |
+
]
|
| 75 |
+
|
| 76 |
+
def find_relevant_standards(
|
| 77 |
+
activity: str,
|
| 78 |
+
max_results: int = 5,
|
| 79 |
+
grade: str | None = None,
|
| 80 |
+
) -> str:
|
| 81 |
+
"""
|
| 82 |
+
Searches for educational standards relevant to a learning activity using semantic search.
|
| 83 |
+
|
| 84 |
+
This function performs a vector similarity search over the Common Core Standards database
|
| 85 |
+
to find standards that match the described learning activity. Results are ranked by relevance
|
| 86 |
+
and can be filtered by grade level.
|
| 87 |
+
|
| 88 |
+
Args:
|
| 89 |
+
activity: A natural language description of the learning activity, lesson, or educational
|
| 90 |
+
objective. Examples: "teaching fractions to third graders", "reading comprehension
|
| 91 |
+
activities", "solving quadratic equations". This is the primary search query and should
|
| 92 |
+
be descriptive and specific for best results.
|
| 93 |
+
|
| 94 |
+
max_results: The maximum number of standards to return. Must be between 1 and 20.
|
| 95 |
+
Default is 5. Higher values return more results but may include less relevant matches.
|
| 96 |
+
|
| 97 |
+
grade: Optional grade level filter. Must be one of the following valid grade level codes:
|
| 98 |
+
- "K" for Kindergarten
|
| 99 |
+
- "01" for Grade 1
|
| 100 |
+
- "02" for Grade 2
|
| 101 |
+
- "03" for Grade 3
|
| 102 |
+
- "04" for Grade 4
|
| 103 |
+
- "05" for Grade 5
|
| 104 |
+
- "06" for Grade 6
|
| 105 |
+
- "07" for Grade 7
|
| 106 |
+
- "08" for Grade 8
|
| 107 |
+
- "09" for Grade 9
|
| 108 |
+
- "10" for Grade 10
|
| 109 |
+
- "11" for Grade 11
|
| 110 |
+
- "12" for Grade 12
|
| 111 |
+
- "09-12" for high school range (when standards span multiple high school grades)
|
| 112 |
+
|
| 113 |
+
If None or empty string, no grade filtering is applied and standards from all grade
|
| 114 |
+
levels may be returned. The grade filter uses exact matching against the education_levels
|
| 115 |
+
metadata field in the database.
|
| 116 |
+
|
| 117 |
+
Returns:
|
| 118 |
+
A JSON string containing a structured response with the following format:
|
| 119 |
+
{
|
| 120 |
+
"success": true|false,
|
| 121 |
+
"results": [
|
| 122 |
+
{
|
| 123 |
+
"_id": "standard_guid",
|
| 124 |
+
"content": "full standard text with hierarchy",
|
| 125 |
+
"subject": "Mathematics",
|
| 126 |
+
"education_levels": ["03"],
|
| 127 |
+
"statement_notation": "3.NF.A.1",
|
| 128 |
+
"standard_set_title": "Grade 3",
|
| 129 |
+
"score": 0.85
|
| 130 |
+
},
|
| 131 |
+
...
|
| 132 |
+
],
|
| 133 |
+
"message": "Found N matching standards" or error message,
|
| 134 |
+
"error_type": null or error type if success is false
|
| 135 |
+
}
|
| 136 |
+
|
| 137 |
+
On success, the results array contains up to max_results standards, sorted by relevance
|
| 138 |
+
score (highest first). Each result includes the full standard content, metadata, and
|
| 139 |
+
relevance score. On error, success is false and an error message describes the issue.
|
| 140 |
+
"""
|
| 141 |
+
# Handle empty string from dropdown (convert to None)
|
| 142 |
+
if grade == "":
|
| 143 |
+
grade = None
|
| 144 |
+
|
| 145 |
+
# Ensure max_results is an integer (gr.Number returns float by default)
|
| 146 |
+
max_results = int(max_results)
|
| 147 |
+
|
| 148 |
+
return find_relevant_standards_impl(activity, max_results, grade)
|
| 149 |
+
|
| 150 |
+
|
| 151 |
+
def get_standard_details(standard_id: str) -> str:
|
| 152 |
+
"""
|
| 153 |
+
Retrieves complete metadata and content for a specific educational standard by its GUID.
|
| 154 |
+
|
| 155 |
+
This function performs a direct lookup using the standard's GUID (_id field) only.
|
| 156 |
+
It does NOT accept statement notations, ASN identifiers, or any other identifier formats.
|
| 157 |
+
Use find_relevant_standards to search for standards by content or metadata.
|
| 158 |
+
|
| 159 |
+
Args:
|
| 160 |
+
standard_id: The standard's GUID (_id field) - must be a valid GUID format
|
| 161 |
+
(e.g., "EA60C8D165F6481B90BFF782CE193F93"). This is the GUID returned in
|
| 162 |
+
search results from find_relevant_standards.
|
| 163 |
+
|
| 164 |
+
Returns:
|
| 165 |
+
A JSON string containing a structured response with the following format:
|
| 166 |
+
{
|
| 167 |
+
"success": true|false,
|
| 168 |
+
"results": [
|
| 169 |
+
{
|
| 170 |
+
"_id": "standard_guid",
|
| 171 |
+
"content": "full standard text with hierarchy",
|
| 172 |
+
"subject": "Mathematics",
|
| 173 |
+
"education_levels": ["03"],
|
| 174 |
+
"statement_notation": "3.NF.A.1",
|
| 175 |
+
"standard_set_title": "Grade 3",
|
| 176 |
+
"asn_identifier": "S21238682",
|
| 177 |
+
"depth": 3,
|
| 178 |
+
"is_leaf": true,
|
| 179 |
+
"parent_id": "parent_guid",
|
| 180 |
+
"ancestor_ids": [...],
|
| 181 |
+
"child_ids": [...],
|
| 182 |
+
... (all available metadata fields)
|
| 183 |
+
}
|
| 184 |
+
],
|
| 185 |
+
"message": "Retrieved standard details" or error message,
|
| 186 |
+
"error_type": null or error type if success is false
|
| 187 |
+
}
|
| 188 |
+
|
| 189 |
+
On success, the results array contains exactly one standard object with all available
|
| 190 |
+
metadata fields including hierarchy relationships, content, and identifiers. On error
|
| 191 |
+
(e.g., standard not found), success is false and the message provides guidance, such as
|
| 192 |
+
suggesting to use find_relevant_standards for searching.
|
| 193 |
+
|
| 194 |
+
Raises:
|
| 195 |
+
This function does not raise exceptions. All errors are returned as JSON responses
|
| 196 |
+
with success=false and appropriate error messages.
|
| 197 |
+
"""
|
| 198 |
+
return get_standard_details_impl(standard_id)
|
| 199 |
+
|
| 200 |
+
|
| 201 |
+
def chat_with_standards(message: str, history: list):
|
| 202 |
+
"""
|
| 203 |
+
Chat function that uses MCP tools via Hugging Face Inference API with tool calling.
|
| 204 |
+
|
| 205 |
+
This function integrates with Qwen2.5-7B-Instruct to answer questions about educational
|
| 206 |
+
standards. The model can call find_relevant_standards and get_standard_details tools
|
| 207 |
+
to retrieve information and provide accurate responses.
|
| 208 |
+
|
| 209 |
+
Args:
|
| 210 |
+
message: The user's current message/query
|
| 211 |
+
history: Chat history in Gradio 6 messages format. Each message is a dict with
|
| 212 |
+
"role" and "content" keys. In Gradio 6, content uses structured format:
|
| 213 |
+
[{"type": "text", "text": "..."}, ...] for text content.
|
| 214 |
+
|
| 215 |
+
Returns:
|
| 216 |
+
Structured content as a list of content blocks. When tool calls are made, includes:
|
| 217 |
+
- Expandable JSON blocks showing tool call results
|
| 218 |
+
- The final assistant response as text
|
| 219 |
+
When no tool calls are made, returns a simple text response.
|
| 220 |
+
"""
|
| 221 |
+
# Convert Gradio 6 history format to OpenAI messages format
|
| 222 |
+
# Gradio 6 uses structured content: {"role": "user", "content": [{"type": "text", "text": "..."}]}
|
| 223 |
+
messages = []
|
| 224 |
+
if history:
|
| 225 |
+
for msg in history:
|
| 226 |
+
if isinstance(msg, dict):
|
| 227 |
+
role = msg.get("role", "user")
|
| 228 |
+
content = msg.get("content", "")
|
| 229 |
+
|
| 230 |
+
# Handle Gradio 6 structured content format
|
| 231 |
+
if isinstance(content, list):
|
| 232 |
+
# Extract text from content blocks
|
| 233 |
+
text_parts = []
|
| 234 |
+
for block in content:
|
| 235 |
+
if isinstance(block, dict) and block.get("type") == "text":
|
| 236 |
+
text_parts.append(block.get("text", ""))
|
| 237 |
+
content = " ".join(text_parts)
|
| 238 |
+
|
| 239 |
+
messages.append({
|
| 240 |
+
"role": role,
|
| 241 |
+
"content": content
|
| 242 |
+
})
|
| 243 |
+
|
| 244 |
+
# Add system message to guide the model
|
| 245 |
+
system_message = {
|
| 246 |
+
"role": "system",
|
| 247 |
+
"content": "You are a helpful assistant for parents and teachers. Your role is to help them plan educational activities and find educational requirements for activities they might have already done. You have access to tools that can search for standards and retrieve standard details. Use these tools when users ask about standards, learning activities, or educational requirements. Always provide clear, helpful responses based on the tool results."
|
| 248 |
+
}
|
| 249 |
+
|
| 250 |
+
# Add current user message
|
| 251 |
+
messages.append({"role": "user", "content": message})
|
| 252 |
+
|
| 253 |
+
# Prepare full message list with system message
|
| 254 |
+
full_messages = [system_message] + messages
|
| 255 |
+
|
| 256 |
+
try:
|
| 257 |
+
# Initial API call with tools
|
| 258 |
+
response = client.chat.completions.create(
|
| 259 |
+
model="Qwen/Qwen2.5-7B-Instruct",
|
| 260 |
+
messages=full_messages,
|
| 261 |
+
tools=TOOLS,
|
| 262 |
+
tool_choice="auto", # Let the model decide when to call functions
|
| 263 |
+
temperature=0.7,
|
| 264 |
+
max_tokens=1000,
|
| 265 |
+
)
|
| 266 |
+
|
| 267 |
+
response_message = response.choices[0].message
|
| 268 |
+
|
| 269 |
+
# Check if model wants to call functions
|
| 270 |
+
if response_message.tool_calls:
|
| 271 |
+
# Add assistant's tool call request to messages
|
| 272 |
+
full_messages.append(response_message)
|
| 273 |
+
|
| 274 |
+
# Store tool call results for display
|
| 275 |
+
tool_results = []
|
| 276 |
+
|
| 277 |
+
# Process each tool call
|
| 278 |
+
for tool_call in response_message.tool_calls:
|
| 279 |
+
function_name = tool_call.function.name
|
| 280 |
+
function_args = json.loads(tool_call.function.arguments)
|
| 281 |
+
|
| 282 |
+
# Execute the function
|
| 283 |
+
if function_name == "find_relevant_standards":
|
| 284 |
+
print(f"Finding relevant standards for activity: {function_args.get('activity', '')}")
|
| 285 |
+
result = find_relevant_standards_impl(
|
| 286 |
+
activity=function_args.get("activity", ""),
|
| 287 |
+
max_results=function_args.get("max_results", 5),
|
| 288 |
+
grade=function_args.get("grade"),
|
| 289 |
+
)
|
| 290 |
+
elif function_name == "get_standard_details":
|
| 291 |
+
print(f"Getting standard details for standard ID: {function_args.get('standard_id', '')}")
|
| 292 |
+
result = get_standard_details_impl(
|
| 293 |
+
standard_id=function_args.get("standard_id", "")
|
| 294 |
+
)
|
| 295 |
+
else:
|
| 296 |
+
result = json.dumps({"error": f"Function {function_name} not available"})
|
| 297 |
+
|
| 298 |
+
# Parse result JSON for display
|
| 299 |
+
try:
|
| 300 |
+
result_data = json.loads(result) if isinstance(result, str) else result
|
| 301 |
+
except json.JSONDecodeError:
|
| 302 |
+
result_data = {"raw_result": result}
|
| 303 |
+
|
| 304 |
+
# Store tool call info for display
|
| 305 |
+
tool_results.append({
|
| 306 |
+
"function": function_name,
|
| 307 |
+
"arguments": function_args,
|
| 308 |
+
"result": result_data
|
| 309 |
+
})
|
| 310 |
+
|
| 311 |
+
# Add function result to messages
|
| 312 |
+
full_messages.append({
|
| 313 |
+
"role": "tool",
|
| 314 |
+
"tool_call_id": tool_call.id,
|
| 315 |
+
"name": function_name,
|
| 316 |
+
"content": result,
|
| 317 |
+
})
|
| 318 |
+
|
| 319 |
+
# Get final response with function results
|
| 320 |
+
final_response = client.chat.completions.create(
|
| 321 |
+
model="Qwen/Qwen2.5-7B-Instruct",
|
| 322 |
+
messages=full_messages,
|
| 323 |
+
temperature=0.7,
|
| 324 |
+
max_tokens=1000,
|
| 325 |
+
)
|
| 326 |
+
|
| 327 |
+
# Build structured response with tool call results and final answer
|
| 328 |
+
response_blocks = []
|
| 329 |
+
|
| 330 |
+
# Add tool call results as expandable JSON blocks using markdown
|
| 331 |
+
for i, tool_result in enumerate(tool_results):
|
| 332 |
+
# Format arguments and result as pretty JSON
|
| 333 |
+
args_json = json.dumps(tool_result["arguments"], indent=2)
|
| 334 |
+
result_json = json.dumps(tool_result["result"], indent=2)
|
| 335 |
+
|
| 336 |
+
# Create collapsible markdown section
|
| 337 |
+
tool_markdown = f"""<details>
|
| 338 |
+
<summary><strong>🔧 Tool Call: {tool_result["function"]}</strong></summary>
|
| 339 |
+
|
| 340 |
+
**Arguments:**
|
| 341 |
+
```json
|
| 342 |
+
{args_json}
|
| 343 |
+
```
|
| 344 |
+
|
| 345 |
+
**Result:**
|
| 346 |
+
```json
|
| 347 |
+
{result_json}
|
| 348 |
+
```
|
| 349 |
+
</details>
|
| 350 |
+
"""
|
| 351 |
+
response_blocks.append({
|
| 352 |
+
"type": "text",
|
| 353 |
+
"text": tool_markdown
|
| 354 |
+
})
|
| 355 |
+
|
| 356 |
+
# Add separator before final response
|
| 357 |
+
response_blocks.append({
|
| 358 |
+
"type": "text",
|
| 359 |
+
"text": "---\n"
|
| 360 |
+
})
|
| 361 |
+
|
| 362 |
+
# Add final assistant response as text
|
| 363 |
+
response_blocks.append({
|
| 364 |
+
"type": "text",
|
| 365 |
+
"text": final_response.choices[0].message.content
|
| 366 |
+
})
|
| 367 |
+
|
| 368 |
+
return response_blocks
|
| 369 |
+
else:
|
| 370 |
+
# No tool calls, return direct response as text
|
| 371 |
+
return response_message.content
|
| 372 |
+
|
| 373 |
+
except Exception as e:
|
| 374 |
+
# Error handling
|
| 375 |
+
return f"I apologize, but I encountered an error: {str(e)}. Please try again or rephrase your question."
|
| 376 |
+
|
| 377 |
+
|
| 378 |
+
# Create Gradio interface
|
| 379 |
+
demo = gr.TabbedInterface(
|
| 380 |
+
[
|
| 381 |
+
gr.ChatInterface(
|
| 382 |
+
fn=chat_with_standards, # See complete implementation above
|
| 383 |
+
title="Chat with Standards",
|
| 384 |
+
description="Ask questions about educational standards. The AI will use MCP tools to find relevant information.",
|
| 385 |
+
examples=["What standards apply to teaching fractions in 3rd grade?", "Find standards for reading comprehension"],
|
| 386 |
+
api_visibility="private", # Hide from MCP server - only expose search and lookup tools
|
| 387 |
+
),
|
| 388 |
+
gr.Interface(
|
| 389 |
+
fn=find_relevant_standards,
|
| 390 |
+
inputs=[
|
| 391 |
+
gr.Textbox(label="Activity Description", placeholder="Describe a learning activity..."),
|
| 392 |
+
gr.Number(label="Max Results", value=5, minimum=1, maximum=20),
|
| 393 |
+
gr.Dropdown(
|
| 394 |
+
label="Grade (optional)",
|
| 395 |
+
choices=["", "K", "01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12", "09-12"],
|
| 396 |
+
value=None,
|
| 397 |
+
info="Select a grade level to filter results"
|
| 398 |
+
),
|
| 399 |
+
],
|
| 400 |
+
outputs=gr.JSON(label="Results"),
|
| 401 |
+
title="Find Relevant Standards",
|
| 402 |
+
description="Search for educational standards relevant to a learning activity.",
|
| 403 |
+
api_name="find_relevant_standards",
|
| 404 |
+
),
|
| 405 |
+
gr.Interface(
|
| 406 |
+
fn=get_standard_details,
|
| 407 |
+
inputs=gr.Textbox(label="Standard ID", placeholder="Enter a standard GUID (the _id from search results)..."),
|
| 408 |
+
outputs=gr.JSON(label="Standard Details"),
|
| 409 |
+
title="Get Standard Details",
|
| 410 |
+
description="Retrieve full metadata for a specific standard by its ID.",
|
| 411 |
+
api_name="get_standard_details",
|
| 412 |
+
),
|
| 413 |
+
],
|
| 414 |
+
["Chat", "Search", "Lookup"],
|
| 415 |
+
)
|
| 416 |
+
|
| 417 |
+
if __name__ == "__main__":
|
| 418 |
+
demo.launch(mcp_server=True)
|
| 419 |
+
|
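Both tools return the JSON envelope described in the docstrings (`success`, `results`, `message`, `error_type`). A client-side sketch of consuming that envelope — the sample payload below is hypothetical, shaped to match the documented schema:

```python
import json

# Hypothetical response matching the envelope documented in the app.py docstrings.
raw = json.dumps({
    "success": True,
    "results": [{
        "_id": "EA60C8D165F6481B90BFF782CE193F93",
        "content": "Understand a fraction 1/b as the quantity formed by 1 part...",
        "subject": "Mathematics",
        "education_levels": ["03"],
        "statement_notation": "3.NF.A.1",
        "standard_set_title": "Grade 3",
        "score": 0.85,
    }],
    "message": "Found 1 matching standards",
    "error_type": None,
})

def parse_standards(payload: str) -> list[dict]:
    """Return the results list, or raise with the server's message on failure."""
    data = json.loads(payload)
    if not data.get("success"):
        raise ValueError(data.get("message", "unknown error"))
    return data["results"]

standards = parse_standards(raw)
print(standards[0]["statement_notation"])  # → 3.NF.A.1
```

Because errors come back as `success: false` rather than exceptions, a client only needs to branch on that one flag to handle both tools uniformly.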
pyproject.toml
CHANGED

```diff
@@ -3,8 +3,7 @@ name = "common-core-mcp"
 version = "0.1.0"
 requires-python = ">=3.12"
 dependencies = [
-    "mcp",
-    "gradio>=5.0.0,<6.0.0",
+    "gradio[mcp]>=6.0.0",
     "pinecone",
     "python-dotenv",
     "typer",
@@ -13,6 +12,7 @@ dependencies = [
     "loguru",
     "pydantic>=2.0.0",
     "pydantic-settings>=2.0.0",
+    "huggingface_hub",
 ]

 [project.optional-dependencies]
```
requirements.txt
ADDED

```text
gradio[mcp]>=6.0.0
pinecone
python-dotenv
typer
requests
rich
loguru
pydantic>=2.0.0
pydantic-settings>=2.0.0
huggingface_hub
```
src/lookup.py
ADDED

```python
"""Direct ID lookup implementation for educational standards."""

from __future__ import annotations

import json

from pinecone.exceptions import PineconeException

from src.pinecone_client import PineconeClient


def get_standard_details_impl(standard_id: str) -> str:
    """
    Implementation of direct standard lookup by GUID only.

    This function only accepts GUIDs (_id field) from Pinecone. It does NOT accept
    statement_notation or other identifier formats. Use find_relevant_standards to
    search for standards by content or metadata.

    Args:
        standard_id: The standard's GUID (_id field) - must be a valid GUID format
            (e.g., "EA60C8D165F6481B90BFF782CE193F93")

    Returns:
        JSON string with structured response containing standard details
    """
    # Input validation
    if not standard_id or not standard_id.strip():
        return json.dumps(
            {
                "success": False,
                "results": [],
                "message": "Standard ID cannot be empty",
                "error_type": "invalid_input",
            }
        )

    try:
        # Initialize client and fetch standard
        client = PineconeClient()
        result = client.fetch_standard(standard_id.strip())

        # Handle not found
        if result is None:
            return json.dumps(
                {
                    "success": False,
                    "results": [],
                    "message": f"Standard with GUID '{standard_id}' not found. This function only accepts GUIDs (e.g., 'EA60C8D165F6481B90BFF782CE193F93'). For statement notations or other identifiers, use find_relevant_standards with a keyword search instead.",
                    "error_type": "not_found",
                }
            )

        # Format successful result
        response = {
            "success": True,
            "results": [result],
            "message": "Retrieved standard details",
        }

        return json.dumps(response, indent=2)

    except PineconeException as e:
        return json.dumps(
            {
                "success": False,
                "results": [],
                "message": f"Pinecone API error: {str(e)}",
                "error_type": "api_error",
            }
        )
    except Exception as e:
        return json.dumps(
            {
                "success": False,
                "results": [],
                "message": f"Unexpected error: {str(e)}",
                "error_type": "api_error",
            }
        )
```
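Callers of get_standard_details_impl should branch on the machine-readable `error_type` field (`invalid_input`, `not_found`, or `api_error`) rather than parsing the prose `message`. A minimal consumer sketch, using a hand-built envelope in place of a live Pinecone call:

```python
import json

# A "not found" envelope, shaped like get_standard_details_impl's output
raw = json.dumps({
    "success": False,
    "results": [],
    "message": "Standard with GUID 'XYZ' not found.",
    "error_type": "not_found",
})

payload = json.loads(raw)
action = None
if not payload["success"]:
    # Branch on error_type, not the human-readable message
    if payload["error_type"] == "not_found":
        action = "retry_with_semantic_search"   # fall back to find_relevant_standards
    elif payload["error_type"] == "invalid_input":
        action = "fix_arguments"
    else:  # api_error
        action = "backoff_and_retry"

print(action)  # → retry_with_semantic_search
```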
src/mcp_config.py
ADDED

```python
"""MCP server configuration module."""

from __future__ import annotations

from pydantic_settings import BaseSettings, SettingsConfigDict


class McpSettings(BaseSettings):
    """Configuration settings for the MCP server."""

    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",
    )

    # Pinecone Configuration
    pinecone_api_key: str = ""
    pinecone_index_name: str = "common-core-standards"
    pinecone_namespace: str = "standards"


_settings: McpSettings | None = None


def get_mcp_settings() -> McpSettings:
    """Get the singleton MCP settings instance."""
    global _settings
    if _settings is None:
        _settings = McpSettings()
    return _settings
```
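The module-level global plus `get_mcp_settings()` is one way to memoize settings; `functools.lru_cache` achieves the same singleton behavior with no mutable global. A sketch of the pattern (using a plain stand-in class so it runs without pydantic-settings installed):

```python
from functools import lru_cache


class Settings:
    """Stand-in for McpSettings; pydantic-settings is not needed to show the pattern."""

    def __init__(self) -> None:
        self.pinecone_index_name = "common-core-standards"


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    # First call constructs the instance; every later call returns the cached one
    return Settings()


assert get_settings() is get_settings()
print(get_settings().pinecone_index_name)  # → common-core-standards
```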
{tools → src}/pinecone_client.py
RENAMED

```diff
@@ -12,10 +12,10 @@ from loguru import logger
 from pinecone import Pinecone
 from pinecone.exceptions import PineconeException

-from …
+from src.mcp_config import get_mcp_settings
 from tools.pinecone_models import PineconeRecord

-settings = …
+settings = get_mcp_settings()


 class PineconeClient:
@@ -205,6 +205,8 @@ class PineconeClient:
             "normalized_subject",
             "publication_status",
             "parent_id",  # Must be omitted when None (Pinecone doesn't accept null)
+            "document_id",
+            "document_valid",
         }
         for field in optional_fields:
             if record_dict.get(field) is None:
@@ -212,6 +214,103 @@ class PineconeClient:

         return record_dict

+    def search_standards(
+        self,
+        query_text: str,
+        top_k: int = 5,
+        grade: str | None = None,
+    ) -> list[dict]:
+        """
+        Perform semantic search over standards.
+
+        Args:
+            query_text: Natural language query
+            top_k: Maximum number of results
+            grade: Optional grade filter
+
+        Returns:
+            List of result dictionaries with metadata and scores
+        """
+        # Build filter dictionary dynamically
+        # Always filter to only leaf nodes (actual standards, not parent categories)
+        filter_parts = [{"is_leaf": {"$eq": True}}]
+
+        if grade:
+            filter_parts.append({"education_levels": {"$in": [grade]}})
+
+        filter_dict = None
+        if len(filter_parts) == 1:
+            filter_dict = filter_parts[0]
+        elif len(filter_parts) == 2:
+            filter_dict = {"$and": filter_parts}
+
+        # Build query dictionary
+        query_dict: dict[str, Any] = {
+            "inputs": {"text": query_text},
+            "top_k": top_k * 2,  # Get more candidates for reranking
+        }
+        if filter_dict:
+            query_dict["filter"] = filter_dict
+
+        # Call search with reranking
+        results = self.index.search(
+            namespace=self.namespace,
+            query=query_dict,
+            rerank={"model": "bge-reranker-v2-m3", "top_n": top_k, "rank_fields": ["content"]},
+        )
+
+        # Parse results
+        hits = results.get("result", {}).get("hits", [])
+        parsed_results = []
+        for hit in hits:
+            result_dict = {
+                "_id": hit["_id"],
+                "score": hit["_score"],
+                **hit.get("fields", {}),
+            }
+            parsed_results.append(result_dict)
+
+        return parsed_results
+
+    def fetch_standard(self, standard_id: str) -> dict | None:
+        """
+        Fetch a standard by its GUID (_id field only).
+
+        This method performs a direct lookup using Pinecone's fetch() API, which only
+        works with the standard's GUID (_id field). It does NOT search by statement_notation,
+        asn_identifier, or any other metadata fields.
+
+        Args:
+            standard_id: Standard GUID (_id field) - must be the exact GUID format
+                (e.g., "EA60C8D165F6481B90BFF782CE193F93")
+
+        Returns:
+            Standard dictionary with metadata, or None if not found
+        """
+        result = self.index.fetch(ids=[standard_id], namespace=self.namespace)
+
+        # Extract vectors from FetchResponse
+        # FetchResponse.vectors is a dict mapping ID to Vector objects
+        vectors = result.vectors
+
+        if not vectors or standard_id not in vectors:
+            return None
+
+        vector = vectors[standard_id]
+
+        # Extract metadata from Vector object
+        # Vector has: id, values (embedding), and metadata (dict with all fields)
+        metadata = vector.metadata or {}
+        vector_id = vector.id
+
+        # Combine _id with all metadata fields
+        record_dict = {
+            "_id": vector_id,
+            **metadata,
+        }
+
+        return record_dict
+
     @staticmethod
     def is_uploaded(set_dir: Path) -> bool:
         """
```
src/search.py
ADDED

```python
"""Semantic search implementation for educational standards."""

from __future__ import annotations

import json

from pinecone.exceptions import PineconeException

from src.pinecone_client import PineconeClient


def find_relevant_standards_impl(
    activity: str,
    max_results: int = 5,
    grade: str | None = None,
) -> str:
    """
    Implementation of semantic search over educational standards.

    Args:
        activity: Description of the learning activity
        max_results: Maximum number of standards to return (default: 5)
        grade: Optional grade level filter (e.g., "K", "01", "05", "09")

    Returns:
        JSON string with structured response containing matching standards
    """
    # Input validation
    if not activity or not activity.strip():
        return json.dumps(
            {
                "success": False,
                "results": [],
                "message": "Activity description cannot be empty",
                "error_type": "invalid_input",
            }
        )

    try:
        # Initialize client and perform search
        client = PineconeClient()
        results = client.search_standards(
            query_text=activity.strip(),
            top_k=max_results,
            grade=grade,
        )

        # Handle empty results
        if not results:
            return json.dumps(
                {
                    "success": False,
                    "results": [],
                    "message": "No matching standards found",
                    "error_type": "no_results",
                }
            )

        # Format successful results
        response = {
            "success": True,
            "results": results,
            "message": f"Found {len(results)} matching standards",
        }

        return json.dumps(response, indent=2)

    except PineconeException as e:
        return json.dumps(
            {
                "success": False,
                "results": [],
                "message": f"Pinecone API error: {str(e)}",
                "error_type": "api_error",
            }
        )
    except Exception as e:
        return json.dumps(
            {
                "success": False,
                "results": [],
                "message": f"Unexpected error: {str(e)}",
                "error_type": "api_error",
            }
        )
```
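On the success path, find_relevant_standards_impl returns an envelope whose `results` carry the flattened Pinecone fields plus a relevance `score`. A consumer-side sketch, using made-up IDs, notations, and scores in place of live search output:

```python
import json

# A success envelope shaped like find_relevant_standards_impl's output;
# the IDs, statement notations, and scores below are fabricated for illustration.
raw = json.dumps({
    "success": True,
    "results": [
        {"_id": "A1", "score": 0.91, "statement_notation": "3.NF.A.1"},
        {"_id": "B2", "score": 0.84, "statement_notation": "3.NF.A.2"},
    ],
    "message": "Found 2 matching standards",
})

payload = json.loads(raw)
# Pick the highest-scoring hit
best = max(payload["results"], key=lambda r: r["score"])
print(best["_id"])  # → A1
```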
tools/cli.py
CHANGED

```diff
@@ -470,7 +470,7 @@ def pinecone_init():
     Uses integrated embeddings with llama-text-embed-v2 model.
     """
     try:
-        from …
+        from src.pinecone_client import PineconeClient

         console.print("[bold]Initializing Pinecone...[/bold]")

@@ -555,7 +555,7 @@ def pinecone_upload(
     If neither is provided, you'll be prompted to confirm uploading all sets.
     """
     try:
-        from …
+        from src.pinecone_client import PineconeClient
         from tools.pinecone_models import ProcessedStandardSet
         import json
```
tools/pinecone_models.py
CHANGED

```diff
@@ -31,8 +31,8 @@ class PineconeRecord(BaseModel):
     subject: str
     normalized_subject: str | None = None
     education_levels: list[str]
-    document_id: str
-    document_valid: str
+    document_id: str | None = None
+    document_valid: str | None = None
     publication_status: str | None = None
     jurisdiction_id: str
     jurisdiction_title: str
```
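Making document_id and document_valid optional pairs with the change in pinecone_client.py that adds them to the optional-fields set: None-valued optional fields must be dropped before upsert because Pinecone metadata cannot store null. The stripping step, written here as a pure function for illustration (the client mutates the dict in place instead):

```python
def strip_none_fields(record: dict, optional_fields: set[str]) -> dict:
    """Drop optional keys whose value is None (Pinecone metadata rejects nulls)."""
    return {k: v for k, v in record.items() if not (k in optional_fields and v is None)}


rec = {"_id": "X", "subject": "Math", "document_id": None, "parent_id": None}
print(strip_none_fields(rec, {"document_id", "document_valid", "parent_id"}))
# → {'_id': 'X', 'subject': 'Math'}
```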