Commit c78c0b5 by bigwolfe (parent ad91a91): "llama", adding docs/llama.md (+64 lines)

## Gemini LlamaIndex Vault Chat Agent Spec

Feature spec: Gemini + LlamaIndex Vault Chat Agent (HF Space)

### Objective

- Add a second planning chat interface to the Hugging Face Space.
- Use LlamaIndex for retrieval-augmented generation over the same Markdown vault used by Document-MCP.
- Use Gemini (via LlamaIndex) as both the LLM and the embedding model.
- Optionally allow the agent to write new notes into the vault via constrained tools.

### Non-goals

- Do not change the existing MCP server or ChatGPT App widget behavior.
- Do not introduce a new external database; rely on LlamaIndex storage or simple filesystem persistence for the hackathon.

### High-Level Architecture

- Vault: a directory of Markdown notes already used by Document-MCP.
- Indexer: a Python module using LlamaIndex to scan the vault, split notes into chunks, and build a VectorStoreIndex backed by a simple vector store.
- Chat backend: FastAPI endpoints that load the index, run RAG queries with Gemini, and return answers plus source notes.
- HF Space frontend: a new chat panel that calls the backend, shows the assistant response, and lists linked sources (note titles and paths).

### Backend details

- Dependencies: llama-index core, llama-index-llms-google-genai, llama-index-embeddings-google-genai.
- Env vars: GOOGLE_API_KEY, VAULT_DIR, LLAMAINDEX_PERSIST_DIR.
- On startup: if a persisted index exists, load it; otherwise, scan VAULT_DIR for Markdown files, build a new index, and persist it under LLAMAINDEX_PERSIST_DIR.
- Provide a helper get_or_build_index that returns a singleton VectorStoreIndex.
- Implement a function rag_chat(messages) that:
  - takes a simple chat history array;
  - uses index.as_query_engine with Gemini as the LLM;
  - runs a query on the latest user message;
  - returns a dict with fields: answer (string), sources (list of title, path, snippet), and notes_written (empty list for now).
- Expose POST /api/rag/chat in FastAPI that wraps rag_chat.
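The indexing and query side can be sketched as follows. This is a sketch under stated assumptions: the Gemini model names (`gemini-2.0-flash`, `text-embedding-004`) and the 200-character snippet length are placeholders, and the llama-index imports are deferred into `get_or_build_index` so the pure response-shaping helper stays importable in a minimal environment.

```python
import os

VAULT_DIR = os.environ.get("VAULT_DIR", "vault")
PERSIST_DIR = os.environ.get("LLAMAINDEX_PERSIST_DIR", "storage")

_index = None  # process-wide singleton


def get_or_build_index():
    """Load the persisted VectorStoreIndex if present, else build and persist one."""
    global _index
    if _index is not None:
        return _index
    # Deferred imports: only needed once indexing actually runs.
    from llama_index.core import (
        Settings, SimpleDirectoryReader, StorageContext,
        VectorStoreIndex, load_index_from_storage,
    )
    from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
    from llama_index.llms.google_genai import GoogleGenAI

    # Model names are assumptions; use whatever tier the GOOGLE_API_KEY allows.
    Settings.llm = GoogleGenAI(model="gemini-2.0-flash")
    Settings.embed_model = GoogleGenAIEmbedding(model_name="text-embedding-004")

    if os.path.isdir(PERSIST_DIR):
        storage = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
        _index = load_index_from_storage(storage)
    else:
        docs = SimpleDirectoryReader(
            VAULT_DIR, required_exts=[".md"], recursive=True
        ).load_data()
        _index = VectorStoreIndex.from_documents(docs)
        _index.storage_context.persist(persist_dir=PERSIST_DIR)
    return _index


def source_entry(metadata: dict, text: str, snippet_len: int = 200) -> dict:
    """Shape one retrieved node into the {title, path, snippet} record the spec names."""
    return {
        "title": metadata.get("file_name", "unknown"),
        "path": metadata.get("file_path", ""),
        "snippet": text[:snippet_len],
    }


def rag_chat(messages: list[dict]) -> dict:
    """Answer the latest user message with RAG over the vault index."""
    latest = messages[-1]["content"]
    engine = get_or_build_index().as_query_engine(similarity_top_k=4)
    response = engine.query(latest)
    return {
        "answer": str(response),
        "sources": [
            source_entry(n.node.metadata, n.node.get_content())
            for n in response.source_nodes
        ],
        "notes_written": [],  # stays empty until Phase 2 write tools land
    }
```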

### Frontend details

- Add a new panel or tab labeled Gemini Planning Agent.
- Layout: the left side may keep the existing docs UI; the right side is a chat view.
- Chat view: a list of messages and a composer textarea with a Send button.
- On send: push the user message into local history, POST to /api/rag/chat, then append the assistant answer and its sources when the response arrives.
- Under each assistant message, show a collapsible Sources section; clicking a source should either open the note in the existing viewer or show the snippet inline.

### Index refresh strategy

- On every backend startup, attempt to load an existing index; rebuild it if missing or invalid.
- At hackathon scale, it is acceptable for index updates to require a restart or redeploy.
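The load-else-rebuild policy is just a try/except around the loader. A dependency-free sketch, where the `load` and `rebuild` callables stand in for LlamaIndex's persisted-index load and a full vault rescan:

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_rebuild(load: Callable[[], T], rebuild: Callable[[], T]) -> T:
    """Prefer the persisted index; treat any load failure as 'missing or invalid'."""
    try:
        return load()
    except Exception:
        # Absent or corrupt persistence directory: fall back to a full rebuild.
        return rebuild()
```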

### Phase 2 (optional write tools)

- Implement safe note-writing helpers (create_note, append_to_note, tag_note) that operate only in a dedicated agent folder inside the vault.
- Register these as tools for a LlamaIndex-based agent using Gemini as the reasoning model.
- Extend /api/rag/chat so that responses can include notes_written metadata when the agent creates or updates notes.
- In the UI, show a small badge when a new note is created, with a link into the vault viewer.
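A minimal sketch of the constrained create_note helper. The `agent` folder name and the slugified filename scheme are assumptions, not something the spec fixes; the point is that every write is confined to the dedicated folder:

```python
from pathlib import Path

AGENT_DIR = "agent"  # assumed folder name: the agent may write only under <vault>/agent


def create_note(vault_dir: str, title: str, body: str) -> str:
    """Create a Markdown note inside the dedicated agent folder.

    Returns the note's path relative to the vault root.
    """
    vault = Path(vault_dir).resolve()
    agent_root = (vault / AGENT_DIR).resolve()
    agent_root.mkdir(parents=True, exist_ok=True)

    # Slugify the title; anything outside [a-z0-9-_] becomes "-", which also
    # neutralizes path tricks like "../secrets".
    slug = "".join(
        c if c.isalnum() or c in "-_" else "-" for c in title.lower()
    ).strip("-")
    if not slug:
        raise ValueError(f"title produces an empty filename: {title!r}")

    target = (agent_root / f"{slug}.md").resolve()
    # Defense in depth: refuse any resolved path that escapes the agent folder.
    if agent_root not in target.parents:
        raise ValueError(f"refusing to write outside {AGENT_DIR}/: {title!r}")

    target.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
    return str(target.relative_to(vault))
```

Helpers like this could then be registered for the Gemini-backed agent, e.g. via LlamaIndex's `FunctionTool.from_defaults(fn=create_note)`.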
| 57 |
+
Implementation order
|
| 58 |
+
|
| 59 |
+
Wire dependencies and environment variables.
|
| 60 |
+
Implement get_or_build_index and verify indexing works.
|
| 61 |
+
Implement rag_chat and the /api/rag/chat endpoint.
|
| 62 |
+
Build the frontend chat UI and hook it up to the endpoint.
|
| 63 |
+
If time allows, add Phase 2 tools and surface created notes in the UI.
|