Commit c78c0b5 by bigwolfe (parent ad91a91): "llama", adding docs/llama.md (+64 lines)

## Gemini LlamaIndex Vault Chat Agent Spec

Feature spec: Gemini + LlamaIndex Vault Chat Agent (HF Space)

### Objective

- Add a second planning chat interface to the Hugging Face Space.
- Use LlamaIndex for retrieval-augmented generation over the same Markdown vault used by Document-MCP.
- Use Gemini (via LlamaIndex) as both the LLM and the embedding model.
- Optionally allow the agent to write new notes into the vault via constrained tools.

### Non-goals

- Do not change the existing MCP server or ChatGPT App widget behavior.
- Do not introduce a new external database; rely on LlamaIndex storage or simple filesystem persistence for the hackathon.

### High-Level Architecture

- Vault: a directory of Markdown notes already used by Document-MCP.
- Indexer: a Python module using LlamaIndex to scan the vault, split notes into chunks, and build a VectorStoreIndex backed by a simple vector store.
- Chat backend: FastAPI endpoints that load the index, run RAG queries with Gemini, and return answers plus source notes.
- HF Space frontend: a new chat panel that calls the backend, shows the assistant response, and lists linked sources (note titles and paths).

### Backend details

- Dependencies: llama-index core, llama-index-llms-google-genai, llama-index-embeddings-google-genai.
- Env vars: GOOGLE_API_KEY, VAULT_DIR, LLAMAINDEX_PERSIST_DIR.
- On startup: if a persisted index exists, load it; otherwise, scan VAULT_DIR for Markdown files, build a new index, and persist it under LLAMAINDEX_PERSIST_DIR.
- Provide a helper get_or_build_index that returns a singleton VectorStoreIndex.
- Implement a function rag_chat(messages) that:
  - takes a simple chat history array;
  - uses index.as_query_engine with Gemini as the LLM;
  - runs a query on the latest user message;
  - returns a dict with fields: answer (string), sources (list of title, path, snippet), and notes_written (empty list for now).
- Expose POST /api/rag/chat in FastAPI that wraps rag_chat.
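The indexing and query side can be sketched as follows. This is a sketch under stated assumptions: the Gemini model names (`gemini-2.0-flash`, `text-embedding-004`) and the 200-character snippet length are placeholders, and the llama-index imports are deferred into `get_or_build_index` so the pure response-shaping helper stays importable in a minimal environment.

```python
import os

VAULT_DIR = os.environ.get("VAULT_DIR", "vault")
PERSIST_DIR = os.environ.get("LLAMAINDEX_PERSIST_DIR", "storage")

_index = None  # process-wide singleton


def get_or_build_index():
    """Load the persisted VectorStoreIndex if present, else build and persist one."""
    global _index
    if _index is not None:
        return _index
    # Deferred imports: only needed once indexing actually runs.
    from llama_index.core import (
        Settings, SimpleDirectoryReader, StorageContext,
        VectorStoreIndex, load_index_from_storage,
    )
    from llama_index.embeddings.google_genai import GoogleGenAIEmbedding
    from llama_index.llms.google_genai import GoogleGenAI

    # Model names are assumptions; use whatever tier the GOOGLE_API_KEY allows.
    Settings.llm = GoogleGenAI(model="gemini-2.0-flash")
    Settings.embed_model = GoogleGenAIEmbedding(model_name="text-embedding-004")

    if os.path.isdir(PERSIST_DIR):
        storage = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
        _index = load_index_from_storage(storage)
    else:
        docs = SimpleDirectoryReader(
            VAULT_DIR, required_exts=[".md"], recursive=True
        ).load_data()
        _index = VectorStoreIndex.from_documents(docs)
        _index.storage_context.persist(persist_dir=PERSIST_DIR)
    return _index


def source_entry(metadata: dict, text: str, snippet_len: int = 200) -> dict:
    """Shape one retrieved node into the {title, path, snippet} record the spec names."""
    return {
        "title": metadata.get("file_name", "unknown"),
        "path": metadata.get("file_path", ""),
        "snippet": text[:snippet_len],
    }


def rag_chat(messages: list[dict]) -> dict:
    """Answer the latest user message with RAG over the vault index."""
    latest = messages[-1]["content"]
    engine = get_or_build_index().as_query_engine(similarity_top_k=4)
    response = engine.query(latest)
    return {
        "answer": str(response),
        "sources": [
            source_entry(n.node.metadata, n.node.get_content())
            for n in response.source_nodes
        ],
        "notes_written": [],  # stays empty until Phase 2 write tools land
    }
```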

### Frontend details

- Add a new panel or tab labeled Gemini Planning Agent.
- Layout: the left side may keep the existing docs UI; the right side is a chat view.
- Chat view: a list of messages and a composer textarea with a Send button.
- On send: push the user message into local history, POST to /api/rag/chat, then append the assistant answer and its sources when the response arrives.
- Under each assistant message, show a collapsible Sources section; clicking a source should either open the note in the existing viewer or show the snippet inline.

### Index refresh strategy

- On every backend startup, attempt to load an existing index; rebuild it if missing or invalid.
- At hackathon scale, it is acceptable for index updates to require a restart or redeploy.
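The load-else-rebuild policy is just a try/except around the loader. A dependency-free sketch, where the `load` and `rebuild` callables stand in for LlamaIndex's persisted-index load and a full vault rescan:

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_rebuild(load: Callable[[], T], rebuild: Callable[[], T]) -> T:
    """Prefer the persisted index; treat any load failure as 'missing or invalid'."""
    try:
        return load()
    except Exception:
        # Absent or corrupt persistence directory: fall back to a full rebuild.
        return rebuild()
```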

### Phase 2 (optional write tools)

- Implement safe note-writing helpers (create_note, append_to_note, tag_note) that operate only in a dedicated agent folder inside the vault.
- Register these as tools for a LlamaIndex-based agent using Gemini as the reasoning model.
- Extend /api/rag/chat so that responses can include notes_written metadata when the agent creates or updates notes.
- In the UI, show a small badge when a new note is created, with a link into the vault viewer.
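A minimal sketch of the constrained create_note helper. The `agent` folder name and the slugified filename scheme are assumptions, not something the spec fixes; the point is that every write is confined to the dedicated folder:

```python
from pathlib import Path

AGENT_DIR = "agent"  # assumed folder name: the agent may write only under <vault>/agent


def create_note(vault_dir: str, title: str, body: str) -> str:
    """Create a Markdown note inside the dedicated agent folder.

    Returns the note's path relative to the vault root.
    """
    vault = Path(vault_dir).resolve()
    agent_root = (vault / AGENT_DIR).resolve()
    agent_root.mkdir(parents=True, exist_ok=True)

    # Slugify the title; anything outside [a-z0-9-_] becomes "-", which also
    # neutralizes path tricks like "../secrets".
    slug = "".join(
        c if c.isalnum() or c in "-_" else "-" for c in title.lower()
    ).strip("-")
    if not slug:
        raise ValueError(f"title produces an empty filename: {title!r}")

    target = (agent_root / f"{slug}.md").resolve()
    # Defense in depth: refuse any resolved path that escapes the agent folder.
    if agent_root not in target.parents:
        raise ValueError(f"refusing to write outside {AGENT_DIR}/: {title!r}")

    target.write_text(f"# {title}\n\n{body}\n", encoding="utf-8")
    return str(target.relative_to(vault))
```

Helpers like this could then be registered for the Gemini-backed agent, e.g. via LlamaIndex's `FunctionTool.from_defaults(fn=create_note)`.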
| 57 |
+
Implementation order
|
| 58 |
+
|
| 59 |
+
Wire dependencies and environment variables.
|
| 60 |
+
Implement get_or_build_index and verify indexing works.
|
| 61 |
+
Implement rag_chat and the /api/rag/chat endpoint.
|
| 62 |
+
Build the frontend chat UI and hook it up to the endpoint.
|
| 63 |
+
If time allows, add Phase 2 tools and surface created notes in the UI.
|