# Stable Deployment Plan: Public Testing

This document outlines a reliable, robust strategy for deploying the VedaMD Clinical Assistant for public testing, with the backend hosted on Hugging Face Spaces and the frontend on Netlify.

## 1. Background and Motivation

Previous deployment attempts have been plagued by resource exhaustion and dependency conflicts on Hugging Face Spaces. The primary issue was attempting to perform a heavy, one-time build task (creating the vector store) during application startup in a resource-constrained environment. This new plan decouples the build process from the runtime process, which is a standard best practice for deploying ML applications.

## 2. High-Level Architecture

- **Vector Store Creation**: A local script will generate the FAISS index and associated metadata.
- **Artifact Hosting**: The generated vector store artifacts will be uploaded to a new model repository on the Hugging Face Hub using Git LFS.
- **Backend (Hugging Face Space)**: A lightweight FastAPI application that downloads the vector store from the Hub and serves the RAG API. It will not perform any on-the-fly processing.
- **Frontend (Netlify)**: The existing Next.js application, configured to point to the new, stable backend API endpoint.

## 3. Key Advantages of This Approach

- **Reliability**: The backend will have a fast and predictable startup time, as it is only downloading files, not computing them.
- **Scalability**: The heavy lifting is done offline. The online component is lightweight and can handle API requests efficiently.
- **Maintainability**: Separating concerns makes debugging and updating each component (vector store, backend, frontend) much easier.
- **Cost-Effectiveness**: We can continue to use the free tiers for both Hugging Face Spaces and Netlify.

## 4. High-Level Task Breakdown

This plan is broken down into clear, verifiable steps. We will proceed one step at a time.
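The core idea of the architecture above — the backend fetches the pre-built vector store artifacts from the `sniro23/VedaMD-Vector-Store` Hub repository at startup instead of computing them — can be sketched with a minimal, stdlib-only downloader. This is an illustrative sketch, not the actual backend code: the artifact filenames (`index.faiss`, `index.pkl`) and cache directory are assumptions, and a production implementation would more likely use `huggingface_hub`'s download helpers, which add caching and revision pinning.

```python
import os
import urllib.request

# Hypothetical artifact filenames; the real repo layout may differ.
HUB_BASE = "https://huggingface.co/sniro23/VedaMD-Vector-Store/resolve/main"

def fetch_artifact(filename: str, cache_dir: str = "data/vector_store") -> str:
    """Download one vector-store file from the Hub unless it is already cached."""
    os.makedirs(cache_dir, exist_ok=True)
    local_path = os.path.join(cache_dir, filename)
    if not os.path.exists(local_path):
        # Only hit the network on a cold start; restarts reuse the cached copy.
        urllib.request.urlretrieve(f"{HUB_BASE}/{filename}", local_path)
    return local_path

# At startup the backend would fetch the pre-built artifacts, e.g.:
#   index_path = fetch_artifact("index.faiss")
#   meta_path = fetch_artifact("index.pkl")
```

Because the function is a no-op when the files are already present, the Space's startup time stays fast and predictable, which is exactly the reliability property this plan targets.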
- [x] **Task 1: Pre-compute Vector Store Locally**
  - *Update*: A complete vector store was found at `src/vector_store`. We can use this existing artifact and do not need to re-compute it.
- [x] **Task 2: Upload Vector Store to Hugging Face Hub**
  - *Update*: Successfully uploaded the vector store files to the `sniro23/VedaMD-Vector-Store` repository on the Hugging Face Hub.
- [x] **Task 3: Refactor Backend to Load from Hub**
  - *Update*: The backend has been successfully refactored. The `simple_vector_store.py` and `groq_medical_rag.py` modules now load the pre-computed index directly from the Hub. The `Dockerfile` and `requirements.txt` have been streamlined, removing all build-time dependencies.
- [x] **Task 4: Deploy a New, Clean Backend Space**
  - *Update*: Successfully created and deployed a new, private Docker-based Space at `sniro23/VedaMD-Backend-v2`. The application is now running and logs can be monitored on the Hub.
- [x] **Task 5: Configure and Deploy Frontend to Netlify**
  - *Update*: The frontend has been configured to connect to the new backend endpoint. The changes have been pushed, and a new deployment has been triggered on Netlify. The application should now be fully operational.

---

**Deployment Complete!** The VedaMD Clinical Assistant is now running with a stable, decoupled architecture.

## 5. Post-Deployment Issues & Fixes

- **Issue (2024-07-26):** Encountered a persistent `PermissionError: [Errno 13] Permission denied` on Hugging Face Spaces. The application, running as a non-root user, could not write to the cache directory (`/app/data`) because it was created by the `root` user during the Docker build.
- **Solution:** The `Dockerfile` was updated to explicitly create a non-root `user`, create the `/app/data` directory, and then transfer ownership of that directory to the `user` with `chown`. This ensures the application has the necessary write permissions at runtime.
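The permission fix described above follows a common Dockerfile pattern. The snippet below is a hedged sketch, not the project's actual `Dockerfile`: the base image, the `app:app` module path, and the port are assumptions (7860 is the default port Hugging Face Spaces expects a Docker Space to listen on).

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Create a non-root user, create the cache directory at build time,
# and hand ownership to that user so the app can write to it at runtime.
RUN useradd -m user \
    && mkdir -p /app/data \
    && chown -R user:user /app/data
USER user

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```

The key detail is ordering: the `chown` must run while the build is still executing as `root`, before the `USER user` directive switches the runtime identity to the unprivileged account.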