---
title: RAG Project
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 8000
python_version: 3.10
---
# πŸš€ RAG System with LangChain and FastAPI 🌐
Welcome to this repository! This project demonstrates how to build a Retrieval-Augmented Generation (RAG) system using **LangChain** and **FastAPI**, generating contextually relevant and accurate responses by integrating external data into the generative process.
## πŸ“‹ Project Overview
The RAG system combines retrieval and generation to provide smarter AI-driven responses. Using **LangChain** for document handling and embeddings, and **FastAPI** for deploying a fast, scalable API, this project includes:
- πŸ—‚οΈ **Document Loading**: Load data from various sources (text, PDFs, etc.).
- βœ‚οΈ **Text Splitting**: Break large documents into manageable chunks.
- 🧠 **Embeddings**: Generate vector embeddings for efficient search and retrieval.
- πŸ” **Vector Stores**: Store embeddings in a vector store for fast similarity searches.
- πŸ”§ **Retrieval**: Retrieve the most relevant document chunks based on user queries.
- 💬 **Generative Response**: Use retrieved data with large language models (LLMs) to generate accurate, context-aware answers.
- 🌐 **FastAPI**: Deploy the RAG system as a scalable API for easy interaction.
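The pipeline above can be sketched end to end with a toy example. This is a pure-Python illustration of how the pieces fit together, not the project's actual LangChain code: the "embedding" is a bag-of-words count and retrieval is plain cosine similarity, standing in for model embeddings and a FAISS index.

```python
import math

def split_text(text):
    # Text splitting: one chunk per sentence here (real splitters
    # chunk by character count with overlap).
    return [s.strip() + "." for s in text.split(".") if s.strip()]

def embed(text):
    # Embeddings: a toy bag-of-words vector; real systems use a model
    # such as all-MiniLM-L6-v2.
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    # Retrieval: rank stored chunks by similarity to the query embedding.
    q = embed(query)
    scored = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return scored[:k]

doc = ("FAISS stores vectors for fast search. FastAPI serves the API. "
       "LangChain loads documents.")
chunks = split_text(doc)
context = retrieve("how are vectors searched", chunks)
# Generative response: the retrieved context and the question are
# combined into a prompt that an LLM would complete.
prompt = f"Context: {' '.join(context)}\nQuestion: how are vectors searched"
```

The real system swaps each toy piece for a production component, but the control flow (split, embed, store, retrieve, prompt) is the same.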
## βš™οΈ Setup and Installation
### Prerequisites
Make sure you have the following installed:
- 🐍 Python 3.10+
- 🐳 Docker (optional, for deployment)
- πŸ› οΈ PostgreSQL or FAISS (for vector storage)
### Installation Steps
1. **Clone the repository**:
```bash
git clone https://github.com/yadavkapil23/RAG_Project.git
```
2. **Set up a virtual environment**:
```bash
python -m venv venv
source venv/bin/activate # For Linux/Mac
venv\Scripts\activate # For Windows
```
3. **Install dependencies**:
```bash
pip install -r requirements.txt
```
4. **Run the FastAPI server**:
```bash
uvicorn main:app --reload
```
Now, your FastAPI app will be running at `http://127.0.0.1:8000` πŸŽ‰!
### Set up Ollama πŸ¦™
This project uses Ollama to run local large language models.
1. **Install Ollama:** Follow the instructions on the [Ollama website](https://ollama.ai/) to download and install Ollama.
2. **Pull a model:** Pull a model to use with the application. This project uses `llama3`.
```bash
ollama pull llama3
```
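Once the model is pulled, the app can talk to the local Ollama server over its HTTP API (it listens on `http://localhost:11434` by default). A minimal stdlib sketch, assuming the default endpoint; the request is only sent when `ollama serve` is running:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_ollama_request(prompt, model="llama3"):
    # Ollama's generate endpoint takes a JSON body with the model name,
    # the prompt, and a stream flag; stream=False returns one JSON reply.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(prompt):
    # Requires a running Ollama server; the generated text is returned
    # under the "response" key of the JSON reply.
    with urllib.request.urlopen(build_ollama_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```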
## πŸ› οΈ Features
- **Retrieval-Augmented Generation**: Combines the best of both worlds: retrieving relevant data and generating insightful responses.
- **Scalable API**: FastAPI makes it easy to deploy and scale the RAG system.
- **Document Handling**: Supports multiple document types for loading and processing.
- **Vector Embeddings**: Efficient search with FAISS or other vector stores.
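Chunking with overlap is what keeps sentences near chunk boundaries retrievable. A simplified sketch of the idea (not the algorithm of any particular LangChain splitter):

```python
def chunk_with_overlap(text, chunk_size=200, overlap=50):
    # Consecutive chunks share `overlap` characters, so text that falls
    # near a boundary appears in both neighbors and can still be matched.
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```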
## πŸ›‘οΈ Security
- πŸ” **OAuth2 and API Key** authentication support for secure API access.
- πŸ”’ **TLS/SSL** for encrypting data in transit.
- πŸ›‘οΈ **Data encryption** for sensitive document storage.
## πŸš€ Deployment
### Hugging Face Spaces (Docker) Deployment
This project is configured for a Hugging Face Space using the Docker runtime.
1. Push this repository to GitHub (or connect a local repository).
2. Create a new Space on Hugging Face β†’ Choose "Docker" SDK.
3. Point it to this repo. Spaces will build using the `Dockerfile` and run `uvicorn` binding to the provided `PORT`.
4. Ensure the file `data/sample.pdf` exists (or replace it) to allow FAISS index creation on startup.
Notes:
- Models `Qwen/Qwen2-0.5B-Instruct` and `all-MiniLM-L6-v2` will be downloaded on first run; initial cold start may take several minutes.
- Dependencies are CPU-friendly; no GPU is required.
- If you hit out-of-memory (OOM) errors, consider reducing `max_new_tokens` in `vector_rag.py` or swapping in an even smaller instruct model.
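Spaces injects the listening port through the `PORT` environment variable, so the entrypoint should not hard-code 8000. A sketch of the resolution logic, assuming the entrypoint follows this pattern:

```python
import os

def resolve_port(default=8000):
    # Hugging Face Spaces sets PORT at runtime; fall back to the
    # default when running locally.
    return int(os.environ.get("PORT", default))

# e.g. uvicorn.run("main:app", host="0.0.0.0", port=resolve_port())
```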
### Docker Deployment (Local)
If you want to deploy your RAG system using Docker, simply build the Docker image and run the container:
```bash
docker build -t rag-system .
docker run -p 8000:8000 rag-system
```
### Cloud Deployment
Deploy your RAG system to the cloud using platforms like **AWS**, **Azure**, or **Google Cloud** with minimal setup.
## 🧠 Future Enhancements
- πŸ”„ **Real-time Data Integration**: Add real-time data sources for dynamic responses.
- πŸ€– **Advanced Retrieval Techniques**: Implement deep learning-based retrievers for better query understanding.
- πŸ“Š **Monitoring Tools**: Add monitoring with tools like Prometheus or Grafana for performance insights.
## 🀝 Contributing
Want to contribute? Feel free to fork this repository, submit a pull request, or open an issue. We welcome all contributions! πŸ› οΈ
## πŸ“„ License
This project is licensed under the MIT License.
---
πŸŽ‰ **Thank you for checking out the RAG System with LangChain and FastAPI!** If you have any questions or suggestions, feel free to reach out or open an issue. Let's build something amazing!