---
title: NU-CS Policy RAG (Qwen 3B CPU)
emoji: πŸ“š
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
---

# NU-CS Policy RAG (Qwen 3B β€’ CPU β€’ GGUF) β€” FastAPI Microservice

This Space hosts a **bilingual Retrieval-Augmented Generation (RAG) microservice** for Nile University Computer Science policy FAQs.

- **Retrieval:** FAISS + `intfloat/multilingual-e5-base`
- **LLM:** Qwen2.5-3B-Instruct (quantized **GGUF**) via **llama.cpp** (CPU)
- **API Framework:** FastAPI
- **Endpoint:** `POST /ask`

The service answers questions in **Arabic or English**, strictly grounded in the indexed policy Q&A JSON files. If the answer is not present in the corpus, it returns a fixed "insufficient context" message.

---

## Project Structure

```text
.
β”œβ”€β”€ api.py            # FastAPI app (exposes /ask)
β”œβ”€β”€ app.py            # Entry point (runs uvicorn on port 7860)
β”œβ”€β”€ rag_core.py       # RAG logic: indexing, retrieval, LLM generation
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ README.md
└── data/
    └── pages/        # Policy Q&A JSON files (e.g. 44 files)
```

At runtime, the service also creates:

```text
artifacts/
β”œβ”€β”€ policy.index      # FAISS index
└── policy_docs.pkl   # Serialized passage metadata
models/
└── ...               # Downloaded GGUF model from the Hub
```

## Deploy on Hugging Face Spaces

1. Create a new **Space**
   - Type: **Python** (SDK can stay as shown in the YAML header)
   - Hardware: **CPU Basic** (free tier)
2. Add the following files to the Space repository:
   - `api.py`
   - `app.py`
   - `rag_core.py`
   - `requirements.txt`
   - `README.md`
3. Create the data folder structure:
   ```bash
   mkdir -p data/pages
   ```
4. Commit your **policy JSON files** into `data/pages/` (for example, 44 files like `page_001.json`, `page_002.json`, ...).
5. Push to the Space.
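Step 4 assumes every file in `data/pages/` is valid JSON. The exact schema is defined by `rag_core.py` and not shown in this README, so the `SAMPLE` shape below is purely a hypothetical illustration; the parse check itself works regardless of schema and is worth running before pushing:

```python
import json
from pathlib import Path

# Hypothetical shape of one policy Q&A file -- the real schema used by
# rag_core.py may differ (field names here are illustrative only):
SAMPLE = {
    "page": 1,
    "qa_pairs": [
        {"question_en": "...", "question_ar": "...",
         "answer_en": "...", "answer_ar": "..."},
    ],
}

def validate_pages(data_dir: str = "./data/pages") -> int:
    """Parse every *.json file under data_dir; raises on malformed JSON.

    Returns the number of files that parsed cleanly.
    """
    count = 0
    for path in sorted(Path(data_dir).glob("*.json")):
        json.loads(path.read_text(encoding="utf-8"))
        count += 1
    return count
```

A malformed file will surface here as a `json.JSONDecodeError` naming the file, rather than as an opaque indexing failure on the first Space startup.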
On the first run the Space will:

- Load all JSON files from `./data/pages`
- Build a FAISS index and save it under `./artifacts/`
- Download the GGUF model from the Hub to `./models/`

As long as `app.py` starts a server on port **7860**, Spaces will route traffic to it.

---

### Environment Variables

You can control behavior using environment variables (e.g. in **Settings** β†’ **Variables & secrets** on Hugging Face Spaces):

- `DATA_DIR` β€” path to JSON files (default: `./data/pages`)
- `INDEX_PATH` β€” FAISS index path (default: `./artifacts/policy.index`)
- `DOC_STORE_PATH` β€” pickled documents path (default: `./artifacts/policy_docs.pkl`)
- `EMBED_MODEL` β€” sentence transformer model (default: `intfloat/multilingual-e5-base`)
- `GGUF_REPO_ID` β€” GGUF repo on the Hub (default: `Qwen/Qwen2.5-3B-Instruct-GGUF`)
- `GGUF_FILENAME` β€” GGUF filename (default: `qwen2.5-3b-instruct-q4_k_m.gguf`)
- `TOP_K` β€” default number of passages to retrieve (default: `5`)
- `MAX_CTX_CHARS` β€” max characters of context sent to the LLM (default: `5000`)
- `N_CTX` β€” model context size (default: `4096`)
- `MAX_NEW_TOKENS` β€” max tokens generated by the LLM (default: `140`)

## API Documentation

Once the Space is running, the FastAPI docs are available at:

- Interactive Swagger UI: `https://<owner>-<space-name>.hf.space/docs`
- Raw OpenAPI JSON: `https://<owner>-<space-name>.hf.space/openapi.json`

## Running locally

You can run the microservice on your own machine for development/testing.

### 1. Setup

```bash
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Prepare data folders:

```bash
mkdir -p data/pages artifacts
# copy your JSON policy files into data/pages/
```

### 2. Start the API

Using uvicorn directly:

```bash
uvicorn api:app --host 0.0.0.0 --port 8000
```

Or via `app.py` (same as HF Spaces):

```bash
python app.py   # listens on port 7860
```
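The environment variables listed earlier apply to local runs as well. As a minimal sketch (assumed, not taken from the actual `rag_core.py`), reading them with their documented defaults looks like this:

```python
import os

# Sketch of reading the documented settings with their defaults.
# The actual rag_core.py may structure this differently.
DATA_DIR = os.getenv("DATA_DIR", "./data/pages")
INDEX_PATH = os.getenv("INDEX_PATH", "./artifacts/policy.index")
DOC_STORE_PATH = os.getenv("DOC_STORE_PATH", "./artifacts/policy_docs.pkl")
EMBED_MODEL = os.getenv("EMBED_MODEL", "intfloat/multilingual-e5-base")
TOP_K = int(os.getenv("TOP_K", "5"))
MAX_NEW_TOKENS = int(os.getenv("MAX_NEW_TOKENS", "140"))
```

Numeric settings arrive as strings from the environment, so cast them explicitly; a stray non-numeric value then fails fast at startup instead of deep inside retrieval.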
### 3. Test the Endpoint

```bash
curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the attendance policy?", "top_k": 5}'
```

Or open the Swagger UI in your browser:

- `http://localhost:8000/docs`
- or `http://localhost:7860/docs` (if using `python app.py`)
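The same request can be issued from Python with only the standard library. This is a sketch built from the documented request shape (`question`, `top_k`); the response schema is defined by `api.py` and not shown here, so the client just returns the parsed JSON as a dict:

```python
import json
from urllib.request import Request, urlopen

def build_request(question: str, top_k: int = 5,
                  base_url: str = "http://localhost:8000") -> Request:
    """Build the POST /ask request with a JSON body."""
    payload = json.dumps({"question": question, "top_k": top_k}).encode("utf-8")
    return Request(
        f"{base_url}/ask",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(question: str, top_k: int = 5,
        base_url: str = "http://localhost:8000") -> dict:
    """Send the request and return the parsed JSON response as a dict."""
    with urlopen(build_request(question, top_k, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `ask("What is the attendance policy?")` mirrors the `curl` call above; pass `base_url="http://localhost:7860"` when the service was started via `python app.py`.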