Update KAIGUIDE.md with comprehensive documentation
- Add Section 6: Response Sanitization (spam patterns, UI artifacts)
- Add Section 7: Provider Session Management via Supabase
* Architecture and usage patterns
* Setup instructions
* Implementation status per provider
* Session limits table
- Add Section 9: Common Issues (login emails, modals, sanitization)
- Add Section 10: Tips & Tricks
* Browser provider best practices
* Model naming conventions
* Steps for adding new providers
* Testing recommendations
- Updated provider sections with current status
- Last Updated: 2026-02-14
- KAIGUIDE.md +137 -17
KAIGUIDE.md
CHANGED
|
@@ -5,18 +5,18 @@
|
|
| 5 |
---
|
| 6 |
|
| 7 |
## 1. System Architecture
|
| 8 |
-
The API uses a **Strict Engine** (`engine.py`) that routes requests to providers (`g4f`, `pollinations`).
|
| 9 |
- **Adaptive Fallback**: By default (`provider="auto"`), the engine tries models in `MODEL_RANKING` order.
|
| 10 |
- **Strict Mode**: If `model` or `provider` is specified, the engine tries **ONLY** that combination. No fallback.
|
| 11 |
|
| 12 |
## 2. Deployment
|
| 13 |
|
| 14 |
### A. Vercel (Serverless - Default)
|
| 15 |
-
Fast, free, but NO browser support (**Z.ai disabled**).
|
| 16 |
See `vercel.json` and hacks below.
|
| 17 |
|
| 18 |
-
### B. Hugging Face Spaces (Docker -
|
| 19 |
-
Use this for **Z.ai** (requires browser).
|
| 20 |
- See [README_DOCKER.md](file:///Users/mac/KAI_API/README_DOCKER.md)
|
| 21 |
- Supports full browser automation.
|
| 22 |
|
|
@@ -35,6 +35,8 @@ if os.environ.get("VERCEL") or True:
|
|
| 35 |
```
|
| 36 |
**DO NOT REMOVE THIS.** It prevents `[Errno 30] Read-only file system` crashes.
|
| 37 |
|
| 38 |
## 3. Provider Specifics
|
| 39 |
|
| 40 |
### A. G4F (Scraping Layer)
|
|
@@ -56,7 +58,7 @@ Uses Playwright Chromium to interact with `chat.z.ai` as a real browser.
|
|
| 56 |
- **Model**: `glm-5` (default, reasoning model), `glm-4-flash`.
|
| 57 |
- **Key Headers**: `x-fe-version: prod-fe-1.0.237`, `x-signature: <sha256>`, `Authorization: Bearer <JWT>`.
|
| 58 |
- **Speed**: ~5-15s per request (browser startup + DOM scraping).
|
| 59 |
-
- **Vercel**: **DISABLED** (no Chromium in serverless). Local only.
|
| 60 |
- **Files**: `providers/zai_provider.py`, `test_zai_browser.py`, `zai_captured.json`.
|
| 61 |
|
| 62 |
### D. Gemini (Browser-Based Provider)
|
|
@@ -71,18 +73,22 @@ Uses Playwright Chromium to interact with `gemini.google.com` as a real browser.
|
|
| 71 |
### E. HuggingChat (Browser-Based Provider)
|
| 72 |
Uses Playwright Chromium to interact with `huggingface.co/chat` as a real browser.
|
| 73 |
- **Why Browser**: HuggingChat provides access to 100+ open-source models via web interface.
|
| 74 |
- **Input**: `textarea` with placeholder text.
|
| 75 |
- **Features**:
|
| 76 |
- Handles the welcome modal automatically (clicks "Start chatting")
|
| 77 |
-
- Supports model selection from dropdown
|
| 78 |
-
- Access to top models: Llama 3.3 70B, Qwen 2.5 72B, DeepSeek R1, etc.
|
| 79 |
-
- **Models
|
| 80 |
-
- `omni` - Auto-routes to best model (default)
|
| 81 |
-
- `
|
| 82 |
-
- `
|
| 83 |
-
- `deepseek-
|
| 84 |
-
-
|
| 85 |
-
- **
|
| 86 |
- **Vercel**: **DISABLED** (no Chromium in serverless). Local/Docker only.
|
| 87 |
|
| 88 |
### F. Search & Deep Research
|
|
@@ -94,19 +100,95 @@ The API includes a search engine (`search_engine.py`) powered by DuckDuckGo (via
|
|
| 94 |
3. Scrapes results.
|
| 95 |
4. Synthesizes a final answer using the AI Engine.
|
| 96 |
|
| 97 |
## 4. Frontend & Admin
|
| 98 |
- **`static/docs.html`**: The public landing page AND the "Try It" dashboard.
|
| 99 |
- **`static/admin.html`**: Secret admin panel (`/qazmlp`) for checking stats and running tests.
|
| 100 |
- **Stats**: Stored in Supabase (persisted across Vercel cold starts).
|
| 101 |
|
| 102 |
## 5. Debugging Tools
|
| 103 |
We have built-in tools to diagnose issues on Vercel:
|
| 104 |
- **`/admin/debug_g4f`**: Runs a live G4F test (`gpt-4o-mini`, `gpt-4`) and returns verbose logs.
|
| 105 |
- *Note*: Uses `AsyncClient` to avoid "Event loop already running" errors.
|
| 106 |
- **`/admin/test_all`**: Runs a parallel check on all configured models.
|
| 107 |
- **`debug_g4f_verbose.py`**: Local script for deep inspection.
|
| 108 |
|
| 109 |
-
##
|
| 110 |
- **Adding Models**: Run `@.agent/workflows/update.md`.
|
| 111 |
- *Crucial*: Always run `step 3.6` (Strict Mode Verification) after updates.
|
| 112 |
- **Strict Mode Validation**: Run `python3 test_strict.py`.
|
|
@@ -115,13 +197,51 @@ We have built-in tools to diagnose issues on Vercel:
|
|
| 115 |
- Previous blocker (x-signature) solved via Playwright browser automation.
|
| 116 |
- Provider: `providers/zai_provider.py`, Model: `glm-5` (Tier 1).
|
| 117 |
|
| 118 |
-
|
| 119 |
| Error | Cause | Fix |
|
| 120 |
| :--- | :--- | :--- |
|
| 121 |
| `[Errno 30] Read-only file system` | `HOME` not set to `/tmp` | Ensure `os.environ["HOME"] = "/tmp"` is at top of `main.py`. |
|
| 122 |
| `Event loop already running` | Sync `Client` in async handler | Use `g4f.client.AsyncClient`. |
|
| 123 |
| `Add a "api_key"` | Provider requires auth | The provider (e.g. OpenRouter) is active but we have no key. Use `strict` mode to avoid it, or rely on `ApiAirforce`. |
|
| 124 |
| `Model not found: auto` | `model="auto"` passed | `engine.py` must handle `model="auto"` as `None`. |
|
| 125 |
|
| 126 |
---
|
| 127 |
-
|
| 5 |
---
|
| 6 |
|
| 7 |
## 1. System Architecture
|
| 8 |
+
The API uses a **Strict Engine** (`engine.py`) that routes requests to providers (`g4f`, `pollinations`, `zai`, `gemini`, `huggingchat`).
|
| 9 |
- **Adaptive Fallback**: By default (`provider="auto"`), the engine tries models in `MODEL_RANKING` order.
|
| 10 |
- **Strict Mode**: If `model` or `provider` is specified, the engine tries **ONLY** that combination. No fallback.
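
The two routing modes can be sketched roughly as follows. This is an illustration only, not the code in `engine.py`: the `route` function, the `call_provider` callback, and the contents of `MODEL_RANKING` shown here are assumptions. In particular, the sketch assumes that a strict request resolves to the first ranked entry matching the requested provider and/or model.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical ranking of (provider, model) pairs in preference order.
MODEL_RANKING: List[Tuple[str, str]] = [
    ("g4f", "gpt-4o-mini"),
    ("zai", "glm-5"),
    ("huggingchat", "huggingface-omni"),
]


def route(
    prompt: str,
    call_provider: Callable[[str, str, str], str],
    provider: str = "auto",
    model: Optional[str] = None,
) -> str:
    """Try ranked candidates in order; in strict mode try exactly one."""
    # model="auto" is treated the same as "no model requested".
    if model == "auto":
        model = None
    strict = provider != "auto" or model is not None

    matches = [
        (p, m)
        for p, m in MODEL_RANKING
        if (provider == "auto" or p == provider) and (model is None or m == model)
    ]
    if not matches:
        raise ValueError(f"No ranked entry for provider={provider!r}, model={model!r}")

    # Strict mode: only the requested combination, no fallback.
    candidates = matches[:1] if strict else matches

    errors = []
    for p, m in candidates:
        try:
            return call_provider(p, m, prompt)
        except Exception as exc:  # collect the failure and move to the next candidate
            errors.append(f"{p}/{m}: {exc}")
    raise RuntimeError("All candidates failed: " + "; ".join(errors))
```

With `provider="auto"` a failing first entry falls through to the next one; with an explicit `provider` or `model` the same failure is raised to the caller.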
|
| 11 |
|
| 12 |
## 2. Deployment
|
| 13 |
|
| 14 |
### A. Vercel (Serverless - Default)
|
| 15 |
+
Fast, free, but NO browser support (**Z.ai, Gemini, HuggingChat disabled**).
|
| 16 |
See `vercel.json` and hacks below.
|
| 17 |
|
| 18 |
+
### B. Hugging Face Spaces (Docker - Full Browser Support)
|
| 19 |
+
Use this for **Z.ai, Gemini, HuggingChat** (requires browser).
|
| 20 |
- See [README_DOCKER.md](file:///Users/mac/KAI_API/README_DOCKER.md)
|
| 21 |
- Supports full browser automation.
|
| 22 |
|
|
|
|
| 35 |
```
|
| 36 |
**DO NOT REMOVE THIS.** It prevents `[Errno 30] Read-only file system` crashes.
|
| 37 |
|
| 38 |
+
---
|
| 39 |
+
|
| 40 |
## 3. Provider Specifics
|
| 41 |
|
| 42 |
### A. G4F (Scraping Layer)
|
|
|
|
| 58 |
- **Model**: `glm-5` (default, reasoning model), `glm-4-flash`.
|
| 59 |
- **Key Headers**: `x-fe-version: prod-fe-1.0.237`, `x-signature: <sha256>`, `Authorization: Bearer <JWT>`.
|
| 60 |
- **Speed**: ~5-15s per request (browser startup + DOM scraping).
|
| 61 |
+
- **Vercel**: **DISABLED** (no Chromium in serverless). Local/Docker only.
|
| 62 |
- **Files**: `providers/zai_provider.py`, `test_zai_browser.py`, `zai_captured.json`.
|
| 63 |
|
| 64 |
### D. Gemini (Browser-Based Provider)
|
|
|
|
| 73 |
### E. HuggingChat (Browser-Based Provider)
|
| 74 |
Uses Playwright Chromium to interact with `huggingface.co/chat` as a real browser.
|
| 75 |
- **Why Browser**: HuggingChat provides access to 100+ open-source models via web interface.
|
| 76 |
+
- **Authentication**: Logs in automatically using credentials stored in the provider.
|
| 77 |
+
- **Session Management**: Uses Supabase to persist cookies across redeploys (see Section 7).
|
| 78 |
+
- **New Conversation Each Time**: Clicks "New Chat" to ensure no context sharing between API calls.
|
| 79 |
- **Input**: `textarea` with placeholder text.
|
| 80 |
- **Features**:
|
| 81 |
- Handles the welcome modal automatically (clicks "Start chatting")
|
| 82 |
+
- Supports model selection from dropdown
|
| 83 |
+
- Access to top models: Llama 3.3 70B, Qwen 2.5 72B, DeepSeek R1, Kimi K2, etc.
|
| 84 |
+
- **Models** (all prefixed with `huggingface-`):
|
| 85 |
+
- `huggingface-omni` - Auto-routes to best model (default)
|
| 86 |
+
- `huggingface-llama-3.3-70b` - Meta's latest Llama model
|
| 87 |
+
- `huggingface-qwen-72b` - Alibaba's Qwen model
|
| 88 |
+
- `huggingface-deepseek-r1` - DeepSeek reasoning model
|
| 89 |
+
- `huggingface-kimi-k2` - Moonshot's Kimi K2 model
|
| 90 |
+
- **Files**: `providers/huggingchat_provider.py`, `provider_sessions.py`.
|
| 91 |
+
- **Status**: **Working**. Requires a Playwright environment (local or Docker).
|
| 92 |
- **Vercel**: **DISABLED** (no Chromium in serverless). Local/Docker only.
|
| 93 |
|
| 94 |
### F. Search & Deep Research
|
|
|
|
| 100 |
3. Scrapes results.
|
| 101 |
4. Synthesizes a final answer using the AI Engine.
|
| 102 |
|
| 103 |
+
---
|
| 104 |
+
|
| 105 |
## 4. Frontend & Admin
|
| 106 |
- **`static/docs.html`**: The public landing page AND the "Try It" dashboard.
|
| 107 |
- **`static/admin.html`**: Secret admin panel (`/qazmlp`) for checking stats and running tests.
|
| 108 |
- **Stats**: Stored in Supabase (persisted across Vercel cold starts).
|
| 109 |
|
| 110 |
+
---
|
| 111 |
+
|
| 112 |
## 5. Debugging Tools
|
| 113 |
We have built-in tools to diagnose issues on Vercel:
|
| 114 |
- **`/admin/debug_g4f`**: Runs a live G4F test (`gpt-4o-mini`, `gpt-4`) and returns verbose logs.
|
| 115 |
- *Note*: Uses `AsyncClient` to avoid "Event loop already running" errors.
|
| 116 |
- **`/admin/test_all`**: Runs a parallel check on all configured models.
|
| 117 |
- **`debug_g4f_verbose.py`**: Local script for deep inspection.
|
| 118 |
+
- **`debug_huggingchat_visible.py`**: Launches visible browser to debug HuggingChat interactions.
|
| 119 |
+
|
| 120 |
+
---
|
| 121 |
+
|
| 122 |
+
## 6. Response Sanitization
|
| 123 |
+
The `sanitizer.py` module cleans AI responses by removing:
|
| 124 |
+
- **Promotional spam** (llmplayground.net, Pollinations ads, etc.)
|
| 125 |
+
- **UI Artifacts** ("Export to Sheets", "Copied", model names like "Kimi-K2-Instruct-0905 via groq")
|
| 126 |
+
- **JSON double-encoding** (some providers wrap responses in JSON)
|
| 127 |
+
- **Reasoning traces** (`<think>` tags from DeepSeek and similar)
|
| 128 |
+
|
| 129 |
+
**When adding new providers**, check if they inject artifacts and add patterns to `SPAM_PATTERNS` in `sanitizer.py`.
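
As a rough illustration of the cleaning steps listed above, a sanitizer could look like the sketch below. It is not the contents of `sanitizer.py`: the `sanitize` function name, the `THINK_BLOCK` constant, and the specific regular expressions are assumptions; only the `SPAM_PATTERNS` name comes from this guide.

```python
import json
import re

# Illustrative patterns only; the real list lives in sanitizer.py.
SPAM_PATTERNS = [
    # Any line that advertises llmplayground.net
    re.compile(r"^.*llmplayground\.net.*$", re.IGNORECASE | re.MULTILINE),
    # UI artifacts that end up on a line of their own
    re.compile(r"^[ \t]*(Export to Sheets|Copied)[ \t]*$", re.MULTILINE),
]

# Reasoning traces wrapped in <think>...</think>
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def sanitize(text: str) -> str:
    """Remove spam lines, UI artifacts, and reasoning traces from a response."""
    # Unwrap a response that was JSON-encoded a second time (a quoted string).
    stripped = text.strip()
    if stripped.startswith('"') and stripped.endswith('"'):
        try:
            decoded = json.loads(stripped)
            if isinstance(decoded, str):
                text = decoded
        except json.JSONDecodeError:
            pass  # not valid JSON, keep the text as-is

    text = THINK_BLOCK.sub("", text)
    for pattern in SPAM_PATTERNS:
        text = pattern.sub("", text)

    # Collapse the blank lines left behind by removed content.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

A new provider's artifacts would then be handled by appending one more compiled pattern to the list rather than changing the function.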
|
| 130 |
+
|
| 131 |
+
---
|
| 132 |
|
| 133 |
+
## 7. Provider Session Management (Supabase)
|
| 134 |
+
|
| 135 |
+
### Overview
|
| 136 |
+
Browser-based providers (HuggingChat, Z.ai, Gemini) can save their authentication sessions to Supabase. This ensures:
|
| 137 |
+
- ✅ Sessions survive redeploys and restarts
|
| 138 |
+
- ✅ No repeated login emails
|
| 139 |
+
- ✅ Shared session state across multiple workers
|
| 140 |
+
|
| 141 |
+
### Architecture
|
| 142 |
+
- **Table**: `provider_sessions` (see `supabase_sessions_schema.sql`)
|
| 143 |
+
- **Manager**: `provider_sessions.py` - `ProviderSessionManager` class
|
| 144 |
+
- **Key Fields**:
|
| 145 |
+
- `provider`: Provider name (e.g., "huggingchat", "zai")
|
| 146 |
+
- `session_data`: JSONB with cookies, tokens, etc.
|
| 147 |
+
- `conversation_count`: Number of API calls made
|
| 148 |
+
- `max_conversations`: Limit before requiring re-login (default 50)
|
| 149 |
+
- `expires_at`: Session expiration timestamp
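
To make the role of these fields concrete, here is a minimal in-memory sketch of how a login decision could be derived from them. It does not reproduce `ProviderSessionManager` or talk to Supabase; the `SessionRecord` dataclass and this standalone `needs_login` function are assumptions that only mirror the column names above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class SessionRecord:
    """Hypothetical in-memory mirror of one row in provider_sessions."""
    provider: str
    session_data: dict
    conversation_count: int = 0
    max_conversations: int = 50
    expires_at: Optional[datetime] = None


def needs_login(record: Optional[SessionRecord], now: Optional[datetime] = None) -> bool:
    """Return True when there is no usable session for the provider."""
    if record is None:
        return True  # nothing stored yet
    now = now or datetime.now(timezone.utc)
    if record.expires_at is not None and now >= record.expires_at:
        return True  # session has expired
    # Conversation budget used up
    return record.conversation_count >= record.max_conversations
```

Under these assumptions a session is reused until either the expiry time passes or the conversation counter reaches the limit, whichever comes first.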
|
| 150 |
+
|
| 151 |
+
### Usage in Providers
|
| 152 |
+
```python
|
| 153 |
+
from provider_sessions import get_provider_session_manager
|
| 154 |
+
|
| 155 |
+
session_mgr = get_provider_session_manager()
|
| 156 |
+
|
| 157 |
+
# Check if we need to login
|
| 158 |
+
if session_mgr.needs_login("huggingchat"):
|
| 159 |
+
    # Perform login
|
| 160 |
+
    cookies = await perform_login()
|
| 161 |
+
    # Save to Supabase
|
| 162 |
+
    session_mgr.save_session("huggingchat", cookies, conversation_count=0)
|
| 163 |
+
else:
|
| 164 |
+
    # Use existing session
|
| 165 |
+
    session = session_mgr.get_session("huggingchat")
|
| 166 |
+
    cookies = session["session_data"]["cookies"]
|
| 167 |
+
|
| 168 |
+
# After successful API call, increment counter
|
| 169 |
+
session_mgr.increment_conversation("huggingchat")
|
| 170 |
+
```
|
| 171 |
+
|
| 172 |
+
### Setup
|
| 173 |
+
1. Run `supabase_sessions_schema.sql` in Supabase SQL Editor
|
| 174 |
+
2. Ensure `SUPABASE_URL` and `SUPABASE_KEY` are set in environment
|
| 175 |
+
3. Provider automatically uses Supabase for session persistence
|
| 176 |
+
|
| 177 |
+
### Current Implementation Status
|
| 178 |
+
- **HuggingChat**: ✅ Uses Supabase sessions (saves cookies, 50 conversations per login)
|
| 179 |
+
- **Z.ai**: ❌ Not needed (auto-gets guest JWT each time)
|
| 180 |
+
- **Gemini**: ❌ Not needed (no authentication required)
|
| 181 |
+
|
| 182 |
+
### Limits Per Provider
|
| 183 |
+
| Provider | Max Conversations | Session Duration |
|
| 184 |
+
|----------|------------------|------------------|
|
| 185 |
+
| HuggingChat | 50 | 24 hours |
|
| 186 |
+
| Z.ai | 100 | 48 hours |
|
| 187 |
+
| Gemini | 100 | 48 hours |
|
| 188 |
+
|
| 189 |
+
---
|
| 190 |
+
|
| 191 |
+
## 8. Maintenance Workflows
|
| 192 |
- **Adding Models**: Run `@.agent/workflows/update.md`.
|
| 193 |
- *Crucial*: Always run `step 3.6` (Strict Mode Verification) after updates.
|
| 194 |
- **Strict Mode Validation**: Run `python3 test_strict.py`.
|
|
|
|
| 197 |
- Previous blocker (x-signature) solved via Playwright browser automation.
|
| 198 |
- Provider: `providers/zai_provider.py`, Model: `glm-5` (Tier 1).
|
| 199 |
|
| 200 |
+
---
|
| 201 |
+
|
| 202 |
+
## 9. Common Issues & Fixes
|
| 203 |
| Error | Cause | Fix |
|
| 204 |
| :--- | :--- | :--- |
|
| 205 |
| `[Errno 30] Read-only file system` | `HOME` not set to `/tmp` | Ensure `os.environ["HOME"] = "/tmp"` is at top of `main.py`. |
|
| 206 |
| `Event loop already running` | Sync `Client` in async handler | Use `g4f.client.AsyncClient`. |
|
| 207 |
| `Add a "api_key"` | Provider requires auth | The provider (e.g. OpenRouter) is active but we have no key. Use `strict` mode to avoid it, or rely on `ApiAirforce`. |
|
| 208 |
| `Model not found: auto` | `model="auto"` passed | `engine.py` must handle `model="auto"` as `None`. |
|
| 209 |
+
| HuggingChat login emails every request | Not using session management | Ensure `provider_sessions.py` is being used and Supabase table exists. |
|
| 210 |
+
| "Start chatting" modal blocking | Welcome modal not dismissed | Provider should click the modal button before finding input. |
|
| 211 |
+
| Response contains "Copied" or model names | Sanitization missing | Add UI artifact patterns to `sanitizer.py`. |
|
| 212 |
|
| 213 |
---
|
| 214 |
+
|
| 215 |
+
## 10. Tips & Tricks
|
| 216 |
+
|
| 217 |
+
### Browser-Based Providers (Z.ai, Gemini, HuggingChat)
|
| 218 |
+
1. **Always use headless mode on servers** - A visible (headed) browser does not work on Hugging Face Spaces
|
| 219 |
+
2. **Handle modals** - Welcome screens block interaction, click them first
|
| 220 |
+
3. **Wait for hydration** - JavaScript-heavy sites need 2-3 seconds after page load
|
| 221 |
+
4. **Multiple selectors** - Try multiple input selectors (textarea, contenteditable, etc.)
|
| 222 |
+
5. **Check for loading states** - Spinners/loading indicators mean content isn't ready
|
| 223 |
+
6. **Use ephemeral contexts** - New context per request for isolation, but reuse cookies via Supabase
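
Tips 3 and 4 can be combined into one small helper, sketched below. It assumes a Playwright sync-API `page` object and relies only on `page.query_selector()` and `page.wait_for_timeout()`; the `find_input` name, the selector list, and the retry counts are illustrative choices rather than values taken from the providers.

```python
# Candidate selectors, tried in order (illustrative list).
INPUT_SELECTORS = ["textarea", "[contenteditable='true']", "input[type='text']"]


def find_input(page, selectors=INPUT_SELECTORS, attempts=3, wait_ms=1000):
    """Return (selector, element) for the first input that exists on the page.

    Retries a few times so that JavaScript-heavy pages have time to hydrate.
    """
    for _ in range(attempts):
        for selector in selectors:
            element = page.query_selector(selector)  # None when nothing matches
            if element is not None:
                return selector, element
        # Nothing matched yet: give the page time to finish rendering.
        page.wait_for_timeout(wait_ms)
    raise LookupError(f"No input found after {attempts} attempts: {selectors}")
```

Returning the matching selector alongside the element makes it easier to log which variant of the page was served.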
|
| 224 |
+
|
| 225 |
+
### Model Naming
|
| 226 |
+
- **Always prefix with provider name** (e.g., `huggingface-`, `gemini-`, `zai-`)
|
| 227 |
+
- **Use kebab-case** (e.g., `llama-3.3-70b`, not `Llama_3.3_70b`)
|
| 228 |
+
- **Keep it short but descriptive** (e.g., `huggingface-kimi-k2` vs `moonshotai-Kimi-K2-Instruct`)
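
These conventions are easy to enforce with a small normalizer. The helper below is a sketch; `make_model_id` is a hypothetical name and is not part of `config.py`.

```python
import re


def make_model_id(provider_prefix: str, model_name: str) -> str:
    """Build a kebab-case model ID prefixed with the provider name."""
    slug = model_name.strip().lower()
    # Replace anything that is not a letter, digit, or dot with a hyphen.
    slug = re.sub(r"[^a-z0-9.]+", "-", slug).strip("-")
    prefix = provider_prefix.strip().lower().rstrip("-")
    # Avoid doubling the prefix if the name already carries it.
    if slug.startswith(prefix + "-"):
        return slug
    return f"{prefix}-{slug}"
```

For example, `make_model_id("huggingface", "Llama_3.3_70b")` yields `huggingface-llama-3.3-70b`. Shortening long upstream names is still a manual decision.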
|
| 229 |
+
|
| 230 |
+
### Adding New Providers
|
| 231 |
+
1. Create `providers/<name>_provider.py` inheriting from `BaseProvider`
|
| 232 |
+
2. Implement `send_message()`, `get_available_models()`, `is_available()`
|
| 233 |
+
3. Add models to `config.py` MODEL_RANKING and PROVIDER_MODELS
|
| 234 |
+
4. Import and register in `engine.py`
|
| 235 |
+
5. Add documentation to this guide (Section 3)
|
| 236 |
+
6. Test locally with debug script before deploying
|
| 237 |
+
7. Consider whether session management (Supabase) is needed
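
Steps 1 and 2 can be sketched as the skeleton below. The `BaseProvider` defined here is a hypothetical stand-in so that the example is self-contained; the real base class, its method signatures, and whether `send_message()` is asynchronous should be checked against the existing providers. `EchoProvider` and its model name are invented for illustration.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List


class BaseProvider(ABC):
    """Hypothetical stand-in for the project's base class."""

    name: str = "base"

    @abstractmethod
    async def send_message(self, prompt: str, model: str) -> str:
        """Send a prompt and return the model's reply."""

    @abstractmethod
    def get_available_models(self) -> List[str]:
        """List the model IDs this provider can serve."""

    @abstractmethod
    def is_available(self) -> bool:
        """Report whether the provider can run in this environment."""


class EchoProvider(BaseProvider):
    """Toy provider that echoes the prompt; useful for wiring tests."""

    name = "echo"

    def get_available_models(self) -> List[str]:
        return ["echo-basic"]

    def is_available(self) -> bool:
        return True  # a browser provider would check for Chromium here

    async def send_message(self, prompt: str, model: str) -> str:
        if model not in self.get_available_models():
            raise ValueError(f"Unknown model for {self.name}: {model}")
        return f"[{model}] {prompt}"


reply = asyncio.run(EchoProvider().send_message("ping", "echo-basic"))
```

A toy provider like this can be registered first to confirm the routing and configuration steps before any browser automation is written.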
|
| 238 |
+
|
| 239 |
+
### Testing
|
| 240 |
+
- **Always test locally first** with `python3 test_<provider>_browser.py`
|
| 241 |
+
- **Use visible browser for debugging** (`headless=False`) to see what's happening
|
| 242 |
+
- **Take screenshots** at each step to diagnose issues
|
| 243 |
+
- **Check logs** on Hugging Face Spaces for errors
|
| 244 |
+
|
| 245 |
+
---
|
| 246 |
+
|
| 247 |
+
**Last Updated**: 2026-02-14
|