KiWA001 committed
Commit 7c7bb36 · 1 Parent(s): 2e65f0f

Update KAIGUIDE.md with comprehensive documentation


- Add Section 6: Response Sanitization (spam patterns, UI artifacts)
- Add Section 7: Provider Session Management via Supabase
  * Architecture and usage patterns
  * Setup instructions
  * Implementation status per provider
  * Session limits table
- Add Section 9: Common Issues (login emails, modals, sanitization)
- Add Section 10: Tips & Tricks
  * Browser provider best practices
  * Model naming conventions
  * Steps for adding new providers
  * Testing recommendations
- Updated provider sections with current status
- Last Updated: 2026-02-14

Files changed (1)
  1. KAIGUIDE.md +137 -17
KAIGUIDE.md CHANGED
@@ -5,18 +5,18 @@
5
  ---
6
 
7
  ## 1. System Architecture
8
- The API uses a **Strict Engine** (`engine.py`) that routes requests to providers (`g4f`, `pollinations`).
9
  - **Adaptive Fallback**: By default (`provider="auto"`), the engine tries models in `MODEL_RANKING` order.
10
  - **Strict Mode**: If `model` or `provider` is specified, the engine tries **ONLY** that combination. No fallback.
11
 
12
  ## 2. Deployment
13
 
14
  ### A. Vercel (Serverless - Default)
15
- Fast, free, but NO browser support (**Z.ai disabled**).
16
  See `vercel.json` and hacks below.
17
 
18
- ### B. Hugging Face Spaces (Docker - Z.ai Online)
19
- Use this for **Z.ai** (requires browser).
20
  - See [README_DOCKER.md](file:///Users/mac/KAI_API/README_DOCKER.md)
21
  - Supports full browser automation.
22
 
@@ -35,6 +35,8 @@ if os.environ.get("VERCEL") or True:
35
  ```
36
  **DO NOT REMOVE THIS.** It prevents `[Errno 30] Read-only file system` crashes.
37
 
 
 
38
  ## 3. Provider Specifics
39
 
40
  ### A. G4F (Scraping Layer)
@@ -56,7 +58,7 @@ Uses Playwright Chromium to interact with `chat.z.ai` as a real browser.
56
  - **Model**: `glm-5` (default, reasoning model), `glm-4-flash`.
57
  - **Key Headers**: `x-fe-version: prod-fe-1.0.237`, `x-signature: <sha256>`, `Authorization: Bearer <JWT>`.
58
  - **Speed**: ~5-15s per request (browser startup + DOM scraping).
59
- - **Vercel**: **DISABLED** (no Chromium in serverless). Local only.
60
  - **Files**: `providers/zai_provider.py`, `test_zai_browser.py`, `zai_captured.json`.
61
 
62
  ### D. Gemini (Browser-Based Provider)
@@ -71,18 +73,22 @@ Uses Playwright Chromium to interact with `gemini.google.com` as a real browser.
71
  ### E. HuggingChat (Browser-Based Provider)
72
  Uses Playwright Chromium to interact with `huggingface.co/chat` as a real browser.
73
  - **Why Browser**: HuggingChat provides access to 100+ open-source models via web interface.
 
 
 
74
  - **Input**: `textarea` with placeholder text.
75
  - **Features**:
76
  - Handles the welcome modal automatically (clicks "Start chatting")
77
- - Supports model selection from dropdown (optional, defaults to "Omni" router)
78
- - Access to top models: Llama 3.3 70B, Qwen 2.5 72B, DeepSeek R1, etc.
79
- - **Models**:
80
- - `omni` - Auto-routes to best model (default)
81
- - `meta-llama/Llama-3.3-70B-Instruct` - Meta's latest Llama model
82
- - `Qwen/Qwen2.5-72B-Instruct` - Alibaba's Qwen model
83
- - `deepseek-ai/DeepSeek-R1` - DeepSeek reasoning model
84
- - **Files**: `providers/huggingchat_provider.py`.
85
- - **Status**: **Experimental**. Requires local Playwright environment.
 
86
  - **Vercel**: **DISABLED** (no Chromium in serverless). Local/Docker only.
87
 
88
  ### F. Search & Deep Research
@@ -94,19 +100,95 @@ The API includes a search engine (`search_engine.py`) powered by DuckDuckGo (via
94
  3. Scrapes results.
95
  4. Synthesizes a final answer using the AI Engine.
96
 
 
 
97
  ## 4. Frontend & Admin
98
  - **`static/docs.html`**: The public landing page AND the "Try It" dashboard.
99
  - **`static/admin.html`**: Secret admin panel (`/qazmlp`) for checking stats and running tests.
100
  - **Stats**: Stored in Supabase (persisted across Vercel cold starts).
101
 
 
 
102
  ## 5. Debugging Tools
103
  We have built-in tools to diagnose issues on Vercel:
104
  - **`/admin/debug_g4f`**: Runs a live G4F test (`gpt-4o-mini`, `gpt-4`) and returns verbose logs.
105
  - *Note*: Uses `AsyncClient` to avoid "Event loop already running" errors.
106
  - **`/admin/test_all`**: Runs a parallel check on all configured models.
107
  - **`debug_g4f_verbose.py`**: Local script for deep inspection.
108
 
109
- ## 5. Maintenance Workflows
110
  - **Adding Models**: Run `@.agent/workflows/update.md`.
111
  - *Crucial*: Always run `step 3.6` (Strict Mode Verification) after updates.
112
  - **Strict Mode Validation**: Run `python3 test_strict.py`.
@@ -115,13 +197,51 @@ We have built-in tools to diagnose issues on Vercel:
115
  - Previous blocker (x-signature) solved via Playwright browser automation.
116
  - Provider: `providers/zai_provider.py`, Model: `glm-5` (Tier 1).
117
 
118
- ## 6. Common Issues & Fixes
 
 
119
  | Error | Cause | Fix |
120
  | :--- | :--- | :--- |
121
  | `[Errno 30] Read-only file system` | `HOME` not set to `/tmp` | Ensure `os.environ["HOME"] = "/tmp"` is at top of `main.py`. |
122
  | `Event loop already running` | Sync `Client` in async handler | Use `g4f.client.AsyncClient`. |
123
  | `Add a "api_key"` | Provider requires auth | The provider (e.g. OpenRouter) is active but we have no key. Use `strict` mode to avoid it, or rely on `ApiAirforce`. |
124
  | `Model not found: auto` | `model="auto"` passed | `engine.py` must handle `model="auto"` as `None`. |
 
 
 
125
 
126
  ---
127
- **Last Updated**: 2026-02-12

5
  ---
6
 
7
  ## 1. System Architecture
8
+ The API uses a **Strict Engine** (`engine.py`) that routes requests to providers (`g4f`, `pollinations`, `zai`, `gemini`, `huggingchat`).
9
  - **Adaptive Fallback**: By default (`provider="auto"`), the engine tries models in `MODEL_RANKING` order.
10
  - **Strict Mode**: If `model` or `provider` is specified, the engine tries **ONLY** that combination. No fallback.
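
The routing rule described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `engine.py`: the function `route`, the `call_provider` callback, and the sample `MODEL_RANKING` entries are all hypothetical, and the assumption that strict mode picks the first ranking entry matching the requested provider or model is ours.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical ranking of (provider, model) pairs; the real list lives in the project.
MODEL_RANKING: List[Tuple[str, str]] = [
    ("g4f", "gpt-4o-mini"),
    ("pollinations", "openai"),
]


def route(
    prompt: str,
    call_provider: Callable[[str, str, str], str],
    provider: str = "auto",
    model: Optional[str] = None,
    ranking: List[Tuple[str, str]] = MODEL_RANKING,
) -> str:
    """Return a response using strict mode or adaptive fallback."""
    strict = provider != "auto" or model is not None
    if strict:
        # Strict mode: try exactly one combination and let any error propagate.
        matches = [
            (p, m)
            for p, m in ranking
            if (provider == "auto" or p == provider) and (model is None or m == model)
        ]
        if not matches:
            raise LookupError(f"no ranked entry for provider={provider!r}, model={model!r}")
        p, m = matches[0]
        return call_provider(p, m, prompt)

    # Adaptive fallback: walk the ranking until one provider succeeds.
    last_error: Optional[Exception] = None
    for p, m in ranking:
        try:
            return call_provider(p, m, prompt)
        except Exception as exc:  # a failing provider must not abort the loop
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

With `provider="auto"` a failing first entry is skipped silently, while `provider="g4f"` surfaces the same failure to the caller, which is the behavioural difference the two bullets describe.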
11
 
12
  ## 2. Deployment
13
 
14
  ### A. Vercel (Serverless - Default)
15
+ Fast, free, but NO browser support (**Z.ai, Gemini, HuggingChat disabled**).
16
  See `vercel.json` and hacks below.
17
 
18
+ ### B. Hugging Face Spaces (Docker - Full Browser Support)
19
+ Use this for **Z.ai, Gemini, HuggingChat** (requires browser).
20
  - See [README_DOCKER.md](file:///Users/mac/KAI_API/README_DOCKER.md)
21
  - Supports full browser automation.
22
 
 
35
  ```
36
  **DO NOT REMOVE THIS.** It prevents `[Errno 30] Read-only file system` crashes.
37
 
38
+ ---
39
+
40
  ## 3. Provider Specifics
41
 
42
  ### A. G4F (Scraping Layer)
 
58
  - **Model**: `glm-5` (default, reasoning model), `glm-4-flash`.
59
  - **Key Headers**: `x-fe-version: prod-fe-1.0.237`, `x-signature: <sha256>`, `Authorization: Bearer <JWT>`.
60
  - **Speed**: ~5-15s per request (browser startup + DOM scraping).
61
+ - **Vercel**: **DISABLED** (no Chromium in serverless). Local/Docker only.
62
  - **Files**: `providers/zai_provider.py`, `test_zai_browser.py`, `zai_captured.json`.
63
 
64
  ### D. Gemini (Browser-Based Provider)
 
73
  ### E. HuggingChat (Browser-Based Provider)
74
  Uses Playwright Chromium to interact with `huggingface.co/chat` as a real browser.
75
  - **Why Browser**: HuggingChat provides access to 100+ open-source models via web interface.
76
+ - **Authentication**: Uses credentials (stored in provider) - logs in automatically.
77
+ - **Session Management**: Uses Supabase to persist cookies across redeploys (see Section 7).
78
+ - **New Conversation Each Time**: Clicks "New Chat" to ensure no context sharing between API calls.
79
  - **Input**: `textarea` with placeholder text.
80
  - **Features**:
81
  - Handles the welcome modal automatically (clicks "Start chatting")
82
+ - Supports model selection from dropdown
83
+ - Access to top models: Llama 3.3 70B, Qwen 2.5 72B, DeepSeek R1, Kimi K2, etc.
84
+ - **Models** (all prefixed with `huggingface-`):
85
+ - `huggingface-omni` - Auto-routes to best model (default)
86
+ - `huggingface-llama-3.3-70b` - Meta's latest Llama model
87
+ - `huggingface-qwen-72b` - Alibaba's Qwen model
88
+ - `huggingface-deepseek-r1` - DeepSeek reasoning model
89
+ - `huggingface-kimi-k2` - Moonshot's Kimi K2 model
90
+ - **Files**: `providers/huggingchat_provider.py`, `provider_sessions.py`.
91
+ - **Status**: **Working**. Requires local Playwright environment.
92
  - **Vercel**: **DISABLED** (no Chromium in serverless). Local/Docker only.
93
 
94
  ### F. Search & Deep Research
 
100
  3. Scrapes results.
101
  4. Synthesizes a final answer using the AI Engine.
102
 
103
+ ---
104
+
105
  ## 4. Frontend & Admin
106
  - **`static/docs.html`**: The public landing page AND the "Try It" dashboard.
107
  - **`static/admin.html`**: Secret admin panel (`/qazmlp`) for checking stats and running tests.
108
  - **Stats**: Stored in Supabase (persisted across Vercel cold starts).
109
 
110
+ ---
111
+
112
  ## 5. Debugging Tools
113
  We have built-in tools to diagnose issues on Vercel:
114
  - **`/admin/debug_g4f`**: Runs a live G4F test (`gpt-4o-mini`, `gpt-4`) and returns verbose logs.
115
  - *Note*: Uses `AsyncClient` to avoid "Event loop already running" errors.
116
  - **`/admin/test_all`**: Runs a parallel check on all configured models.
117
  - **`debug_g4f_verbose.py`**: Local script for deep inspection.
118
+ - **`debug_huggingchat_visible.py`**: Launches visible browser to debug HuggingChat interactions.
119
+
120
+ ---
121
+
122
+ ## 6. Response Sanitization
123
+ The `sanitizer.py` module cleans AI responses by removing:
124
+ - **Promotional spam** (llmplayground.net, Pollinations ads, etc.)
125
+ - **UI Artifacts** ("Export to Sheets", "Copied", model names like "Kimi-K2-Instruct-0905 via groq")
126
+ - **JSON double-encoding** (some providers wrap responses in JSON)
127
+ - **Reasoning traces** (`<think>` tags from DeepSeek and similar)
128
+
129
+ **When adding new providers**, check if they inject artifacts and add patterns to `SPAM_PATTERNS` in `sanitizer.py`.
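
As a rough illustration of the four cleanup steps listed above, the sketch below shows one way such a sanitizer could be structured. It is not the actual `sanitizer.py`: the function name `sanitize`, the `THINK_BLOCK` constant, and the specific regular expressions in this `SPAM_PATTERNS` list are assumptions chosen for the example.

```python
import json
import re

# Illustrative patterns only; the project's real SPAM_PATTERNS may look different.
SPAM_PATTERNS = [
    re.compile(r"^[ \t]*Export to Sheets[ \t]*$", re.MULTILINE),
    re.compile(r"^[ \t]*Copied[ \t]*$", re.MULTILINE),
    re.compile(r"^.*llmplayground\.net.*$", re.MULTILINE),
]

# Reasoning traces emitted by some models between <think> tags.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def sanitize(text: str) -> str:
    """Strip spam lines, UI artifacts, reasoning traces and JSON double-encoding."""
    stripped = text.strip()
    # Undo double-encoding when the whole response is a JSON string literal.
    if stripped.startswith('"') and stripped.endswith('"'):
        try:
            decoded = json.loads(stripped)
            if isinstance(decoded, str):
                text = decoded
        except json.JSONDecodeError:
            pass  # not valid JSON, keep the text unchanged

    text = THINK_BLOCK.sub("", text)
    for pattern in SPAM_PATTERNS:
        text = pattern.sub("", text)

    # Removing whole lines can leave gaps; collapse runs of blank lines.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Anchoring each pattern to a full line keeps the sanitizer from deleting a legitimate word such as "Copied" when it appears inside a sentence.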
130
+
131
+ ---
132
 
133
+ ## 7. Provider Session Management (Supabase)
134
+
135
+ ### Overview
136
+ Browser-based providers (HuggingChat, Z.ai, Gemini) can save their authentication sessions to Supabase. This ensures:
137
+ - ✅ Sessions survive redeploys and restarts
138
+ - ✅ No repeated login emails
139
+ - ✅ Shared session state across multiple workers
140
+
141
+ ### Architecture
142
+ - **Table**: `provider_sessions` (see `supabase_sessions_schema.sql`)
143
+ - **Manager**: `provider_sessions.py` - `ProviderSessionManager` class
144
+ - **Key Fields**:
145
+ - `provider`: Provider name (e.g., "huggingchat", "zai")
146
+ - `session_data`: JSONB with cookies, tokens, etc.
147
+ - `conversation_count`: Number of API calls made
148
+ - `max_conversations`: Limit before requiring re-login (default 50)
149
+ - `expires_at`: Session expiration timestamp
150
+
151
+ ### Usage in Providers
+ ```python
+ from provider_sessions import get_provider_session_manager
+
+ session_mgr = get_provider_session_manager()
+
+ # Check if we need to log in
+ if session_mgr.needs_login("huggingchat"):
+     # Perform login
+     cookies = await perform_login()
+     # Save to Supabase
+     session_mgr.save_session("huggingchat", cookies, conversation_count=0)
+ else:
+     # Use existing session
+     session = session_mgr.get_session("huggingchat")
+     cookies = session["session_data"]["cookies"]
+
+ # After successful API call, increment counter
+ session_mgr.increment_conversation("huggingchat")
+ ```
171
+
172
+ ### Setup
173
+ 1. Run `supabase_sessions_schema.sql` in Supabase SQL Editor
174
+ 2. Ensure `SUPABASE_URL` and `SUPABASE_KEY` are set in environment
175
+ 3. Provider automatically uses Supabase for session persistence
176
+
177
+ ### Current Implementation Status
178
+ - **HuggingChat**: ✅ Uses Supabase sessions (saves cookies, 50 conversations per login)
179
+ - **Z.ai**: ❌ Not needed (auto-gets guest JWT each time)
180
+ - **Gemini**: ❌ Not needed (no authentication required)
181
+
182
+ ### Limits Per Provider
183
+ | Provider | Max Conversations | Session Duration |
184
+ |----------|------------------|------------------|
185
+ | HuggingChat | 50 | 24 hours |
186
+ | Z.ai | 100 | 48 hours |
187
+ | Gemini | 100 | 48 hours |
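
The key fields and limits above suggest a simple rule for deciding when a fresh login is required. The sketch below is one plausible reading, not the real `ProviderSessionManager.needs_login`: it assumes a session row is available as a plain dict using the field names from the schema, and that `expires_at` is a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_MAX_CONVERSATIONS = 50  # default taken from the field description above


def needs_login(session: Optional[dict], now: Optional[datetime] = None) -> bool:
    """Return True when there is no usable session for a provider."""
    if session is None:
        return True  # nothing stored yet
    now = now or datetime.now(timezone.utc)
    if session["expires_at"] <= now:
        return True  # session has expired
    limit = session.get("max_conversations", DEFAULT_MAX_CONVERSATIONS)
    return session["conversation_count"] >= limit  # conversation budget used up


def new_session(cookies: list, hours: int, now: Optional[datetime] = None) -> dict:
    """Build a session row with a fresh counter and an expiry `hours` from now."""
    now = now or datetime.now(timezone.utc)
    return {
        "session_data": {"cookies": cookies},
        "conversation_count": 0,
        "max_conversations": DEFAULT_MAX_CONVERSATIONS,
        "expires_at": now + timedelta(hours=hours),
    }
```

For HuggingChat, `new_session(cookies, hours=24)` would match the 24-hour, 50-conversation row in the table.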
188
+
189
+ ---
190
+
191
+ ## 8. Maintenance Workflows
192
  - **Adding Models**: Run `@.agent/workflows/update.md`.
193
  - *Crucial*: Always run `step 3.6` (Strict Mode Verification) after updates.
194
  - **Strict Mode Validation**: Run `python3 test_strict.py`.
 
197
  - Previous blocker (x-signature) solved via Playwright browser automation.
198
  - Provider: `providers/zai_provider.py`, Model: `glm-5` (Tier 1).
199
 
200
+ ---
201
+
202
+ ## 9. Common Issues & Fixes
203
  | Error | Cause | Fix |
204
  | :--- | :--- | :--- |
205
  | `[Errno 30] Read-only file system` | `HOME` not set to `/tmp` | Ensure `os.environ["HOME"] = "/tmp"` is at top of `main.py`. |
206
  | `Event loop already running` | Sync `Client` in async handler | Use `g4f.client.AsyncClient`. |
207
  | `Add a "api_key"` | Provider requires auth | The provider (e.g. OpenRouter) is active but we have no key. Use `strict` mode to avoid it, or rely on `ApiAirforce`. |
208
  | `Model not found: auto` | `model="auto"` passed | `engine.py` must handle `model="auto"` as `None`. |
209
+ | HuggingChat sends a login email on every request | Not using session management | Ensure `provider_sessions.py` is being used and the Supabase table exists. |
210
+ | "Start chatting" modal blocking | Welcome modal not dismissed | Provider should click the modal button before finding input. |
211
+ | Response contains "Copied" or model names | Sanitization missing | Add UI artifact patterns to `sanitizer.py`. |
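
The `Model not found: auto` row can be illustrated with a small normalisation helper. The name `normalize_model` is hypothetical; the table only states that `engine.py` must treat `model="auto"` as `None`, and treating an empty string the same way is an extra assumption made here.

```python
from typing import Optional


def normalize_model(model: Optional[str]) -> Optional[str]:
    """Map "auto" (any case) and empty values to None; pass other names through."""
    if model is None:
        return None
    cleaned = model.strip()
    if cleaned == "" or cleaned.lower() == "auto":
        return None
    return cleaned
```

Calling this once at the API boundary means the rest of the engine only has to distinguish `None` (fallback) from a concrete model name (strict mode).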
212
 
213
  ---
214
+
215
+ ## 10. Tips & Tricks
216
+
217
+ ### Browser-Based Providers (Z.ai, Gemini, HuggingChat)
218
+ 1. **Always use headless mode on servers** - A visible (headed) browser doesn't work on Hugging Face Spaces
219
+ 2. **Handle modals** - Welcome screens block interaction, click them first
220
+ 3. **Wait for hydration** - JavaScript-heavy sites need 2-3 seconds after page load
221
+ 4. **Multiple selectors** - Try multiple input selectors (textarea, contenteditable, etc.)
222
+ 5. **Check for loading states** - Spinners/loading indicators mean content isn't ready
223
+ 6. **Use ephemeral contexts** - New context per request for isolation, but reuse cookies via Supabase
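
Tip 4 (multiple selectors) might look like the sketch below. It assumes the page object exposes an awaitable `query_selector(selector)` that returns an element or `None`, as Playwright's async `Page` does; the helper name `find_input` and the selector list are illustrative, not taken from the providers.

```python
from typing import Any, Optional, Sequence

# Candidate selectors, tried in order. These values are examples only.
INPUT_SELECTORS = ("textarea", "[contenteditable='true']", "input[type='text']")


async def find_input(page: Any, selectors: Sequence[str] = INPUT_SELECTORS) -> Optional[Any]:
    """Return the first element matched by any selector, or None if none match."""
    for selector in selectors:
        element = await page.query_selector(selector)
        if element is not None:
            return element
    return None
```

Returning `None` instead of raising lets the caller decide whether to wait for hydration and retry, or to take a screenshot and fail.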
224
+
225
+ ### Model Naming
226
+ - **Always prefix with provider name** (e.g., `huggingface-`, `gemini-`, `zai-`)
227
+ - **Use kebab-case** (e.g., `llama-3.3-70b`, not `Llama_3.3_70b`)
228
+ - **Keep it short but descriptive** (e.g., `huggingface-kimi-k2` vs `moonshotai-Kimi-K2-Instruct`)
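
The naming rules above can be applied mechanically. The helper below is a hypothetical sketch (`make_model_id` is not part of the project): it drops any organisation prefix, lowercases the name, converts separators to hyphens while keeping dots in version numbers, and prepends the provider prefix. It does not shorten names, so trimming suffixes such as `-Instruct` remains a manual choice.

```python
import re


def make_model_id(provider_prefix: str, raw_name: str) -> str:
    """Build a kebab-case model ID such as "huggingface-llama-3.3-70b"."""
    # Drop an organisation prefix like "moonshotai/" if present.
    name = raw_name.split("/")[-1].lower()
    # Replace every run of characters other than a-z, 0-9 and "." with one hyphen.
    name = re.sub(r"[^a-z0-9.]+", "-", name).strip("-")
    return f"{provider_prefix}-{name}"
```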
229
+
230
+ ### Adding New Providers
231
+ 1. Create `providers/<name>_provider.py` inheriting from `BaseProvider`
232
+ 2. Implement `send_message()`, `get_available_models()`, `is_available()`
233
+ 3. Add models to `MODEL_RANKING` and `PROVIDER_MODELS` in `config.py`
234
+ 4. Import and register in `engine.py`
235
+ 5. Add documentation to this guide (Section 3)
236
+ 6. Test locally with debug script before deploying
237
+ 7. Consider if session management (Supabase) is needed
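
Steps 1 and 2 could take the shape sketched below. The `BaseProvider` class here is a stand-in written for this example, since the guide names the three methods but not their signatures; the async `send_message(prompt, model)` signature and the `EchoProvider` class are assumptions.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class BaseProvider(ABC):
    """Stand-in for the project's base class; the real signatures may differ."""

    @abstractmethod
    async def send_message(self, prompt: str, model: Optional[str] = None) -> str:
        """Send one prompt and return the response text."""

    @abstractmethod
    def get_available_models(self) -> List[str]:
        """Return the model IDs this provider can serve."""

    @abstractmethod
    def is_available(self) -> bool:
        """Return True when the provider can run in the current environment."""


class EchoProvider(BaseProvider):
    """Hypothetical provider that echoes the prompt; useful as a template."""

    MODELS = ["echo-basic"]

    async def send_message(self, prompt: str, model: Optional[str] = None) -> str:
        chosen = model or self.MODELS[0]
        if chosen not in self.MODELS:
            raise ValueError(f"unknown model: {chosen}")
        return f"[{chosen}] {prompt}"

    def get_available_models(self) -> List[str]:
        return list(self.MODELS)

    def is_available(self) -> bool:
        return True
```

A browser-based provider would typically make `is_available()` return `False` where Chromium is missing, for example on Vercel.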
238
+
239
+ ### Testing
240
+ - **Always test locally first** with `python3 test_<provider>_browser.py`
241
+ - **Use visible browser for debugging** (`headless=False`) to see what's happening
242
+ - **Take screenshots** at each step to diagnose issues
243
+ - **Check logs** on Hugging Face Spaces for errors
244
+
245
+ ---
246
+
247
+ **Last Updated**: 2026-02-14