Tom committed on
Commit c7dcc92 · 1 Parent(s): 5a32df9

Deploy Phi-3-mini with ZeroGPU and 50 req/day limit

Files changed (5)
  1. DEPLOYMENT.md +154 -0
  2. README.md +49 -4
  3. app.py +123 -45
  4. requirements.txt +6 -0
  5. usage_tracker.py +75 -0
DEPLOYMENT.md ADDED
@@ -0,0 +1,154 @@
1
+ # 🚀 Deployment Guide for HuggingFace Space with ZeroGPU
2
+
3
+ ## ✅ Pre-Deployment Checklist
4
+
5
+ All code is ready! Here's what's configured:
6
+
7
+ - ✅ Model: `microsoft/Phi-3-mini-4k-instruct` (3.8B params)
8
+ - ✅ ZeroGPU support: Enabled with `@spaces.GPU` decorator (pattern sketched after this checklist)
9
+ - ✅ Local/Space compatibility: Auto-detects environment
10
+ - ✅ Usage tracking: 50 requests/day per user
11
+ - ✅ Requirements: All dependencies listed
12
+ - ✅ README: Updated with instructions
13
+
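The ZeroGPU item above refers to the wiring in `app.py` (full diff further down): `spaces` is imported only when present, and the GPU-bound function is wrapped with `spaces.GPU` so the same file also runs locally. A trimmed sketch of that pattern:

```python
# Trimmed sketch of the ZeroGPU wiring from app.py (see the full diff below).
try:
    import spaces              # only installed on HuggingFace Spaces
    SPACES_AVAILABLE = True
except ImportError:
    SPACES_AVAILABLE = False

def query_model_impl(message: str, language: str = "en") -> dict:
    """GPU-bound work: tokenize, model.generate, parse the JSON tool call."""
    ...

# On a Space, each call gets a GPU slice for up to 60 seconds;
# locally the undecorated function is used as-is.
if SPACES_AVAILABLE:
    query_model = spaces.GPU(duration=60)(query_model_impl)
else:
    query_model = query_model_impl
```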
14
+ ## 📋 Deployment Steps
15
+
16
+ ### Step 1: Push Code to Your Space
17
+
18
+ ```bash
19
+ cd /Users/tom/code/cojournalist-data
20
+
21
+ # If not already initialized
22
+ git init
23
+ git remote add space https://huggingface.co/spaces/YOUR_USERNAME/cojournalist-data
24
+
25
+ # Or if already connected
26
+ git add .
27
+ git commit -m "Deploy Phi-3-mini with ZeroGPU and usage tracking"
28
+ git push space main
29
+ ```
30
+
31
+ ### Step 2: Configure Space Hardware
32
+
33
+ 1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/cojournalist-data`
34
+ 2. Click **Settings** (⚙️ icon in top right)
35
+ 3. Scroll to **Hardware** section
36
+ 4. Select **ZeroGPU** from dropdown
37
+ 5. Click **Save**
38
+ 6. Space will restart automatically
39
+
40
+ ### Step 3: Wait for Build
41
+
42
+ The Space will:
43
+ 1. Install dependencies (~2-3 minutes)
44
+ 2. Download Phi-3-mini model (~1-2 minutes, 7.6GB)
45
+ 3. Load model into memory (~30 seconds)
46
+ 4. Launch Gradio interface
47
+
48
+ **Total build time: ~5-7 minutes**
49
+
50
+ ### Step 4: Test Your Space
51
+
52
+ Once running, test with these queries (or script the check, as sketched after the list):
53
+
54
+ 1. **English:** "Who are the parliamentarians from Zurich?"
55
+ 2. **German:** "Zeige mir aktuelle Abstimmungen zur Klimapolitik"
56
+ 3. **French:** "Qui sont les parlementaires de Zurich?"
57
+ 4. **Italian:** "Mostrami i voti recenti sulla politica climatica"
58
+
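You can also exercise the running Space from a script with `gradio_client`. This is only a sketch: the `api_name` and argument order are assumptions, so confirm them on the Space's "Use via API" page first.

```python
# Hypothetical smoke test for the deployed Space; api_name and argument order
# are assumptions - check the Space's "Use via API" page for the real signature.
from gradio_client import Client

client = Client("YOUR_USERNAME/cojournalist-data")
result = client.predict(
    "Who are the parliamentarians from Zurich?",  # message
    [],                                           # chat history
    "English",                                    # language
    False,                                        # show debug output
    api_name="/respond",
)
print(result)
```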
59
+ ## 🔧 Space Settings Summary
60
+
61
+ ### Hardware
62
+ - **Type:** ZeroGPU
63
+ - **Cost:** FREE (included with Team plan)
64
+ - **GPU:** Nvidia H200 (70GB VRAM)
65
+ - **Allocation:** Dynamic (only when needed)
66
+
67
+ ### Environment Variables (Optional)
68
+ If you want to configure anything:
69
+ - `HF_TOKEN`: Your HuggingFace token (for private models, not needed for Phi-3; see the sketch below)
70
+
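If you do set `HF_TOKEN`, the usual pattern is to read it from the environment and pass it to `from_pretrained`. This is only a sketch for gated or private models; the current `app.py` does not need it for Phi-3-mini:

```python
# Optional sketch: forward HF_TOKEN to the model download (gated/private models only).
import os
from transformers import AutoModelForCausalLM

hf_token = os.getenv("HF_TOKEN")   # set as a Space secret, or in .env locally
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    token=hf_token,                # ignored when None; Phi-3-mini is public
    trust_remote_code=True,
)
```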
71
+ ## 📊 Expected Behavior
72
+
73
+ ### First Request
74
+ - Takes ~5-10 seconds (GPU allocation + inference)
75
+ - Subsequent requests faster (~2-5 seconds)
76
+
77
+ ### Rate Limiting
78
+ - 50 requests per day per user IP (usage sketch after this list)
79
+ - Error message shown when limit reached
80
+ - Resets daily at midnight UTC
81
+
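The limit is enforced by the `UsageTracker` class added in `usage_tracker.py` (full listing at the end of this commit). A minimal sketch of how `app.py` consults it, keyed on the client IP (the address below is a made-up example):

```python
# Sketch of the rate-limit check performed in respond() in app.py.
from usage_tracker import UsageTracker

tracker = UsageTracker(daily_limit=50)

user_ip = "203.0.113.7"                # hypothetical client address used as the user key
if tracker.check_limit(user_ip):       # counts this request and allows it
    print(f"{tracker.get_remaining(user_ip)} requests left today")
else:
    print("Daily limit reached - try again tomorrow")
```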
82
+ ### Model Loading
83
+ - Happens once on Space startup
84
+ - Cached for subsequent requests
85
+ - No reload needed between requests
86
+
87
+ ## 🐛 Troubleshooting
88
+
89
+ ### "Model not loading"
90
+ - Check Space logs for errors
91
+ - Verify ZeroGPU is selected in Hardware settings
92
+ - Ensure `spaces>=0.28.0` in requirements.txt
93
+
94
+ ### "Out of memory"
95
+ - This shouldn't happen with ZeroGPU (70GB VRAM)
96
+ - If it does, contact HF support
97
+
98
+ ### "Rate limit not working"
99
+ - Usage tracker uses in-memory storage
100
+ - Resets on Space restart
101
+ - IP-based tracking (works in production)
102
+
103
+ ### "Slow inference"
104
+ - First request allocates GPU (slower)
105
+ - Subsequent requests use cached allocation
106
+ - Normal: 2-5 seconds per request
107
+
108
+ ## 💰 Cost Breakdown
109
+
110
+ - **Team Plan:** $20/user/month (you already have this)
111
+ - **ZeroGPU:** FREE (included)
112
+ - **Inference:** FREE (no API calls)
113
+ - **Storage:** FREE (model cached by HF)
114
+
115
+ **Total additional cost: $0/month** 🎉
116
+
117
+ ## 🔄 Updates & Maintenance
118
+
119
+ To update your Space:
120
+ ```bash
121
+ # Make changes to code
122
+ git add .
123
+ git commit -m "Update: description of changes"
124
+ git push space main
125
+ ```
126
+
127
+ Space will automatically rebuild and redeploy.
128
+
129
+ ## 📈 Monitoring Usage
130
+
131
+ Check your Space's metrics:
132
+ 1. Go to Space page
133
+ 2. Click "Analytics" tab
134
+ 3. View daily/weekly usage stats
135
+
136
+ ## 🎯 Next Steps After Deployment
137
+
138
+ 1. ✅ Test all 4 languages
139
+ 2. ✅ Verify tool calling works
140
+ 3. ✅ Check rate limiting
141
+ 4. ✅ Monitor performance
142
+ 5. 🔜 Adjust system prompt if needed
143
+ 6. 🔜 Fine-tune temperature/max_tokens if needed
144
+
145
+ ## 📞 Support
146
+
147
+ If you encounter issues:
148
+ - Check Space logs (Settings → Logs)
149
+ - HuggingFace Discord: https://discord.gg/huggingface
150
+ - HF Forums: https://discuss.huggingface.co/
151
+
152
+ ---
153
+
154
+ **You're ready to deploy! 🚀**
README.md CHANGED
@@ -1,13 +1,58 @@
1
  ---
2
  title: Cojournalist Data
3
- emoji: 🐨
4
- colorFrom: green
5
- colorTo: red
6
  sdk: gradio
7
  sdk_version: 5.49.1
8
  app_file: app.py
9
  pinned: false
10
- short_description: Data LLM for coJournalist
11
  ---
12
13
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
1
  ---
2
  title: Cojournalist Data
3
+ emoji: 🏛️
4
+ colorFrom: blue
5
+ colorTo: purple
6
  sdk: gradio
7
  sdk_version: 5.49.1
8
  app_file: app.py
9
  pinned: false
10
+ short_description: Swiss Parliamentary Data Chatbot with Phi-3-mini
11
  ---
12
 
13
+ # 🏛️ CoJournalist Data
14
+
15
+ A Swiss Parliamentary Data Chatbot powered by Phi-3-mini and the OpenParlData MCP server.
16
+
17
+ ## Features
18
+
19
+ - 🤖 **Phi-3-mini-4k-instruct** - Efficient 3.8B parameter model running on ZeroGPU
20
+ - 🌍 **Multilingual** - Support for English, German, French, and Italian
21
+ - 🛠️ **Tool Calling** - Intelligent query routing to parliamentary data APIs (expected tool-call shape shown below)
22
+ - 🔒 **Rate Limited** - 50 requests per day per user for cost control
23
+ - ⚡ **ZeroGPU** - FREE GPU inference for PRO users
24
+
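Under the hood, `app.py` prompts the model to reply with a single JSON object naming one of the OpenParlData tools. The keys (`tool`, `arguments`, `explanation`) come from the system prompt and response handling in `app.py`; the values below are illustrative only:

```json
{
  "tool": "openparldata_search_debates",
  "arguments": {"query": "climate policy", "language": "de", "limit": 10},
  "explanation": "Searching debate transcripts about climate policy"
}
```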
25
+ ## Space Settings Required
26
+
27
+ **IMPORTANT:** To run this Space, you need to configure the following in your HuggingFace Space settings:
28
+
29
+ ### 1. Hardware Selection
30
+ - Go to **Settings** → **Hardware**
31
+ - Select **ZeroGPU** (FREE for PRO users)
32
+ - Save changes
33
+
34
+ ### 2. Environment Variables (Optional)
35
+ If you want to use the OpenParlData API when it's available:
36
+ - Add `HF_TOKEN` with your HuggingFace token
37
+
38
+ ## Usage
39
+
40
+ Simply ask questions about Swiss parliamentary data in natural language:
41
+ - "Who are the parliamentarians from Zurich?"
42
+ - "Show me recent votes about climate policy"
43
+ - "What motions were submitted about healthcare in 2024?"
44
+
45
+ ## Architecture
46
+
47
+ - **Model:** microsoft/Phi-3-mini-4k-instruct (3.8B params)
48
+ - **GPU:** ZeroGPU (H200) with dynamic allocation
49
+ - **Framework:** Gradio + Transformers + PyTorch
50
+ - **MCP Integration:** OpenParlData server for parliamentary data
51
+
52
+ ## Cost
53
+
54
+ - **HF PRO:** $9/month (required for ZeroGPU)
55
+ - **Inference:** FREE (included with PRO subscription)
56
+ - **Total:** $9/month, with usage free within ZeroGPU quotas
57
+
58
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
app.py CHANGED
@@ -1,25 +1,62 @@
1
  """
2
  CoJournalist Data - Swiss Parliamentary Data Chatbot
3
- Powered by Mistral AI and OpenParlData MCP
4
  """
5
 
6
  import os
7
  import json
8
  import gradio as gr
9
- from huggingface_hub import InferenceClient
10
  from dotenv import load_dotenv
11
  from mcp_integration import execute_mcp_query, OpenParlDataClient
12
  import asyncio
 
13
 
14
  # Load environment variables
15
  load_dotenv()
16
 
17
- # Initialize Hugging Face Inference Client
18
- HF_TOKEN = os.getenv("HF_TOKEN")
19
- if not HF_TOKEN:
20
- print("Warning: HF_TOKEN not found. Please set it in .env file or Hugging Face Space secrets.")
21
 
22
- client = InferenceClient(token=HF_TOKEN)
 
 
 
23
 
24
  # Available languages
25
  LANGUAGES = {
@@ -29,7 +66,7 @@ LANGUAGES = {
29
  "Italiano": "it"
30
  }
31
 
32
- # System prompt for Mistral
33
  SYSTEM_PROMPT = """You are a helpful assistant that helps users query Swiss parliamentary data.
34
 
35
  You have access to the following tools from the OpenParlData MCP server:
@@ -52,11 +89,13 @@ You have access to the following tools from the OpenParlData MCP server:
52
  6. **openparldata_search_debates** - Search debate transcripts
53
  Parameters: query, date_from, date_to, speaker_id, language, limit
54
 
 
 
55
  When a user asks a question about Swiss parliamentary data:
56
  1. Analyze what information they need
57
  2. Determine which tool(s) to use
58
  3. Extract the relevant parameters from their question
59
- 4. Respond with a JSON object containing the tool call
60
 
61
  Your response should be in this exact format:
62
  {
@@ -116,51 +155,79 @@ EXAMPLES = {
116
  }
117
 
118
 
119
- async def query_mistral_async(message: str, language: str = "en") -> dict:
120
- """Query Mistral model to interpret user intent and determine tool calls."""
121
 
122
  try:
123
- # Create messages for chat completion
124
- messages = [
125
- {"role": "system", "content": SYSTEM_PROMPT},
126
- {"role": "user", "content": f"Language: {language}\nQuestion: {message}"}
127
- ]
128
-
129
- # Call Mistral via HuggingFace Inference API
130
- response = client.chat_completion(
131
- model="mistralai/Mistral-7B-Instruct-v0.3",
132
- messages=messages,
133
- max_tokens=500,
134
- temperature=0.3
135
- )
136
-
137
- # Extract response
138
- assistant_message = response.choices[0].message.content
 
 
 
 
139
 
140
  # Try to parse as JSON
141
  try:
142
- # Clean up response (sometimes models add markdown code blocks)
143
  clean_response = assistant_message.strip()
 
 
144
  if clean_response.startswith("```json"):
145
  clean_response = clean_response[7:]
146
- if clean_response.startswith("```"):
147
  clean_response = clean_response[3:]
 
148
  if clean_response.endswith("```"):
149
  clean_response = clean_response[:-3]
 
150
  clean_response = clean_response.strip()
151
 
 
 
 
 
 
 
 
 
152
  return json.loads(clean_response)
153
  except json.JSONDecodeError:
154
  # If not valid JSON, treat as natural language response
155
  return {"response": assistant_message}
156
 
157
  except Exception as e:
158
- return {"error": f"Error querying Mistral: {str(e)}"}
159
 
160
 
161
- def query_mistral(message: str, language: str = "en") -> dict:
162
- """Synchronous wrapper for async Mistral query."""
163
- return asyncio.run(query_mistral_async(message, language))
 
 
164
 
165
 
166
  async def execute_tool_async(tool_name: str, arguments: dict, show_debug: bool) -> tuple:
@@ -185,22 +252,22 @@ def chat_response(message: str, history: list, language: str, show_debug: bool)
185
  # Get language code
186
  lang_code = LANGUAGES.get(language, "en")
187
 
188
- # Query Mistral to interpret intent
189
- mistral_response = query_mistral(message, lang_code)
190
 
191
  # Check if it's a direct response (no tool call needed)
192
- if "response" in mistral_response:
193
- return mistral_response["response"]
194
 
195
  # Check for error
196
- if "error" in mistral_response:
197
- return f"❌ {mistral_response['error']}"
198
 
199
  # Execute tool call
200
- if "tool" in mistral_response and "arguments" in mistral_response:
201
- tool_name = mistral_response["tool"]
202
- arguments = mistral_response["arguments"]
203
- explanation = mistral_response.get("explanation", "")
204
 
205
  # Ensure language is set in arguments
206
  if "language" not in arguments:
@@ -315,10 +382,19 @@ with gr.Blocks(css=custom_css, title="CoJournalist Data") as demo:
315
  )
316
 
317
  # Handle message submission
318
- def respond(message, chat_history, language, show_debug):
319
  if not message.strip():
320
  return "", chat_history
321
 
 
 
 
 
 
 
 
 
 
322
  # Get bot response
323
  bot_message = chat_response(message, chat_history, language, show_debug)
324
 
@@ -341,7 +417,9 @@ with gr.Blocks(css=custom_css, title="CoJournalist Data") as demo:
341
  **Note:** This app uses the OpenParlData MCP server to access Swiss parliamentary data.
342
  Currently returning mock data while the OpenParlData API is in development.
343
 
344
- Powered by [Mistral AI](https://mistral.ai/) and [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)
 
 
345
  """
346
  )
347
 
 
1
  """
2
  CoJournalist Data - Swiss Parliamentary Data Chatbot
3
+ Powered by Phi-3-mini and OpenParlData MCP
4
  """
5
 
6
  import os
7
  import json
8
  import gradio as gr
 
9
  from dotenv import load_dotenv
10
  from mcp_integration import execute_mcp_query, OpenParlDataClient
11
  import asyncio
12
+ from usage_tracker import UsageTracker
13
+ import torch
14
+ from transformers import AutoModelForCausalLM, AutoTokenizer
15
+
16
+ # Import spaces only if available (for HuggingFace Spaces)
17
+ try:
18
+ import spaces
19
+ SPACES_AVAILABLE = True
20
+ except ImportError:
21
+ SPACES_AVAILABLE = False
22
+ print("Running locally without ZeroGPU support")
23
 
24
  # Load environment variables
25
  load_dotenv()
26
 
27
+ # Initialize usage tracker with 50 requests per day limit
28
+ tracker = UsageTracker(daily_limit=50)
 
 
29
 
30
+ # Initialize model and tokenizer
31
+ MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
32
+ print(f"Loading model: {MODEL_NAME}")
33
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
34
+
35
+ # Detect device (MPS for Mac, CUDA for GPU, CPU fallback)
36
+ if torch.cuda.is_available():
37
+ device = "cuda"
38
+ dtype = torch.float16
39
+ elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
40
+ device = "mps"
41
+ dtype = torch.float16
42
+ else:
43
+ device = "cpu"
44
+ dtype = torch.float32
45
+
46
+ print(f"Using device: {device}")
47
+
48
+ model = AutoModelForCausalLM.from_pretrained(
49
+ MODEL_NAME,
50
+ torch_dtype=dtype,
51
+ device_map=device if device != "mps" else None,
52
+ trust_remote_code=True
53
+ )
54
+
55
+ # Move to MPS if needed
56
+ if device == "mps":
57
+ model = model.to(device)
58
+
59
+ print(f"Model loaded successfully on {device}!")
60
 
61
  # Available languages
62
  LANGUAGES = {
 
66
  "Italiano": "it"
67
  }
68
 
69
+ # System prompt optimized for Phi-3-mini-4k-instruct
70
  SYSTEM_PROMPT = """You are a helpful assistant that helps users query Swiss parliamentary data.
71
 
72
  You have access to the following tools from the OpenParlData MCP server:
 
89
  6. **openparldata_search_debates** - Search debate transcripts
90
  Parameters: query, date_from, date_to, speaker_id, language, limit
91
 
92
+ IMPORTANT: Your response MUST be valid JSON only. Do not include any explanatory text before or after the JSON. Do not wrap your response in code blocks or markdown formatting.
93
+
94
  When a user asks a question about Swiss parliamentary data:
95
  1. Analyze what information they need
96
  2. Determine which tool(s) to use
97
  3. Extract the relevant parameters from their question
98
+ 4. Respond with ONLY a JSON object containing the tool call
99
 
100
  Your response should be in this exact format:
101
  {
 
155
  }
156
 
157
 
158
+ def query_model_impl(message: str, language: str = "en") -> dict:
159
+ """Query Phi-3-mini model to interpret user intent and determine tool calls."""
160
 
161
  try:
162
+ # Format prompt for Phi-3
163
+ prompt = f"""<|system|>
164
+ {SYSTEM_PROMPT}<|end|>
165
+ <|user|>
166
+ Language: {language}
167
+ Question: {message}<|end|>
168
+ <|assistant|>
169
+ """
170
+
171
+ # Tokenize and generate
172
+ inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=3072)
173
+ inputs = {k: v.to(model.device) for k, v in inputs.items()}
174
+
175
+ with torch.no_grad():
176
+ outputs = model.generate(
177
+ **inputs,
178
+ max_new_tokens=500,
179
+ temperature=0.3,
180
+ do_sample=True,
181
+ pad_token_id=tokenizer.eos_token_id
182
+ )
183
+
184
+ # Decode response
185
+ full_response = tokenizer.decode(outputs[0], skip_special_tokens=True)
186
+
187
+ # Extract only the assistant's response (after the last <|assistant|>)
188
+ if "<|assistant|>" in full_response:
189
+ assistant_message = full_response.split("<|assistant|>")[-1].strip()
190
+ else:
191
+ assistant_message = full_response.strip()
192
 
193
  # Try to parse as JSON
194
  try:
195
+ # Clean up response - enhanced for Phi-3 model
196
  clean_response = assistant_message.strip()
197
+
198
+ # Remove markdown code blocks
199
  if clean_response.startswith("```json"):
200
  clean_response = clean_response[7:]
201
+ elif clean_response.startswith("```"):
202
  clean_response = clean_response[3:]
203
+
204
  if clean_response.endswith("```"):
205
  clean_response = clean_response[:-3]
206
+
207
  clean_response = clean_response.strip()
208
 
209
+ # Find first { or [ (start of JSON) to handle explanatory text
210
+ json_start = min(
211
+ clean_response.find('{') if '{' in clean_response else len(clean_response),
212
+ clean_response.find('[') if '[' in clean_response else len(clean_response)
213
+ )
214
+ if json_start > 0:
215
+ clean_response = clean_response[json_start:]
216
+
217
  return json.loads(clean_response)
218
  except json.JSONDecodeError:
219
  # If not valid JSON, treat as natural language response
220
  return {"response": assistant_message}
221
 
222
  except Exception as e:
223
+ return {"error": f"Error querying model: {str(e)}"}
224
 
225
 
226
+ # Apply ZeroGPU decorator only when running on HuggingFace Spaces
227
+ if SPACES_AVAILABLE:
228
+ query_model = spaces.GPU(duration=60)(query_model_impl)
229
+ else:
230
+ query_model = query_model_impl
231
 
232
 
233
  async def execute_tool_async(tool_name: str, arguments: dict, show_debug: bool) -> tuple:
 
252
  # Get language code
253
  lang_code = LANGUAGES.get(language, "en")
254
 
255
+ # Query Phi-3 model to interpret intent
256
+ model_response = query_model(message, lang_code)
257
 
258
  # Check if it's a direct response (no tool call needed)
259
+ if "response" in model_response:
260
+ return model_response["response"]
261
 
262
  # Check for error
263
+ if "error" in model_response:
264
+ return f"❌ {model_response['error']}"
265
 
266
  # Execute tool call
267
+ if "tool" in model_response and "arguments" in model_response:
268
+ tool_name = model_response["tool"]
269
+ arguments = model_response["arguments"]
270
+ explanation = model_response.get("explanation", "")
271
 
272
  # Ensure language is set in arguments
273
  if "language" not in arguments:
 
382
  )
383
 
384
  # Handle message submission
385
+ def respond(message, chat_history, language, show_debug, request: gr.Request):
386
  if not message.strip():
387
  return "", chat_history
388
 
389
+ # Check usage limit
390
+ user_id = request.client.host if request and hasattr(request, 'client') else "unknown"
391
+
392
+ if not tracker.check_limit(user_id):
393
+ remaining = tracker.get_remaining(user_id)
394
+ bot_message = f"⚠️ Daily request limit reached. You have used all 50 requests for today. Please try again tomorrow.\n\nThis limit helps us keep the service free and available for everyone."
395
+ chat_history.append((message, bot_message))
396
+ return "", chat_history
397
+
398
  # Get bot response
399
  bot_message = chat_response(message, chat_history, language, show_debug)
400
 
 
417
  **Note:** This app uses the OpenParlData MCP server to access Swiss parliamentary data.
418
  Currently returning mock data while the OpenParlData API is in development.
419
 
420
+ **Rate Limit:** 50 requests per day per user to keep the service free and accessible.
421
+
422
+ Powered by [Phi-3-mini](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) on ZeroGPU and [Model Context Protocol (MCP)](https://modelcontextprotocol.io/)
423
  """
424
  )
425
 
requirements.txt CHANGED
@@ -6,6 +6,12 @@ gradio>=5.49.1
6
 
7
  # Hugging Face
8
  huggingface-hub>=0.22.0
9
 
10
  # MCP Support
11
  mcp>=0.1.0
 
6
 
7
  # Hugging Face
8
  huggingface-hub>=0.22.0
9
+ transformers>=4.40.0
10
+ torch>=2.0.0
11
+ accelerate>=0.20.0
12
+
13
+ # ZeroGPU support (required for HuggingFace Spaces deployment)
14
+ spaces>=0.28.0
15
 
16
  # MCP Support
17
  mcp>=0.1.0
usage_tracker.py ADDED
@@ -0,0 +1,75 @@
1
+ """
2
+ Usage tracking module for rate limiting API requests.
3
+
4
+ This module provides a simple in-memory usage tracker that limits
5
+ the number of requests per user per day.
6
+ """
7
+
8
+ from datetime import date, datetime
9
+ from typing import Dict
10
+
11
+
12
+ class UsageTracker:
13
+ """Track and limit user requests on a daily basis."""
14
+
15
+ def __init__(self, daily_limit: int = 100):
16
+ """
17
+ Initialize the usage tracker.
18
+
19
+ Args:
20
+ daily_limit: Maximum number of requests per user per day
21
+ """
22
+ self.daily_limit = daily_limit
23
+ self.usage: Dict[date, Dict[str, int]] = {}
24
+
25
+ def check_limit(self, user_id: str) -> bool:
26
+ """
27
+ Check if user has exceeded their daily limit and increment counter.
28
+
29
+ Args:
30
+ user_id: Unique identifier for the user (typically IP address)
31
+
32
+ Returns:
33
+ True if request is allowed, False if limit exceeded
34
+ """
35
+ today = datetime.now().date()
36
+
37
+ # Clean up old dates to prevent memory growth
38
+ if today not in self.usage:
39
+ self.usage = {today: {}}
40
+
41
+ # Get current usage count for this user
42
+ user_count = self.usage[today].get(user_id, 0)
43
+
44
+ # Check if limit exceeded
45
+ if user_count >= self.daily_limit:
46
+ return False
47
+
48
+ # Increment counter
49
+ self.usage[today][user_id] = user_count + 1
50
+ return True
51
+
52
+ def get_usage(self, user_id: str) -> int:
53
+ """
54
+ Get current usage count for a user today.
55
+
56
+ Args:
57
+ user_id: Unique identifier for the user
58
+
59
+ Returns:
60
+ Number of requests made today
61
+ """
62
+ today = datetime.now().date()
63
+ return self.usage.get(today, {}).get(user_id, 0)
64
+
65
+ def get_remaining(self, user_id: str) -> int:
66
+ """
67
+ Get remaining requests for a user today.
68
+
69
+ Args:
70
+ user_id: Unique identifier for the user
71
+
72
+ Returns:
73
+ Number of requests remaining today
74
+ """
75
+ return max(0, self.daily_limit - self.get_usage(user_id))