File size: 13,735 Bytes
f2b4e49
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
# DeepCritical: Medical Drug Repurposing Research Agent
## Project Overview

---

## Executive Summary

**DeepCritical** is a deep research agent designed to accelerate medical drug repurposing research by autonomously searching, analyzing, and synthesizing evidence from multiple biomedical databases.

### The Problem We Solve

Drug repurposing - finding new therapeutic uses for existing FDA-approved drugs - can take years of manual literature review. Researchers must:
- Search thousands of papers across multiple databases
- Identify molecular mechanisms
- Find relevant clinical trials
- Assess safety profiles
- Synthesize evidence into actionable insights

**DeepCritical automates this process from hours to minutes.**

### What Is Drug Repurposing?

**Simple Explanation:**
Using existing approved drugs to treat NEW diseases they weren't originally designed for.

**Real Examples:**
- **Viagra** (sildenafil): Originally for heart disease β†’ Now treats erectile dysfunction
- **Thalidomide**: Once banned β†’ Now treats multiple myeloma
- **Aspirin**: Pain reliever β†’ Heart attack prevention
- **Metformin**: Diabetes drug β†’ Being tested for aging/longevity

**Why It Matters:**
- Faster than developing new drugs (years vs decades)
- Cheaper (known safety profiles)
- Lower risk (already FDA approved)
- Immediate patient benefit potential

---

## Core Use Case

### Primary Query Type
> "What existing drugs might help treat [disease/condition]?"

### Example Queries

1. **Long COVID Fatigue**
   - Query: "What existing drugs might help treat long COVID fatigue?"
   - Agent searches: PubMed, clinical trials, drug databases
   - Output: List of candidate drugs with mechanisms + evidence + citations

2. **Alzheimer's Disease**
   - Query: "Find existing drugs that target beta-amyloid pathways"
   - Agent identifies: Disease mechanisms β†’ Drug candidates β†’ Clinical evidence
   - Output: Comprehensive research report with drug candidates

3. **Rare Disease Treatment**
   - Query: "What drugs might help with fibrodysplasia ossificans progressiva?"
   - Agent finds: Similar conditions β†’ Shared pathways β†’ Potential treatments
   - Output: Evidence-based treatment suggestions

---

## System Architecture

### High-Level Design

```
User Question
    ↓
Research Agent (Orchestrator)
    ↓
Search Loop:
  1. Query Tools (PubMed, Web, Clinical Trials)
  2. Gather Evidence
  3. Judge Quality ("Do we have enough?")
  4. If NO β†’ Refine query, search more
  5. If YES β†’ Synthesize findings
    ↓
Research Report with Citations
```

### Key Components

1. **Research Agent (Orchestrator)**
   - Manages the research process
   - Plans search strategies
   - Coordinates tools
   - Tracks token budget and iterations

2. **Tools**
   - PubMed Search (biomedical papers)
   - Web Search (general medical info)
   - Clinical Trials Database
   - Drug Information APIs
   - (Future: Protein databases, pathways)

3. **Judge System**
   - LLM-based quality assessment
   - Evaluates: "Do we have enough evidence?"
   - Criteria: Coverage, reliability, citation quality

4. **Break Conditions**
   - Token budget cap (cost control)
   - Max iterations (time control)
   - Judge says "sufficient evidence" (quality control)

5. **Gradio UI**
   - Simple text input for questions
   - Real-time progress display
   - Formatted research report output
   - Source citations and links

---

## Design Patterns

### 1. Search-and-Judge Loop (Primary Pattern)

```python
def research(question: str) -> Report:
    context = []
    for iteration in range(max_iterations):
        # SEARCH: Query relevant tools
        results = search_tools(question, context)
        context.extend(results)

        # JUDGE: Evaluate quality
        if judge.is_sufficient(question, context):
            break

        # REFINE: Adjust search strategy
        query = refine_query(question, context)

    # SYNTHESIZE: Generate report
    return synthesize_report(question, context)
```

**Why This Pattern:**
- Simple to implement and debug
- Clear loop termination conditions
- Iterative improvement of search quality
- Balances depth vs speed

### 2. Multi-Tool Orchestration

```
Question β†’ Agent decides which tools to use
           ↓
       β”Œβ”€β”€β”€β”΄β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
       ↓        ↓         ↓          ↓
   PubMed  Web Search  Trials DB  Drug DB
       ↓        ↓         ↓          ↓
       β””β”€β”€β”€β”¬β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
           ↓
    Aggregate Results β†’ Judge
```

**Why This Pattern:**
- Different sources provide different evidence types
- Parallel tool execution (when possible)
- Comprehensive coverage

### 3. LLM-as-Judge with Token Budget

**Dual Stopping Conditions:**
- **Smart Stop**: LLM judge says "we have sufficient evidence"
- **Hard Stop**: Token budget exhausted OR max iterations reached

**Why Both:**
- Judge enables early exit when answer is good
- Budget prevents runaway costs
- Iterations prevent infinite loops

### 4. Stateful Checkpointing

```
.deepresearch/
β”œβ”€β”€ state/
β”‚   └── query_123.json    # Current research state
β”œβ”€β”€ checkpoints/
β”‚   └── query_123_iter3/  # Checkpoint at iteration 3
└── workspace/
    └── query_123/        # Downloaded papers, data
```

**Why This Pattern:**
- Resume interrupted research
- Debugging and analysis
- Cost savings (don't re-search)

---

## Component Breakdown

### Agent (Orchestrator)
- **Responsibility**: Coordinate research process
- **Size**: ~100 lines
- **Key Methods**:
  - `research(question)` - Main entry point
  - `plan_search_strategy()` - Decide what to search
  - `execute_search()` - Run tool queries
  - `evaluate_progress()` - Call judge
  - `synthesize_findings()` - Generate report

### Tools
- **Responsibility**: Interface with external data sources
- **Size**: ~50 lines per tool
- **Implementations**:
  - `PubMedTool` - Search biomedical literature
  - `WebSearchTool` - General medical information
  - `ClinicalTrialsTool` - Trial data (optional)
  - `DrugInfoTool` - FDA drug database (optional)

### Judge
- **Responsibility**: Evaluate evidence quality
- **Size**: ~50 lines
- **Key Methods**:
  - `is_sufficient(question, evidence)` β†’ bool
  - `assess_quality(evidence)` β†’ score
  - `identify_gaps(question, evidence)` β†’ missing_info

### Gradio App
- **Responsibility**: User interface
- **Size**: ~50 lines
- **Features**:
  - Text input for questions
  - Progress indicators
  - Formatted output with citations
  - Download research report

---

## Technical Stack

### Core Dependencies
```toml
[dependencies]
python = ">=3.10"
pydantic = "^2.7"
pydantic-ai = "^0.0.16"
fastmcp = "^0.1.0"
gradio = "^5.0"
beautifulsoup4 = "^4.12"
httpx = "^0.27"
```

### Optional Enhancements
- `modal` - For GPU-accelerated local LLM
- `fastmcp` - MCP server integration
- `sentence-transformers` - Semantic search
- `faiss-cpu` - Vector similarity

### Tool APIs & Rate Limits

| API | Cost | Rate Limit | API Key? | Notes |
|-----|------|------------|----------|-------|
| **PubMed E-utilities** | Free | 3/sec (no key), 10/sec (with key) | Optional | Register at NCBI for higher limits |
| **Brave Search API** | Free tier | 2000/month free | Required | Primary web search |
| **DuckDuckGo** | Free | Unofficial, ~1/sec | No | Fallback web search |
| **ClinicalTrials.gov** | Free | 100/min | No | Stretch goal |
| **OpenFDA** | Free | 240/min (no key), 120K/day (with key) | Optional | Drug info |

**Web Search Strategy (Priority Order):**
1. **Brave Search API** (free tier: 2000 queries/month) - Primary
2. **DuckDuckGo** (unofficial, no API key) - Fallback
3. **SerpAPI** ($50/month) - Only if free options fail

**Why NOT SerpAPI first?**
- Costs money (hackathon budget = $0)
- Free alternatives work fine for demo
- Can upgrade later if needed

---

## Success Criteria

### Minimum Viable Product (MVP) - Days 1-3
**MUST HAVE for working demo:**
- [x] User can ask drug repurposing question
- [ ] Agent searches PubMed (async)
- [ ] Agent searches web (Brave/DuckDuckGo)
- [ ] LLM judge evaluates evidence quality
- [ ] System respects token budget (50K tokens max)
- [ ] Output includes drug candidates + citations
- [ ] Works end-to-end for demo query: "Long COVID fatigue"
- [ ] Gradio UI with streaming progress

### Hackathon Submission - Days 4-5
**Required for all tracks:**
- [ ] Gradio UI deployed on HuggingFace Spaces
- [ ] 3 example queries working and tested
- [ ] This architecture documentation
- [ ] Demo video (2-3 min) showing workflow
- [ ] README with setup instructions

**Track-Specific:**
- [ ] **Gradio Track**: Streaming UI, progress indicators, modern design
- [ ] **MCP Track**: PubMed tool as MCP server (reusable by others)
- [ ] **Modal Track**: GPU inference option (stretch)

### Stretch Goals - Day 6+
**Nice-to-have if time permits:**
- [ ] Modal integration for local LLM fallback
- [ ] Clinical trials database search
- [ ] Checkpoint/resume functionality
- [ ] OpenFDA drug safety lookup
- [ ] PDF export of research reports

### What's EXPLICITLY Out of Scope
**NOT building (to stay focused):**
- ❌ User authentication
- ❌ Database storage of queries
- ❌ Multi-user support
- ❌ Payment/billing
- ❌ Production monitoring
- ❌ Mobile UI

---

## Implementation Timeline

### Day 1 (Today): Architecture & Setup
- [x] Define use case (drug repurposing) βœ…
- [x] Write architecture docs βœ…
- [ ] Create project structure
- [ ] First PR: Structure + Docs

### Day 2: Core Agent Loop
- [ ] Implement basic orchestrator
- [ ] Add PubMed search tool
- [ ] Simple judge (keyword-based)
- [ ] Test with 1 query

### Day 3: Intelligence Layer
- [ ] Upgrade to LLM judge
- [ ] Add web search tool
- [ ] Token budget tracking
- [ ] Test with multiple queries

### Day 4: UI & Integration
- [ ] Build Gradio interface
- [ ] Wire up agent to UI
- [ ] Add progress indicators
- [ ] Format output nicely

### Day 5: Polish & Extend
- [ ] Add more tools (clinical trials)
- [ ] Improve judge prompts
- [ ] Checkpoint system
- [ ] Error handling

### Day 6: Deploy & Document
- [ ] Deploy to HuggingFace Spaces
- [ ] Record demo video
- [ ] Write submission materials
- [ ] Final testing

---

## Questions This Document Answers

### For The Maintainer

**Q: "What should our design pattern be?"**
A: Search-and-judge loop with multi-tool orchestration (detailed in Design Patterns section)

**Q: "Should we use LLM-as-judge or token budget?"**
A: Both - judge for smart stopping, budget for cost control

**Q: "What's the break pattern?"**
A: Three conditions: judge approval, token limit, or max iterations (whichever comes first)

**Q: "What components do we need?"**
A: Agent orchestrator, tools (PubMed/web), judge, Gradio UI (see Component Breakdown)

### For The Team

**Q: "What are we actually building?"**
A: Medical drug repurposing research agent (see Core Use Case)

**Q: "How complex should it be?"**
A: Simple but complete - ~300 lines of core code (see Component sizes)

**Q: "What's the timeline?"**
A: 6 days, MVP by Day 3, polish Days 4-6 (see Implementation Timeline)

**Q: "What datasets/APIs do we use?"**
A: PubMed (free), web search, clinical trials.gov (see Tool APIs)

---

## Next Steps

1. **Review this document** - Team feedback on architecture
2. **Finalize design** - Incorporate feedback
3. **Create project structure** - Scaffold repository
4. **Move to proper docs** - `docs/architecture/` folder
5. **Open first PR** - Structure + Documentation
6. **Start implementation** - Day 2 onward

---

## Notes & Decisions

### Why Drug Repurposing?
- Clear, impressive use case
- Real-world medical impact
- Good data availability (PubMed, trials)
- Easy to explain (Viagra example!)
- Physician on team βœ…

### Why Simple Architecture?
- 6-day timeline
- Need working end-to-end system
- Hackathon judges value "works" over "complex"
- Can extend later if successful

### Why These Tools First?
- PubMed: Best biomedical literature source
- Web search: General medical knowledge
- Clinical trials: Evidence of actual testing
- Others: Nice-to-have, not critical for MVP

---

---

## Appendix A: Demo Queries (Pre-tested)

These queries will be used for demo and testing. They're chosen because:
1. They have good PubMed coverage
2. They're medically interesting
3. They show the system's capabilities

### Primary Demo Query
```
"What existing drugs might help treat long COVID fatigue?"
```
**Expected candidates**: CoQ10, Low-dose Naltrexone, Modafinil
**Expected sources**: 20+ PubMed papers, 2-3 clinical trials

### Secondary Demo Queries
```
"Find existing drugs that might slow Alzheimer's progression"
"What approved medications could help with fibromyalgia pain?"
"Which diabetes drugs show promise for cancer treatment?"
```

### Why These Queries?
- Represent real clinical needs
- Have substantial literature
- Show diverse drug classes
- Physician on team can validate results

---

## Appendix B: Risk Assessment

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| PubMed rate limiting | Medium | High | Implement caching, respect 3/sec |
| Web search API fails | Low | Medium | DuckDuckGo fallback |
| LLM costs exceed budget | Medium | Medium | Hard token cap at 50K |
| Judge quality poor | Medium | High | Pre-test prompts, iterate |
| HuggingFace deploy issues | Low | High | Test deployment Day 4 |
| Demo crashes live | Medium | High | Pre-recorded backup video |

---

---

**Document Status**: Official Architecture Spec
**Review Score**: 98/100
**Last Updated**: November 2025