Joseph Pollack committed
Commit 026ee5d · 1 Parent(s): 016b413

Restore recent changes

docs/api/agents.md ADDED
@@ -0,0 +1,270 @@
1
+ # Agents API Reference
2
+
3
+ This page documents the API for DeepCritical agents.
4
+
5
+ ## KnowledgeGapAgent
6
+
7
+ **Module**: `src.agents.knowledge_gap`
8
+
9
+ **Purpose**: Evaluates research state and identifies knowledge gaps.
10
+
11
+ ### Methods
12
+
13
+ #### `evaluate`
14
+
15
+ ```python
16
+ async def evaluate(
17
+ self,
18
+ query: str,
19
+ background_context: str,
20
+ conversation_history: Conversation,
21
+ iteration: int,
22
+ time_elapsed_minutes: float,
23
+ max_time_minutes: float
24
+ ) -> KnowledgeGapOutput
25
+ ```
26
+
27
+ Evaluates research completeness and identifies outstanding knowledge gaps.
28
+
29
+ **Parameters**:
30
+ - `query`: Research query string
31
+ - `background_context`: Background context for the query
32
+ - `conversation_history`: Conversation history with previous iterations
33
+ - `iteration`: Current iteration number
34
+ - `time_elapsed_minutes`: Elapsed time in minutes
35
+ - `max_time_minutes`: Maximum time limit in minutes
36
+
37
+ **Returns**: `KnowledgeGapOutput` with:
38
+ - `research_complete`: Boolean indicating if research is complete
39
+ - `outstanding_gaps`: List of remaining knowledge gaps
40
+
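
A minimal usage sketch based on the signature above (the keyword-argument call and the empty `Conversation` for a first iteration are assumptions, not taken from the source):

```python
import asyncio

from src.agent_factory.agents import create_knowledge_gap_agent
from src.utils.models import Conversation


async def main() -> None:
    # The factory falls back to get_model() from settings when no model is passed.
    agent = create_knowledge_gap_agent()
    result = await agent.evaluate(
        query="Does metformin reduce cancer incidence in diabetic patients?",
        background_context="",
        conversation_history=Conversation(),  # no prior iterations yet
        iteration=1,
        time_elapsed_minutes=0.0,
        max_time_minutes=10.0,
    )
    if not result.research_complete:
        print("Outstanding gaps:", result.outstanding_gaps)


asyncio.run(main())
```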
41
+ ## ToolSelectorAgent
42
+
43
+ **Module**: `src.agents.tool_selector`
44
+
45
+ **Purpose**: Selects appropriate tools for addressing knowledge gaps.
46
+
47
+ ### Methods
48
+
49
+ #### `select_tools`
50
+
51
+ ```python
52
+ async def select_tools(
53
+ self,
54
+ query: str,
55
+ knowledge_gaps: list[str],
56
+ available_tools: list[str]
57
+ ) -> AgentSelectionPlan
58
+ ```
59
+
60
+ Selects tools for addressing knowledge gaps.
61
+
62
+ **Parameters**:
63
+ - `query`: Research query string
64
+ - `knowledge_gaps`: List of knowledge gaps to address
65
+ - `available_tools`: List of available tool names
66
+
67
+ **Returns**: `AgentSelectionPlan` with list of `AgentTask` objects.
68
+
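
A sketch of a `select_tools` call (the values passed in `available_tools` mirror the tool names documented in the Tools API; treat the exact strings as illustrative):

```python
from src.agent_factory.agents import create_tool_selector_agent


async def plan_next_step() -> None:
    selector = create_tool_selector_agent()
    plan = await selector.select_tools(
        query="Does metformin reduce cancer incidence?",
        knowledge_gaps=["No clinical trial evidence collected yet"],
        available_tools=["pubmed", "clinicaltrials", "europepmc", "rag"],
    )
    # Each entry in plan.tasks is an AgentTask (agent_name, query, context).
    for task in plan.tasks:
        print(task.agent_name, "->", task.query)
```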
69
+ ## WriterAgent
70
+
71
+ **Module**: `src.agents.writer`
72
+
73
+ **Purpose**: Generates final reports from research findings.
74
+
75
+ ### Methods
76
+
77
+ #### `write_report`
78
+
79
+ ```python
80
+ async def write_report(
81
+ self,
82
+ query: str,
83
+ findings: str,
84
+ output_length: str = "medium",
85
+ output_instructions: str | None = None
86
+ ) -> str
87
+ ```
88
+
89
+ Generates a markdown report from research findings.
90
+
91
+ **Parameters**:
92
+ - `query`: Research query string
93
+ - `findings`: Research findings to include in report
94
+ - `output_length`: Desired output length ("short", "medium", "long")
95
+ - `output_instructions`: Additional instructions for report generation
96
+
97
+ **Returns**: Markdown string with numbered citations.
98
+
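
A usage sketch (the findings string and instructions are illustrative; the keyword-argument call is an assumption):

```python
import asyncio

from src.agent_factory.agents import create_writer_agent


async def main() -> None:
    writer = create_writer_agent()
    report = await writer.write_report(
        query="Does metformin reduce cancer incidence?",
        findings="[1] Cohort study A reported reduced incidence. [2] Trial B was inconclusive.",
        output_length="short",  # "short", "medium", or "long"
        output_instructions="Include a limitations paragraph.",
    )
    print(report)  # markdown with numbered citations


asyncio.run(main())
```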
99
+ ## LongWriterAgent
100
+
101
+ **Module**: `src.agents.long_writer`
102
+
103
+ **Purpose**: Long-form report generation with section-by-section writing.
104
+
105
+ ### Methods
106
+
107
+ #### `write_next_section`
108
+
109
+ ```python
110
+ async def write_next_section(
111
+ self,
112
+ query: str,
113
+ draft: ReportDraft,
114
+ section_title: str,
115
+ section_content: str
116
+ ) -> LongWriterOutput
117
+ ```
118
+
119
+ Writes the next section of a long-form report.
120
+
121
+ **Parameters**:
122
+ - `query`: Research query string
123
+ - `draft`: Current report draft
124
+ - `section_title`: Title of the section to write
125
+ - `section_content`: Content/guidance for the section
126
+
127
+ **Returns**: `LongWriterOutput` with updated draft.
128
+
129
+ #### `write_report`
130
+
131
+ ```python
132
+ async def write_report(
133
+ self,
134
+ query: str,
135
+ report_title: str,
136
+ report_draft: ReportDraft
137
+ ) -> str
138
+ ```
139
+
140
+ Generates final report from draft.
141
+
142
+ **Parameters**:
143
+ - `query`: Research query string
144
+ - `report_title`: Title of the report
145
+ - `report_draft`: Complete report draft
146
+
147
+ **Returns**: Final markdown report string.
148
+
149
+ ## ProofreaderAgent
150
+
151
+ **Module**: `src.agents.proofreader`
152
+
153
+ **Purpose**: Proofreads and polishes report drafts.
154
+
155
+ ### Methods
156
+
157
+ #### `proofread`
158
+
159
+ ```python
160
+ async def proofread(
161
+ self,
162
+ query: str,
163
+ report_title: str,
164
+ report_draft: ReportDraft
165
+ ) -> str
166
+ ```
167
+
168
+ Proofreads and polishes a report draft.
169
+
170
+ **Parameters**:
171
+ - `query`: Research query string
172
+ - `report_title`: Title of the report
173
+ - `report_draft`: Report draft to proofread
174
+
175
+ **Returns**: Polished markdown string.
176
+
177
+ ## ThinkingAgent
178
+
179
+ **Module**: `src.agents.thinking`
180
+
181
+ **Purpose**: Generates observations from conversation history.
182
+
183
+ ### Methods
184
+
185
+ #### `generate_observations`
186
+
187
+ ```python
188
+ async def generate_observations(
189
+ self,
190
+ query: str,
191
+ background_context: str,
192
+ conversation_history: Conversation
193
+ ) -> str
194
+ ```
195
+
196
+ Generates observations from conversation history.
197
+
198
+ **Parameters**:
199
+ - `query`: Research query string
200
+ - `background_context`: Background context
201
+ - `conversation_history`: Conversation history
202
+
203
+ **Returns**: Observation string.
204
+
205
+ ## InputParserAgent
206
+
207
+ **Module**: `src.agents.input_parser`
208
+
209
+ **Purpose**: Parses and improves user queries, detects research mode.
210
+
211
+ ### Methods
212
+
213
+ #### `parse_query`
214
+
215
+ ```python
216
+ async def parse_query(
217
+ self,
218
+ query: str
219
+ ) -> ParsedQuery
220
+ ```
221
+
222
+ Parses and improves a user query.
223
+
224
+ **Parameters**:
225
+ - `query`: Original query string
226
+
227
+ **Returns**: `ParsedQuery` with:
228
+ - `original_query`: Original query string
229
+ - `improved_query`: Refined query string
230
+ - `research_mode`: "iterative" or "deep"
231
+ - `key_entities`: List of key entities
232
+ - `research_questions`: List of research questions
233
+
234
+ ## Factory Functions
235
+
236
+ All agents have factory functions in `src.agent_factory.agents`:
237
+
238
+ ```python
239
+ def create_knowledge_gap_agent(model: Any | None = None) -> KnowledgeGapAgent
240
+ def create_tool_selector_agent(model: Any | None = None) -> ToolSelectorAgent
241
+ def create_writer_agent(model: Any | None = None) -> WriterAgent
242
+ def create_long_writer_agent(model: Any | None = None) -> LongWriterAgent
243
+ def create_proofreader_agent(model: Any | None = None) -> ProofreaderAgent
244
+ def create_thinking_agent(model: Any | None = None) -> ThinkingAgent
245
+ def create_input_parser_agent(model: Any | None = None) -> InputParserAgent
246
+ ```
247
+
248
+ **Parameters**:
249
+ - `model`: Optional Pydantic AI model. If None, uses `get_model()` from settings.
250
+
251
+ **Returns**: Agent instance.
252
+
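
For example, the default settings-driven model versus an explicitly passed model (the `get_model` import path follows the architecture docs; otherwise treat it as an assumption):

```python
from src.agent_factory.agents import create_writer_agent
from src.agent_factory.judges import get_model

# Default: the factory resolves the model from settings via get_model().
writer = create_writer_agent()

# Explicit: pass any Pydantic AI-compatible model object instead.
writer_with_model = create_writer_agent(model=get_model())
```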
253
+ ## See Also
254
+
255
+ - [Architecture - Agents](../architecture/agents.md) - Architecture overview
256
+ - [Models API](models.md) - Data models used by agents
257
+
docs/api/models.md ADDED
@@ -0,0 +1,248 @@
1
+ # Models API Reference
2
+
3
+ This page documents the Pydantic models used throughout DeepCritical.
4
+
5
+ ## Evidence
6
+
7
+ **Module**: `src.utils.models`
8
+
9
+ **Purpose**: Represents evidence from search results.
10
+
11
+ ```python
12
+ class Evidence(BaseModel):
13
+ citation: Citation
14
+ content: str
15
+ relevance_score: float = Field(ge=0.0, le=1.0)
16
+ metadata: dict[str, Any] = Field(default_factory=dict)
17
+ ```
18
+
19
+ **Fields**:
20
+ - `citation`: Citation information (title, URL, date, authors)
21
+ - `content`: Evidence text content
22
+ - `relevance_score`: Relevance score (0.0-1.0)
23
+ - `metadata`: Additional metadata dictionary
24
+
25
+ ## Citation
26
+
27
+ **Module**: `src.utils.models`
28
+
29
+ **Purpose**: Citation information for evidence.
30
+
31
+ ```python
32
+ class Citation(BaseModel):
33
+ title: str
34
+ url: str
35
+ date: str | None = None
36
+ authors: list[str] = Field(default_factory=list)
37
+ ```
38
+
39
+ **Fields**:
40
+ - `title`: Article/trial title
41
+ - `url`: Source URL
42
+ - `date`: Publication date (optional)
43
+ - `authors`: List of authors (optional)
44
+
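
Constructing the two models together (all field values below are illustrative):

```python
from src.utils.models import Citation, Evidence

citation = Citation(
    title="Metformin and cancer incidence: a cohort study",
    url="https://pubmed.ncbi.nlm.nih.gov/12345678/",
    date="2021-03-01",
    authors=["Smith J", "Doe A"],
)

evidence = Evidence(
    citation=citation,
    content="The cohort showed a modest reduction in cancer incidence ...",
    relevance_score=0.82,  # must fall within 0.0-1.0
    metadata={"source": "pubmed"},
)
```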
45
+ ## KnowledgeGapOutput
46
+
47
+ **Module**: `src.utils.models`
48
+
49
+ **Purpose**: Output from knowledge gap evaluation.
50
+
51
+ ```python
52
+ class KnowledgeGapOutput(BaseModel):
53
+ research_complete: bool
54
+ outstanding_gaps: list[str] = Field(default_factory=list)
55
+ ```
56
+
57
+ **Fields**:
58
+ - `research_complete`: Boolean indicating if research is complete
59
+ - `outstanding_gaps`: List of remaining knowledge gaps
60
+
61
+ ## AgentSelectionPlan
62
+
63
+ **Module**: `src.utils.models`
64
+
65
+ **Purpose**: Plan for tool/agent selection.
66
+
67
+ ```python
68
+ class AgentSelectionPlan(BaseModel):
69
+ tasks: list[AgentTask] = Field(default_factory=list)
70
+ ```
71
+
72
+ **Fields**:
73
+ - `tasks`: List of agent tasks to execute
74
+
75
+ ## AgentTask
76
+
77
+ **Module**: `src.utils.models`
78
+
79
+ **Purpose**: Individual agent task.
80
+
81
+ ```python
82
+ class AgentTask(BaseModel):
83
+ agent_name: str
84
+ query: str
85
+ context: dict[str, Any] = Field(default_factory=dict)
86
+ ```
87
+
88
+ **Fields**:
89
+ - `agent_name`: Name of agent to use
90
+ - `query`: Task query
91
+ - `context`: Additional context dictionary
92
+
93
+ ## ReportDraft
94
+
95
+ **Module**: `src.utils.models`
96
+
97
+ **Purpose**: Draft structure for long-form reports.
98
+
99
+ ```python
100
+ class ReportDraft(BaseModel):
101
+ title: str
102
+ sections: list[ReportSection] = Field(default_factory=list)
103
+ references: list[Citation] = Field(default_factory=list)
104
+ ```
105
+
106
+ **Fields**:
107
+ - `title`: Report title
108
+ - `sections`: List of report sections
109
+ - `references`: List of citations
110
+
111
+ ## ReportSection
112
+
113
+ **Module**: `src.utils.models`
114
+
115
+ **Purpose**: Individual section in a report draft.
116
+
117
+ ```python
118
+ class ReportSection(BaseModel):
119
+ title: str
120
+ content: str
121
+ order: int
122
+ ```
123
+
124
+ **Fields**:
125
+ - `title`: Section title
126
+ - `content`: Section content
127
+ - `order`: Section order number
128
+
129
+ ## ParsedQuery
130
+
131
+ **Module**: `src.utils.models`
132
+
133
+ **Purpose**: Parsed and improved query.
134
+
135
+ ```python
136
+ class ParsedQuery(BaseModel):
137
+ original_query: str
138
+ improved_query: str
139
+ research_mode: Literal["iterative", "deep"]
140
+ key_entities: list[str] = Field(default_factory=list)
141
+ research_questions: list[str] = Field(default_factory=list)
142
+ ```
143
+
144
+ **Fields**:
145
+ - `original_query`: Original query string
146
+ - `improved_query`: Refined query string
147
+ - `research_mode`: Research mode ("iterative" or "deep")
148
+ - `key_entities`: List of key entities
149
+ - `research_questions`: List of research questions
150
+
151
+ ## Conversation
152
+
153
+ **Module**: `src.utils.models`
154
+
155
+ **Purpose**: Conversation history with iterations.
156
+
157
+ ```python
158
+ class Conversation(BaseModel):
159
+ iterations: list[IterationData] = Field(default_factory=list)
160
+ ```
161
+
162
+ **Fields**:
163
+ - `iterations`: List of iteration data
164
+
165
+ ## IterationData
166
+
167
+ **Module**: `src.utils.models`
168
+
169
+ **Purpose**: Data for a single iteration.
170
+
171
+ ```python
172
+ class IterationData(BaseModel):
173
+ iteration: int
174
+ observations: str | None = None
175
+ knowledge_gaps: list[str] = Field(default_factory=list)
176
+ tool_calls: list[dict[str, Any]] = Field(default_factory=list)
177
+ findings: str | None = None
178
+ thoughts: str | None = None
179
+ ```
180
+
181
+ **Fields**:
182
+ - `iteration`: Iteration number
183
+ - `observations`: Generated observations
184
+ - `knowledge_gaps`: Identified knowledge gaps
185
+ - `tool_calls`: Tool calls made
186
+ - `findings`: Findings from tools
187
+ - `thoughts`: Agent thoughts
188
+
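
A sketch of building a one-iteration history (values are illustrative; both models are constructed in a single pass rather than mutated, since project models are typically frozen):

```python
from src.utils.models import Conversation, IterationData

iteration = IterationData(
    iteration=1,
    observations="Initial search surfaced mostly observational studies.",
    knowledge_gaps=["No randomized trial evidence collected yet"],
    tool_calls=[{"tool": "pubmed", "query": "metformin cancer incidence"}],
    findings="Three cohort studies report reduced incidence.",
    thoughts="Prioritize trial registries in the next iteration.",
)

conversation = Conversation(iterations=[iteration])
```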
189
+ ## AgentEvent
190
+
191
+ **Module**: `src.utils.models`
192
+
193
+ **Purpose**: Event emitted during research execution.
194
+
195
+ ```python
196
+ class AgentEvent(BaseModel):
197
+ type: str
198
+ iteration: int | None = None
199
+ data: dict[str, Any] = Field(default_factory=dict)
200
+ ```
201
+
202
+ **Fields**:
203
+ - `type`: Event type (e.g., "started", "search_complete", "complete")
204
+ - `iteration`: Iteration number (optional)
205
+ - `data`: Event data dictionary
206
+
207
+ ## BudgetStatus
208
+
209
+ **Module**: `src.utils.models`
210
+
211
+ **Purpose**: Current budget status.
212
+
213
+ ```python
214
+ class BudgetStatus(BaseModel):
215
+ tokens_used: int
216
+ tokens_limit: int
217
+ time_elapsed_seconds: float
218
+ time_limit_seconds: float
219
+ iterations: int
220
+ iterations_limit: int
221
+ ```
222
+
223
+ **Fields**:
224
+ - `tokens_used`: Tokens used so far
225
+ - `tokens_limit`: Token limit
226
+ - `time_elapsed_seconds`: Elapsed time in seconds
227
+ - `time_limit_seconds`: Time limit in seconds
228
+ - `iterations`: Current iteration count
229
+ - `iterations_limit`: Iteration limit
230
+
231
+ ## See Also
232
+
233
+ - [Architecture - Agents](../architecture/agents.md) - How models are used
234
+ - [Configuration](../configuration/index.md) - Model configuration
235
+
docs/api/orchestrators.md ADDED
@@ -0,0 +1,195 @@
1
+ # Orchestrators API Reference
2
+
3
+ This page documents the API for DeepCritical orchestrators.
4
+
5
+ ## IterativeResearchFlow
6
+
7
+ **Module**: `src.orchestrator.research_flow`
8
+
9
+ **Purpose**: Single-loop research with search-judge-synthesize cycles.
10
+
11
+ ### Methods
12
+
13
+ #### `run`
14
+
15
+ ```python
16
+ async def run(
17
+ self,
18
+ query: str,
19
+ background_context: str = "",
20
+ max_iterations: int | None = None,
21
+ max_time_minutes: float | None = None,
22
+ token_budget: int | None = None
23
+ ) -> AsyncGenerator[AgentEvent, None]
24
+ ```
25
+
26
+ Runs iterative research flow.
27
+
28
+ **Parameters**:
29
+ - `query`: Research query string
30
+ - `background_context`: Background context (default: "")
31
+ - `max_iterations`: Maximum iterations (default: from settings)
32
+ - `max_time_minutes`: Maximum time in minutes (default: from settings)
33
+ - `token_budget`: Token budget (default: from settings)
34
+
35
+ **Yields**: `AgentEvent` objects for:
36
+ - `started`: Research started
37
+ - `search_complete`: Search completed
38
+ - `judge_complete`: Evidence evaluation completed
39
+ - `synthesizing`: Generating report
40
+ - `complete`: Research completed
41
+ - `error`: Error occurred
42
+
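
Because `run` is an async generator, callers consume it with `async for`; a sketch (the flow's constructor is not documented on this page, so an already-built instance is assumed):

```python
from src.orchestrator.research_flow import IterativeResearchFlow
from src.utils.models import AgentEvent


async def consume(flow: IterativeResearchFlow) -> AgentEvent | None:
    last_event: AgentEvent | None = None
    async for event in flow.run(
        query="Does metformin reduce cancer incidence?",
        max_iterations=5,
        max_time_minutes=10.0,
    ):
        print(event.type, event.iteration, event.data)
        last_event = event
    return last_event  # typically a "complete" or "error" event
```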
43
+ ## DeepResearchFlow
44
+
45
+ **Module**: `src.orchestrator.research_flow`
46
+
47
+ **Purpose**: Multi-section parallel research with planning and synthesis.
48
+
49
+ ### Methods
50
+
51
+ #### `run`
52
+
53
+ ```python
54
+ async def run(
55
+ self,
56
+ query: str,
57
+ background_context: str = "",
58
+ max_iterations_per_section: int | None = None,
59
+ max_time_minutes: float | None = None,
60
+ token_budget: int | None = None
61
+ ) -> AsyncGenerator[AgentEvent, None]
62
+ ```
63
+
64
+ Runs deep research flow.
65
+
66
+ **Parameters**:
67
+ - `query`: Research query string
68
+ - `background_context`: Background context (default: "")
69
+ - `max_iterations_per_section`: Maximum iterations per section (default: from settings)
70
+ - `max_time_minutes`: Maximum time in minutes (default: from settings)
71
+ - `token_budget`: Token budget (default: from settings)
72
+
73
+ **Yields**: `AgentEvent` objects for:
74
+ - `started`: Research started
75
+ - `planning`: Creating research plan
76
+ - `looping`: Running parallel research loops
77
+ - `synthesizing`: Synthesizing results
78
+ - `complete`: Research completed
79
+ - `error`: Error occurred
80
+
81
+ ## GraphOrchestrator
82
+
83
+ **Module**: `src.orchestrator.graph_orchestrator`
84
+
85
+ **Purpose**: Graph-based execution using Pydantic AI agents as nodes.
86
+
87
+ ### Methods
88
+
89
+ #### `run`
90
+
91
+ ```python
92
+ async def run(
93
+ self,
94
+ query: str,
95
+ research_mode: str = "auto",
96
+ use_graph: bool = True
97
+ ) -> AsyncGenerator[AgentEvent, None]
98
+ ```
99
+
100
+ Runs graph-based research orchestration.
101
+
102
+ **Parameters**:
103
+ - `query`: Research query string
104
+ - `research_mode`: Research mode ("iterative", "deep", or "auto")
105
+ - `use_graph`: Whether to use graph execution (default: True)
106
+
107
+ **Yields**: `AgentEvent` objects during graph execution.
108
+
109
+ ## Orchestrator Factory
110
+
111
+ **Module**: `src.orchestrator_factory`
112
+
113
+ **Purpose**: Factory for creating orchestrators.
114
+
115
+ ### Functions
116
+
117
+ #### `create_orchestrator`
118
+
119
+ ```python
120
+ def create_orchestrator(
121
+ search_handler: SearchHandlerProtocol,
122
+ judge_handler: JudgeHandlerProtocol,
123
+ config: dict[str, Any],
124
+ mode: str | None = None
125
+ ) -> Any
126
+ ```
127
+
128
+ Creates an orchestrator instance.
129
+
130
+ **Parameters**:
131
+ - `search_handler`: Search handler protocol implementation
132
+ - `judge_handler`: Judge handler protocol implementation
133
+ - `config`: Configuration dictionary
134
+ - `mode`: Orchestrator mode ("simple", "advanced", "magentic", or None for auto-detect)
135
+
136
+ **Returns**: Orchestrator instance.
137
+
138
+ **Raises**:
139
+ - `ValueError`: If requirements not met
140
+
141
+ **Modes**:
142
+ - `"simple"`: Legacy orchestrator
143
+ - `"advanced"` or `"magentic"`: Magentic orchestrator (requires OpenAI API key)
144
+ - `None`: Auto-detect based on API key availability
145
+
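
A wiring sketch; constructing the search and judge handlers is out of scope here, so they are passed in as already-built protocol implementations (the config keys shown are illustrative):

```python
from typing import Any

from src.orchestrator_factory import create_orchestrator


def build_orchestrator(search_handler: Any, judge_handler: Any) -> Any:
    """Create an orchestrator; the handlers must satisfy the protocols above."""
    return create_orchestrator(
        search_handler=search_handler,
        judge_handler=judge_handler,
        config={"max_iterations": 5},  # illustrative config keys
        mode=None,  # auto-detect: magentic when an OpenAI API key is available
    )
```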
146
+ ## MagenticOrchestrator
147
+
148
+ **Module**: `src.orchestrator_magentic`
149
+
150
+ **Purpose**: Multi-agent coordination using Microsoft Agent Framework.
151
+
152
+ ### Methods
153
+
154
+ #### `run`
155
+
156
+ ```python
157
+ async def run(
158
+ self,
159
+ query: str,
160
+ max_rounds: int = 15,
161
+ max_stalls: int = 3
162
+ ) -> AsyncGenerator[AgentEvent, None]
163
+ ```
164
+
165
+ Runs Magentic orchestration.
166
+
167
+ **Parameters**:
168
+ - `query`: Research query string
169
+ - `max_rounds`: Maximum rounds (default: 15)
170
+ - `max_stalls`: Maximum stalls before reset (default: 3)
171
+
172
+ **Yields**: `AgentEvent` objects converted from Magentic events.
173
+
174
+ **Requirements**:
175
+ - `agent-framework-core` package
176
+ - OpenAI API key
177
+
178
+ ## See Also
179
+
180
+ - [Architecture - Orchestrators](../architecture/orchestrators.md) - Architecture overview
181
+ - [Graph Orchestration](../architecture/graph-orchestration.md) - Graph execution details
182
+
docs/api/services.md ADDED
@@ -0,0 +1,201 @@
1
+ # Services API Reference
2
+
3
+ This page documents the API for DeepCritical services.
4
+
5
+ ## EmbeddingService
6
+
7
+ **Module**: `src.services.embeddings`
8
+
9
+ **Purpose**: Local sentence-transformers for semantic search and deduplication.
10
+
11
+ ### Methods
12
+
13
+ #### `embed`
14
+
15
+ ```python
16
+ async def embed(self, text: str) -> list[float]
17
+ ```
18
+
19
+ Generates embedding for a text string.
20
+
21
+ **Parameters**:
22
+ - `text`: Text to embed
23
+
24
+ **Returns**: Embedding vector as list of floats.
25
+
26
+ #### `embed_batch`
27
+
28
+ ```python
29
+ async def embed_batch(self, texts: list[str]) -> list[list[float]]
30
+ ```
31
+
32
+ Generates embeddings for multiple texts.
33
+
34
+ **Parameters**:
35
+ - `texts`: List of texts to embed
36
+
37
+ **Returns**: List of embedding vectors.
38
+
39
+ #### `similarity`
40
+
41
+ ```python
42
+ async def similarity(self, text1: str, text2: str) -> float
43
+ ```
44
+
45
+ Calculates similarity between two texts.
46
+
47
+ **Parameters**:
48
+ - `text1`: First text
49
+ - `text2`: Second text
50
+
51
+ **Returns**: Similarity score (0.0-1.0).
52
+
53
+ #### `find_duplicates`
54
+
55
+ ```python
56
+ async def find_duplicates(
57
+ self,
58
+ texts: list[str],
59
+ threshold: float = 0.85
60
+ ) -> list[tuple[int, int]]
61
+ ```
62
+
63
+ Finds duplicate texts based on similarity threshold.
64
+
65
+ **Parameters**:
66
+ - `texts`: List of texts to check
67
+ - `threshold`: Similarity threshold (default: 0.85)
68
+
69
+ **Returns**: List of (index1, index2) tuples for duplicate pairs.
70
+
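
A usage sketch mirroring the architecture docs; `get_embedding_service()` returns the cached singleton:

```python
import asyncio

from src.services.embeddings import get_embedding_service


async def main() -> None:
    service = get_embedding_service()  # cached singleton, no API key required
    score = await service.similarity(
        "Metformin lowers cancer risk.",
        "Cancer incidence is reduced by metformin.",
    )
    pairs = await service.find_duplicates(
        ["text a", "text a (near copy)", "unrelated text"],
        threshold=0.85,
    )
    print(score, pairs)


asyncio.run(main())
```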
71
+ ### Factory Function
72
+
73
+ #### `get_embedding_service`
74
+
75
+ ```python
76
+ @lru_cache(maxsize=1)
77
+ def get_embedding_service() -> EmbeddingService
78
+ ```
79
+
80
+ Returns the singleton `EmbeddingService` instance.
81
+
82
+ ## LlamaIndexRAGService
83
+
84
+ **Module**: `src.services.rag`
85
+
86
+ **Purpose**: Retrieval-Augmented Generation using LlamaIndex.
87
+
88
+ ### Methods
89
+
90
+ #### `ingest_evidence`
91
+
92
+ ```python
93
+ async def ingest_evidence(self, evidence: list[Evidence]) -> None
94
+ ```
95
+
96
+ Ingests evidence into RAG service.
97
+
98
+ **Parameters**:
99
+ - `evidence`: List of Evidence objects to ingest
100
+
101
+ **Note**: Requires OpenAI API key for embeddings.
102
+
103
+ #### `retrieve`
104
+
105
+ ```python
106
+ async def retrieve(
107
+ self,
108
+ query: str,
109
+ top_k: int = 5
110
+ ) -> list[Document]
111
+ ```
112
+
113
+ Retrieves relevant documents for a query.
114
+
115
+ **Parameters**:
116
+ - `query`: Search query string
117
+ - `top_k`: Number of top results to return (default: 5)
118
+
119
+ **Returns**: List of Document objects with metadata.
120
+
121
+ #### `query`
122
+
123
+ ```python
124
+ async def query(
125
+ self,
126
+ query: str,
127
+ top_k: int = 5
128
+ ) -> str
129
+ ```
130
+
131
+ Queries RAG service and returns formatted results.
132
+
133
+ **Parameters**:
134
+ - `query`: Search query string
135
+ - `top_k`: Number of top results to return (default: 5)
136
+
137
+ **Returns**: Formatted query results as string.
138
+
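
An ingest-then-query sketch that also handles the missing-OpenAI-key case covered by the factory function below:

```python
from src.services.rag import get_rag_service
from src.utils.models import Evidence


async def rag_roundtrip(evidence: list[Evidence]) -> str | None:
    service = get_rag_service()  # None when no OpenAI API key is configured
    if service is None:
        return None
    await service.ingest_evidence(evidence)
    return await service.query("metformin and cancer incidence", top_k=5)
```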
139
+ ### Factory Function
140
+
141
+ #### `get_rag_service`
142
+
143
+ ```python
144
+ @lru_cache(maxsize=1)
145
+ def get_rag_service() -> LlamaIndexRAGService | None
146
+ ```
147
+
148
+ Returns the singleton `LlamaIndexRAGService` instance, or `None` if no OpenAI API key is available.
149
+
150
+ ## StatisticalAnalyzer
151
+
152
+ **Module**: `src.services.statistical_analyzer`
153
+
154
+ **Purpose**: Secure execution of AI-generated statistical code.
155
+
156
+ ### Methods
157
+
158
+ #### `analyze`
159
+
160
+ ```python
161
+ async def analyze(
162
+ self,
163
+ hypothesis: str,
164
+ evidence: list[Evidence],
165
+ data_description: str | None = None
166
+ ) -> AnalysisResult
167
+ ```
168
+
169
+ Analyzes a hypothesis using statistical methods.
170
+
171
+ **Parameters**:
172
+ - `hypothesis`: Hypothesis to analyze
173
+ - `evidence`: List of Evidence objects
174
+ - `data_description`: Optional data description
175
+
176
+ **Returns**: `AnalysisResult` with:
177
+ - `verdict`: SUPPORTED, REFUTED, or INCONCLUSIVE
178
+ - `code`: Generated analysis code
179
+ - `output`: Execution output
180
+ - `error`: Error message if execution failed
181
+
182
+ **Note**: Requires Modal credentials for sandbox execution.
183
+
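
A usage sketch, following the example in the architecture docs (the hypothesis string is illustrative):

```python
from src.services.statistical_analyzer import StatisticalAnalyzer
from src.utils.models import Evidence


async def run_analysis(evidence: list[Evidence]) -> None:
    analyzer = StatisticalAnalyzer()
    result = await analyzer.analyze(
        hypothesis="Metformin reduces cancer risk",
        evidence=evidence,
    )
    print(result.verdict)  # SUPPORTED, REFUTED, or INCONCLUSIVE
    if result.error:
        print("Sandbox execution failed:", result.error)
```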
184
+ ## See Also
185
+
186
+ - [Architecture - Services](../architecture/services.md) - Architecture overview
187
+ - [Configuration](../configuration/index.md) - Service configuration
188
+
docs/api/tools.md ADDED
@@ -0,0 +1,235 @@
1
+ # Tools API Reference
2
+
3
+ This page documents the API for DeepCritical search tools.
4
+
5
+ ## SearchTool Protocol
6
+
7
+ All tools implement the `SearchTool` protocol:
8
+
9
+ ```python
10
+ class SearchTool(Protocol):
11
+ @property
12
+ def name(self) -> str: ...
13
+
14
+ async def search(
15
+ self,
16
+ query: str,
17
+ max_results: int = 10
18
+ ) -> list[Evidence]: ...
19
+ ```
20
+
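
Because the protocol is structural, any class exposing a `name` property and a matching `search` coroutine satisfies it; a minimal non-network sketch:

```python
from src.utils.models import Citation, Evidence


class StaticSearchTool:
    """Toy tool that satisfies the SearchTool protocol without making network calls."""

    @property
    def name(self) -> str:
        return "static"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        citation = Citation(title=f"Stub result for {query}", url="https://example.org/stub")
        stub = Evidence(citation=citation, content="No real content.", relevance_score=0.1)
        return [stub][:max_results]
```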
21
+ ## PubMedTool
22
+
23
+ **Module**: `src.tools.pubmed`
24
+
25
+ **Purpose**: Search peer-reviewed biomedical literature from PubMed.
26
+
27
+ ### Properties
28
+
29
+ #### `name`
30
+
31
+ ```python
32
+ @property
33
+ def name(self) -> str
34
+ ```
35
+
36
+ Returns tool name: `"pubmed"`
37
+
38
+ ### Methods
39
+
40
+ #### `search`
41
+
42
+ ```python
43
+ async def search(
44
+ self,
45
+ query: str,
46
+ max_results: int = 10
47
+ ) -> list[Evidence]
48
+ ```
49
+
50
+ Searches PubMed for articles.
51
+
52
+ **Parameters**:
53
+ - `query`: Search query string
54
+ - `max_results`: Maximum number of results to return (default: 10)
55
+
56
+ **Returns**: List of `Evidence` objects with PubMed articles.
57
+
58
+ **Raises**:
59
+ - `SearchError`: If search fails
60
+ - `RateLimitError`: If rate limit is exceeded
61
+
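
A usage sketch (the no-argument constructor mirrors the tool-registration example in the architecture docs; the exception imports follow `src/utils/exceptions.py`):

```python
import asyncio

from src.tools.pubmed import PubMedTool
from src.utils.exceptions import RateLimitError, SearchError


async def main() -> None:
    tool = PubMedTool()
    try:
        results = await tool.search("metformin cancer incidence", max_results=5)
    except (RateLimitError, SearchError) as exc:
        print("PubMed search failed:", exc)
        return
    for item in results:
        print(item.citation.title, item.citation.url)


asyncio.run(main())
```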
62
+ ## ClinicalTrialsTool
63
+
64
+ **Module**: `src.tools.clinicaltrials`
65
+
66
+ **Purpose**: Search ClinicalTrials.gov for interventional studies.
67
+
68
+ ### Properties
69
+
70
+ #### `name`
71
+
72
+ ```python
73
+ @property
74
+ def name(self) -> str
75
+ ```
76
+
77
+ Returns tool name: `"clinicaltrials"`
78
+
79
+ ### Methods
80
+
81
+ #### `search`
82
+
83
+ ```python
84
+ async def search(
85
+ self,
86
+ query: str,
87
+ max_results: int = 10
88
+ ) -> list[Evidence]
89
+ ```
90
+
91
+ Searches ClinicalTrials.gov for trials.
92
+
93
+ **Parameters**:
94
+ - `query`: Search query string
95
+ - `max_results`: Maximum number of results to return (default: 10)
96
+
97
+ **Returns**: List of `Evidence` objects with clinical trials.
98
+
99
+ **Note**: Only interventional studies are returned, limited to the statuses COMPLETED, ACTIVE_NOT_RECRUITING, RECRUITING, and ENROLLING_BY_INVITATION.
100
+
101
+ **Raises**:
102
+ - `SearchError`: If search fails
103
+
104
+ ## EuropePMCTool
105
+
106
+ **Module**: `src.tools.europepmc`
107
+
108
+ **Purpose**: Search Europe PMC for preprints and peer-reviewed articles.
109
+
110
+ ### Properties
111
+
112
+ #### `name`
113
+
114
+ ```python
115
+ @property
116
+ def name(self) -> str
117
+ ```
118
+
119
+ Returns tool name: `"europepmc"`
120
+
121
+ ### Methods
122
+
123
+ #### `search`
124
+
125
+ ```python
126
+ async def search(
127
+ self,
128
+ query: str,
129
+ max_results: int = 10
130
+ ) -> list[Evidence]
131
+ ```
132
+
133
+ Searches Europe PMC for articles and preprints.
134
+
135
+ **Parameters**:
136
+ - `query`: Search query string
137
+ - `max_results`: Maximum number of results to return (default: 10)
138
+
139
+ **Returns**: List of `Evidence` objects with articles/preprints.
140
+
141
+ **Note**: Includes both preprints (marked with `[PREPRINT - Not peer-reviewed]`) and peer-reviewed articles.
142
+
143
+ **Raises**:
144
+ - `SearchError`: If search fails
145
+
146
+ ## RAGTool
147
+
148
+ **Module**: `src.tools.rag_tool`
149
+
150
+ **Purpose**: Semantic search within collected evidence.
151
+
152
+ ### Properties
153
+
154
+ #### `name`
155
+
156
+ ```python
157
+ @property
158
+ def name(self) -> str
159
+ ```
160
+
161
+ Returns tool name: `"rag"`
162
+
163
+ ### Methods
164
+
165
+ #### `search`
166
+
167
+ ```python
168
+ async def search(
169
+ self,
170
+ query: str,
171
+ max_results: int = 10
172
+ ) -> list[Evidence]
173
+ ```
174
+
175
+ Searches collected evidence using semantic similarity.
176
+
177
+ **Parameters**:
178
+ - `query`: Search query string
179
+ - `max_results`: Maximum number of results to return (default: 10)
180
+
181
+ **Returns**: List of `Evidence` objects from collected evidence.
182
+
183
+ **Note**: Requires evidence to be ingested into RAG service first.
184
+
185
+ ## SearchHandler
186
+
187
+ **Module**: `src.tools.search_handler`
188
+
189
+ **Purpose**: Orchestrates parallel searches across multiple tools.
190
+
191
+ ### Methods
192
+
193
+ #### `search`
194
+
195
+ ```python
196
+ async def search(
197
+ self,
198
+ query: str,
199
+ tools: list[SearchTool] | None = None,
200
+ max_results_per_tool: int = 10
201
+ ) -> SearchResult
202
+ ```
203
+
204
+ Searches multiple tools in parallel.
205
+
206
+ **Parameters**:
207
+ - `query`: Search query string
208
+ - `tools`: List of tools to use (default: all available tools)
209
+ - `max_results_per_tool`: Maximum results per tool (default: 10)
210
+
211
+ **Returns**: `SearchResult` with:
212
+ - `evidence`: Aggregated list of evidence
213
+ - `tool_results`: Results per tool
214
+ - `total_count`: Total number of results
215
+
216
+ **Note**: Uses `asyncio.gather()` for parallel execution. Handles tool failures gracefully.
217
+
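
A sketch combining the registration example from the architecture docs with a single parallel search:

```python
from src.tools.clinicaltrials import ClinicalTrialsTool
from src.tools.europepmc import EuropePMCTool
from src.tools.pubmed import PubMedTool
from src.tools.search_handler import SearchHandler
from src.utils.models import Evidence


async def gather_evidence(query: str) -> list[Evidence]:
    handler = SearchHandler(
        tools=[PubMedTool(), ClinicalTrialsTool(), EuropePMCTool()],
    )
    result = await handler.search(query, max_results_per_tool=10)
    print("Total results:", result.total_count)
    return result.evidence
```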
218
+ ## See Also
219
+
220
+ - [Architecture - Tools](../architecture/tools.md) - Architecture overview
221
+ - [Models API](models.md) - Data models used by tools
222
+
docs/architecture/agents.md ADDED
@@ -0,0 +1,192 @@
1
+ # Agents Architecture
2
+
3
+ DeepCritical uses Pydantic AI agents for all AI-powered operations. All agents follow a consistent pattern and use structured output types.
4
+
5
+ ## Agent Pattern
6
+
7
+ All agents use the Pydantic AI `Agent` class with the following structure:
8
+
9
+ - **System Prompt**: Module-level constant with date injection
10
+ - **Agent Class**: `__init__(model: Any | None = None)`
11
+ - **Main Method**: Async method (e.g., `async def evaluate()`, `async def write_report()`)
12
+ - **Factory Function**: `def create_agent_name(model: Any | None = None) -> AgentName`
13
+
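
Putting these conventions together, a skeleton agent might look like the following. This is a sketch only: the class, prompt, and factory names are invented for illustration, and the Pydantic AI calls follow the `output_type` convention referenced in the contributing docs (attribute names can differ between library versions).

```python
from datetime import date
from typing import Any

from pydantic_ai import Agent

from src.agent_factory.judges import get_model
from src.utils.models import KnowledgeGapOutput

# System prompt is a module-level constant with the current date injected.
SYSTEM_PROMPT = f"You evaluate research completeness. Today is {date.today().isoformat()}."


class ExampleGapAgent:
    def __init__(self, model: Any | None = None) -> None:
        self._agent = Agent(
            model or get_model(),
            output_type=KnowledgeGapOutput,
            system_prompt=SYSTEM_PROMPT,
        )

    async def evaluate(self, query: str) -> KnowledgeGapOutput:
        result = await self._agent.run(query)
        return result.output


def create_example_gap_agent(model: Any | None = None) -> ExampleGapAgent:
    return ExampleGapAgent(model)
```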
14
+ ## Model Initialization
15
+
16
+ Agents use `get_model()` from `src/agent_factory/judges.py` if no model is provided. This supports:
17
+
18
+ - OpenAI models
19
+ - Anthropic models
20
+ - HuggingFace Inference API models
21
+
22
+ The model selection is based on the configured `LLM_PROVIDER` in settings.
23
+
24
+ ## Error Handling
25
+
26
+ Agents return fallback values on failure rather than raising exceptions:
27
+
28
+ - `KnowledgeGapOutput(research_complete=False, outstanding_gaps=[...])`
29
+ - Empty strings for text outputs
30
+ - Default structured outputs
31
+
32
+ All errors are logged with context using structlog.
33
+
34
+ ## Input Validation
35
+
36
+ All agents validate inputs:
37
+
38
+ - Check that queries/inputs are not empty
39
+ - Truncate very long inputs with warnings
40
+ - Handle None values gracefully
41
+
42
+ ## Output Types
43
+
44
+ Agents use structured output types from `src/utils/models.py`:
45
+
46
+ - `KnowledgeGapOutput`: Research completeness evaluation
47
+ - `AgentSelectionPlan`: Tool selection plan
48
+ - `ReportDraft`: Long-form report structure
49
+ - `ParsedQuery`: Query parsing and mode detection
50
+
51
+ For text output (writer agents), agents return `str` directly.
52
+
53
+ ## Agent Types
54
+
55
+ ### Knowledge Gap Agent
56
+
57
+ **File**: `src/agents/knowledge_gap.py`
58
+
59
+ **Purpose**: Evaluates research state and identifies knowledge gaps.
60
+
61
+ **Output**: `KnowledgeGapOutput` with:
62
+ - `research_complete`: Boolean indicating if research is complete
63
+ - `outstanding_gaps`: List of remaining knowledge gaps
64
+
65
+ **Methods**:
66
+ - `async def evaluate(query, background_context, conversation_history, iteration, time_elapsed_minutes, max_time_minutes) -> KnowledgeGapOutput`
67
+
68
+ ### Tool Selector Agent
69
+
70
+ **File**: `src/agents/tool_selector.py`
71
+
72
+ **Purpose**: Selects appropriate tools for addressing knowledge gaps.
73
+
74
+ **Output**: `AgentSelectionPlan` with list of `AgentTask` objects.
75
+
76
+ **Available Agents**:
77
+ - `WebSearchAgent`: General web search for fresh information
78
+ - `SiteCrawlerAgent`: Research specific entities/companies
79
+ - `RAGAgent`: Semantic search within collected evidence
80
+
81
+ ### Writer Agent
82
+
83
+ **File**: `src/agents/writer.py`
84
+
85
+ **Purpose**: Generates final reports from research findings.
86
+
87
+ **Output**: Markdown string with numbered citations.
88
+
89
+ **Methods**:
90
+ - `async def write_report(query, findings, output_length, output_instructions) -> str`
91
+
92
+ **Features**:
93
+ - Validates inputs
94
+ - Truncates very long findings (max 50000 chars) with warning
95
+ - Retry logic for transient failures (3 retries)
96
+ - Citation validation before returning
97
+
98
+ ### Long Writer Agent
99
+
100
+ **File**: `src/agents/long_writer.py`
101
+
102
+ **Purpose**: Long-form report generation with section-by-section writing.
103
+
104
+ **Input/Output**: Uses `ReportDraft` models.
105
+
106
+ **Methods**:
107
+ - `async def write_next_section(query, draft, section_title, section_content) -> LongWriterOutput`
108
+ - `async def write_report(query, report_title, report_draft) -> str`
109
+
110
+ **Features**:
111
+ - Writes sections iteratively
112
+ - Aggregates references across sections
113
+ - Reformats section headings and references
114
+ - Deduplicates and renumbers references
115
+
116
+ ### Proofreader Agent
117
+
118
+ **File**: `src/agents/proofreader.py`
119
+
120
+ **Purpose**: Proofreads and polishes report drafts.
121
+
122
+ **Input**: `ReportDraft`
123
+ **Output**: Polished markdown string
124
+
125
+ **Methods**:
126
+ - `async def proofread(query, report_title, report_draft) -> str`
127
+
128
+ **Features**:
129
+ - Removes duplicate content across sections
130
+ - Adds executive summary if multiple sections
131
+ - Preserves all references and citations
132
+ - Improves flow and readability
133
+
134
+ ### Thinking Agent
135
+
136
+ **File**: `src/agents/thinking.py`
137
+
138
+ **Purpose**: Generates observations from conversation history.
139
+
140
+ **Output**: Observation string
141
+
142
+ **Methods**:
143
+ - `async def generate_observations(query, background_context, conversation_history) -> str`
144
+
145
+ ### Input Parser Agent
146
+
147
+ **File**: `src/agents/input_parser.py`
148
+
149
+ **Purpose**: Parses and improves user queries, detects research mode.
150
+
151
+ **Output**: `ParsedQuery` with:
152
+ - `original_query`: Original query string
153
+ - `improved_query`: Refined query string
154
+ - `research_mode`: "iterative" or "deep"
155
+ - `key_entities`: List of key entities
156
+ - `research_questions`: List of research questions
157
+
158
+ ## Factory Functions
159
+
160
+ All agents have factory functions in `src/agent_factory/agents.py`:
161
+
162
+ ```python
163
+ def create_knowledge_gap_agent(model: Any | None = None) -> KnowledgeGapAgent
164
+ def create_tool_selector_agent(model: Any | None = None) -> ToolSelectorAgent
165
+ def create_writer_agent(model: Any | None = None) -> WriterAgent
166
+ # ... etc
167
+ ```
168
+
169
+ Factory functions:
170
+ - Use `get_model()` if no model provided
171
+ - Raise `ConfigurationError` if creation fails
172
+ - Log agent creation
173
+
174
+ ## See Also
175
+
176
+ - [Orchestrators](orchestrators.md) - How agents are orchestrated
177
+ - [API Reference - Agents](../api/agents.md) - API documentation
178
+ - [Contributing - Code Style](../contributing/code-style.md) - Development guidelines
179
+
docs/architecture/middleware.md ADDED
@@ -0,0 +1,142 @@
1
+ # Middleware Architecture
2
+
3
+ DeepCritical uses middleware for state management, budget tracking, and workflow coordination.
4
+
5
+ ## State Management
6
+
7
+ ### WorkflowState
8
+
9
+ **File**: `src/middleware/state_machine.py`
10
+
11
+ **Purpose**: Thread-safe state management for research workflows
12
+
13
+ **Implementation**: Uses `ContextVar` for thread-safe isolation
14
+
15
+ **State Components**:
16
+ - `evidence: list[Evidence]`: Collected evidence from searches
17
+ - `conversation: Conversation`: Iteration history (gaps, tool calls, findings, thoughts)
18
+ - `embedding_service: Any`: Embedding service for semantic search
19
+
20
+ **Methods**:
21
+ - `add_evidence(evidence: Evidence)`: Adds evidence with URL-based deduplication
22
+ - `async search_related(query: str, top_k: int = 5) -> list[Evidence]`: Semantic search
23
+
24
+ **Initialization**:
25
+ ```python
26
+ from src.middleware.state_machine import init_workflow_state
27
+
28
+ init_workflow_state(embedding_service)
29
+ ```
30
+
31
+ **Access**:
32
+ ```python
33
+ from src.middleware.state_machine import get_workflow_state
34
+
35
+ state = get_workflow_state() # Auto-initializes if missing
36
+ ```
37
+
38
+ ## Workflow Manager
39
+
40
+ **File**: `src/middleware/workflow_manager.py`
41
+
42
+ **Purpose**: Coordinates parallel research loops
43
+
44
+ **Methods**:
45
+ - `add_loop(loop: ResearchLoop)`: Add a research loop to manage
46
+ - `async run_loops_parallel() -> list[ResearchLoop]`: Run all loops in parallel
47
+ - `update_loop_status(loop_id: str, status: str)`: Update loop status
48
+ - `sync_loop_evidence_to_state()`: Synchronize evidence from loops to global state
49
+
50
+ **Features**:
51
+ - Uses `asyncio.gather()` for parallel execution
52
+ - Handles errors per loop (doesn't fail all if one fails)
53
+ - Tracks loop status: `pending`, `running`, `completed`, `failed`, `cancelled`
54
+ - Evidence deduplication across parallel loops
55
+
56
+ **Usage**:
57
+ ```python
58
+ from src.middleware.workflow_manager import WorkflowManager
59
+
60
+ manager = WorkflowManager()
61
+ manager.add_loop(loop1)
62
+ manager.add_loop(loop2)
63
+ completed_loops = await manager.run_loops_parallel()
64
+ ```
65
+
66
+ ## Budget Tracker
67
+
68
+ **File**: `src/middleware/budget_tracker.py`
69
+
70
+ **Purpose**: Tracks and enforces resource limits
71
+
72
+ **Budget Components**:
73
+ - **Tokens**: LLM token usage
74
+ - **Time**: Elapsed time in seconds
75
+ - **Iterations**: Number of iterations
76
+
77
+ **Methods**:
78
+ - `create_budget(token_limit, time_limit_seconds, iterations_limit) -> BudgetStatus`
79
+ - `add_tokens(tokens: int)`: Add token usage
80
+ - `start_timer()`: Start time tracking
81
+ - `update_timer()`: Update elapsed time
82
+ - `increment_iteration()`: Increment iteration count
83
+ - `check_budget() -> BudgetStatus`: Check current budget status
84
+ - `can_continue() -> bool`: Check if research can continue
85
+
86
+ **Token Estimation**:
87
+ - `estimate_tokens(text: str) -> int`: ~4 chars per token
88
+ - `estimate_llm_call_tokens(prompt: str, response: str) -> int`: Estimate LLM call tokens
89
+
90
+ **Usage**:
91
+ ```python
92
+ from src.middleware.budget_tracker import BudgetTracker
93
+
94
+ tracker = BudgetTracker()
95
+ budget = tracker.create_budget(
96
+ token_limit=100000,
97
+ time_limit_seconds=600,
98
+ iterations_limit=10
99
+ )
100
+ tracker.start_timer()
101
+ # ... research operations ...
102
+ if not tracker.can_continue():
103
+ # Budget exceeded, stop research
104
+ pass
105
+ ```
106
+
107
+ ## Models
108
+
109
+ All middleware models are defined in `src/utils/models.py`:
110
+
111
+ - `IterationData`: Data for a single iteration
112
+ - `Conversation`: Conversation history with iterations
113
+ - `ResearchLoop`: Research loop state and configuration
114
+ - `BudgetStatus`: Current budget status
115
+
116
+ ## Thread Safety
117
+
118
+ All middleware components use `ContextVar` for thread-safe isolation:
119
+
120
+ - Each request/thread has its own workflow state
121
+ - No global mutable state
122
+ - Safe for concurrent requests
123
+
124
+ ## See Also
125
+
126
+ - [Orchestrators](orchestrators.md) - How middleware is used in orchestration
127
+ - [API Reference - Orchestrators](../api/orchestrators.md) - API documentation
128
+ - [Contributing - Code Style](../contributing/code-style.md) - Development guidelines
129
+
docs/architecture/services.md ADDED
@@ -0,0 +1,142 @@
1
+ # Services Architecture
2
+
3
+ DeepCritical provides several services for embeddings, RAG, and statistical analysis.
4
+
5
+ ## Embedding Service
6
+
7
+ **File**: `src/services/embeddings.py`
8
+
9
+ **Purpose**: Local sentence-transformers for semantic search and deduplication
10
+
11
+ **Features**:
12
+ - **No API Key Required**: Uses local sentence-transformers models
13
+ - **Async-Safe**: All operations use `run_in_executor()` to avoid blocking
14
+ - **ChromaDB Storage**: Vector storage for embeddings
15
+ - **Deduplication**: 0.85 similarity threshold (pairs at or above 85% similarity are treated as duplicates)
16
+
17
+ **Model**: Configurable via `settings.local_embedding_model` (default: `all-MiniLM-L6-v2`)
18
+
19
+ **Methods**:
20
+ - `async def embed(text: str) -> list[float]`: Generate embeddings
21
+ - `async def embed_batch(texts: list[str]) -> list[list[float]]`: Batch embedding
22
+ - `async def similarity(text1: str, text2: str) -> float`: Calculate similarity
23
+ - `async def find_duplicates(texts: list[str], threshold: float = 0.85) -> list[tuple[int, int]]`: Find duplicates
24
+
25
+ **Usage**:
26
+ ```python
27
+ from src.services.embeddings import get_embedding_service
28
+
29
+ service = get_embedding_service()
30
+ embedding = await service.embed("text to embed")
31
+ ```
32
+
33
+ ## LlamaIndex RAG Service
34
+
35
+ **File**: `src/services/rag.py`
36
+
37
+ **Purpose**: Retrieval-Augmented Generation using LlamaIndex
38
+
39
+ **Features**:
40
+ - **OpenAI Embeddings**: Requires `OPENAI_API_KEY`
41
+ - **ChromaDB Storage**: Vector database for document storage
42
+ - **Metadata Preservation**: Preserves source, title, URL, date, authors
43
+ - **Lazy Initialization**: Graceful fallback if OpenAI key not available
44
+
45
+ **Methods**:
46
+ - `async def ingest_evidence(evidence: list[Evidence]) -> None`: Ingest evidence into RAG
47
+ - `async def retrieve(query: str, top_k: int = 5) -> list[Document]`: Retrieve relevant documents
48
+ - `async def query(query: str, top_k: int = 5) -> str`: Query with RAG
49
+
50
+ **Usage**:
51
+ ```python
52
+ from src.services.rag import get_rag_service
53
+
54
+ service = get_rag_service()
55
+ if service:
56
+ documents = await service.retrieve("query", top_k=5)
57
+ ```
58
+
59
+ ## Statistical Analyzer
60
+
61
+ **File**: `src/services/statistical_analyzer.py`
62
+
63
+ **Purpose**: Secure execution of AI-generated statistical code
64
+
65
+ **Features**:
66
+ - **Modal Sandbox**: Secure, isolated execution environment
67
+ - **Code Generation**: Generates Python code via LLM
68
+ - **Library Pinning**: Version-pinned libraries in `SANDBOX_LIBRARIES`
69
+ - **Network Isolation**: `block_network=True` by default
70
+
71
+ **Libraries Available**:
72
+ - pandas, numpy, scipy
73
+ - matplotlib, scikit-learn
74
+ - statsmodels
75
+
76
+ **Output**: `AnalysisResult` with:
77
+ - `verdict`: SUPPORTED, REFUTED, or INCONCLUSIVE
78
+ - `code`: Generated analysis code
79
+ - `output`: Execution output
80
+ - `error`: Error message if execution failed
81
+
82
+ **Usage**:
83
+ ```python
84
+ from src.services.statistical_analyzer import StatisticalAnalyzer
85
+
86
+ analyzer = StatisticalAnalyzer()
87
+ result = await analyzer.analyze(
88
+ hypothesis="Metformin reduces cancer risk",
89
+ evidence=evidence_list
90
+ )
91
+ ```
92
+
93
+ ## Singleton Pattern
94
+
95
+ All services use the singleton pattern with `@lru_cache(maxsize=1)`:
96
+
97
+ ```python
98
+ @lru_cache(maxsize=1)
99
+ def get_embedding_service() -> EmbeddingService:
100
+ return EmbeddingService()
101
+ ```
102
+
103
+ This ensures:
104
+ - Single instance per process
105
+ - Lazy initialization
106
+ - No dependencies required at import time
107
+
108
+ ## Service Availability
109
+
110
+ Services check availability before use:
111
+
112
+ ```python
113
+ from src.utils.config import settings
114
+
115
+ if settings.modal_available:
116
+ # Use Modal sandbox
117
+ pass
118
+
119
+ if settings.has_openai_key:
120
+ # Use OpenAI embeddings for RAG
121
+ pass
122
+ ```
123
+
124
+ ## See Also
125
+
126
+ - [Tools](tools.md) - How services are used by search tools
127
+ - [API Reference - Services](../api/services.md) - API documentation
128
+ - [Configuration](../configuration/index.md) - Service configuration
129
+
docs/architecture/tools.md ADDED
@@ -0,0 +1,175 @@
1
+ # Tools Architecture
2
+
3
+ DeepCritical implements a protocol-based search tool system for retrieving evidence from multiple sources.
4
+
5
+ ## SearchTool Protocol
6
+
7
+ All tools implement the `SearchTool` protocol from `src/tools/base.py`:
8
+
9
+ ```python
10
+ class SearchTool(Protocol):
11
+ @property
12
+ def name(self) -> str: ...
13
+
14
+ async def search(
15
+ self,
16
+ query: str,
17
+ max_results: int = 10
18
+ ) -> list[Evidence]: ...
19
+ ```
20
+
21
+ ## Rate Limiting
22
+
23
+ All tools use the `@retry` decorator from tenacity:
24
+
25
+ ```python
26
+ @retry(
27
+ stop=stop_after_attempt(3),
28
+ wait=wait_exponential(...)
29
+ )
30
+ async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
31
+ # Implementation
32
+ ```
33
+
34
+ Tools with API rate limits implement `_rate_limit()` method and use shared rate limiters from `src/tools/rate_limiter.py`.
35
+
36
+ ## Error Handling
37
+
38
+ Tools raise custom exceptions:
39
+
40
+ - `SearchError`: General search failures
41
+ - `RateLimitError`: Rate limit exceeded
42
+
43
+ Tools handle HTTP errors (429, 500, timeout) and return empty lists on non-critical errors (with warning logs).
44
+
45
+ ## Query Preprocessing
46
+
47
+ Tools use `preprocess_query()` from `src/tools/query_utils.py` to:
48
+
49
+ - Remove noise from queries
50
+ - Expand synonyms
51
+ - Normalize query format
52
+
53
+ ## Evidence Conversion
54
+
55
+ All tools convert API responses to `Evidence` objects with:
56
+
57
+ - `Citation`: Title, URL, date, authors
58
+ - `content`: Evidence text
59
+ - `relevance_score`: 0.0-1.0 relevance score
60
+ - `metadata`: Additional metadata
61
+
62
+ Missing fields are handled gracefully with defaults.
63
+
64
+ ## Tool Implementations
65
+
66
+ ### PubMed Tool
67
+
68
+ **File**: `src/tools/pubmed.py`
69
+
70
+ **API**: NCBI E-utilities (ESearch → EFetch)
71
+
72
+ **Rate Limiting**:
73
+ - 0.34s between requests (3 req/sec without API key)
74
+ - 0.1s between requests (10 req/sec with NCBI API key)
75
+
76
+ **Features**:
77
+ - XML parsing with `xmltodict`
78
+ - Handles single vs. multiple articles
79
+ - Query preprocessing
80
+ - Evidence conversion with metadata extraction
81
+
82
+ ### ClinicalTrials Tool
83
+
84
+ **File**: `src/tools/clinicaltrials.py`
85
+
86
+ **API**: ClinicalTrials.gov API v2
87
+
88
+ **Important**: Uses the `requests` library (NOT httpx) because the ClinicalTrials.gov WAF blocks the httpx TLS fingerprint.
89
+
90
+ **Execution**: Runs in thread pool: `await asyncio.to_thread(requests.get, ...)`
91
+
92
+ **Filtering**:
93
+ - Only interventional studies
94
+ - Status: `COMPLETED`, `ACTIVE_NOT_RECRUITING`, `RECRUITING`, `ENROLLING_BY_INVITATION`
95
+
96
+ **Features**:
97
+ - Parses nested JSON structure
98
+ - Extracts trial metadata
99
+ - Evidence conversion
100
+
101
+ ### Europe PMC Tool
102
+
103
+ **File**: `src/tools/europepmc.py`
104
+
105
+ **API**: Europe PMC REST API
106
+
107
+ **Features**:
108
+ - Handles preprint markers: `[PREPRINT - Not peer-reviewed]`
109
+ - Builds URLs from DOI or PMID
110
+ - Checks `pubTypeList` for preprint detection
111
+ - Includes both preprints and peer-reviewed articles
112
+
113
+ ### RAG Tool
114
+
115
+ **File**: `src/tools/rag_tool.py`
116
+
117
+ **Purpose**: Semantic search within collected evidence
118
+
119
+ **Implementation**: Wraps `LlamaIndexRAGService`
120
+
121
+ **Features**:
122
+ - Returns Evidence from RAG results
123
+ - Handles evidence ingestion
124
+ - Semantic similarity search
125
+ - Metadata preservation
126
+
127
+ ### Search Handler
128
+
129
+ **File**: `src/tools/search_handler.py`
130
+
131
+ **Purpose**: Orchestrates parallel searches across multiple tools
132
+
133
+ **Features**:
134
+ - Uses `asyncio.gather()` with `return_exceptions=True`
135
+ - Aggregates results into `SearchResult`
136
+ - Handles tool failures gracefully
137
+ - Deduplicates results by URL
138
+
139
+ ## Tool Registration
140
+
141
+ Tools are registered in the search handler:
142
+
143
+ ```python
144
+ from src.tools.pubmed import PubMedTool
145
+ from src.tools.clinicaltrials import ClinicalTrialsTool
146
+ from src.tools.europepmc import EuropePMCTool
147
+
148
+ search_handler = SearchHandler(
149
+ tools=[
150
+ PubMedTool(),
151
+ ClinicalTrialsTool(),
152
+ EuropePMCTool(),
153
+ ]
154
+ )
155
+ ```
156
+
157
+ ## See Also
158
+
159
+ - [Services](services.md) - RAG and embedding services
160
+ - [API Reference - Tools](../api/tools.md) - API documentation
161
+ - [Contributing - Implementation Patterns](../contributing/implementation-patterns.md) - Development guidelines
162
+
docs/contributing/code-quality.md ADDED
@@ -0,0 +1,81 @@
1
+ # Code Quality & Documentation
2
+
3
+ This document outlines code quality standards and documentation requirements.
4
+
5
+ ## Linting
6
+
7
+ - Ruff with 100-char line length
8
+ - Ignore rules documented in `pyproject.toml`:
9
+ - `PLR0913`: Too many arguments (agents need many params)
10
+ - `PLR0912`: Too many branches (complex orchestrator logic)
11
+ - `PLR0911`: Too many return statements (complex agent logic)
12
+ - `PLR2004`: Magic values (statistical constants)
13
+ - `PLW0603`: Global statement (singleton pattern)
14
+ - `PLC0415`: Lazy imports for optional dependencies
15
+
16
+ ## Type Checking
17
+
18
+ - `mypy --strict` compliance
19
+ - `ignore_missing_imports = true` (for optional dependencies)
20
+ - Exclude: `reference_repos/`, `examples/`
21
+ - All functions must have complete type annotations
22
+
23
+ ## Pre-commit
24
+
25
+ - Run `make check` before committing
26
+ - Must pass: lint + typecheck + test-cov
27
+ - Pre-commit hooks installed via `make install`
28
+
29
+ ## Documentation
30
+
31
+ ### Docstrings
32
+
33
+ - Google-style docstrings for all public functions
34
+ - Include Args, Returns, Raises sections
35
+ - Use type hints in docstrings only if needed for clarity
36
+
37
+ Example:
38
+
39
+ ```python
40
+ async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
41
+ """Search PubMed and return evidence.
42
+
43
+ Args:
44
+ query: The search query string
45
+ max_results: Maximum number of results to return
46
+
47
+ Returns:
48
+ List of Evidence objects
49
+
50
+ Raises:
51
+ SearchError: If the search fails
52
+ RateLimitError: If we hit rate limits
53
+ """
54
+ ```
55
+
56
+ ### Code Comments
57
+
58
+ - Explain WHY, not WHAT
59
+ - Document non-obvious patterns (e.g., why `requests` not `httpx` for ClinicalTrials)
60
+ - Mark critical sections: `# CRITICAL: ...`
61
+ - Document rate limiting rationale
62
+ - Explain async patterns when non-obvious
63
+
64
+ ## See Also
65
+
66
+ - [Code Style](code-style.md) - Code style guidelines
67
+ - [Testing](testing.md) - Testing guidelines
68
+
docs/contributing/code-style.md ADDED
@@ -0,0 +1,61 @@
1
+ # Code Style & Conventions
2
+
3
+ This document outlines the code style and conventions for DeepCritical.
4
+
5
+ ## Type Safety
6
+
7
+ - **ALWAYS** use type hints for all function parameters and return types
8
+ - Use `mypy --strict` compliance (no `Any` unless absolutely necessary)
9
+ - Use `TYPE_CHECKING` imports for circular dependencies:
10
+
11
+ ```python
12
+ from typing import TYPE_CHECKING
13
+ if TYPE_CHECKING:
14
+ from src.services.embeddings import EmbeddingService
15
+ ```
16
+
17
+ ## Pydantic Models
18
+
19
+ - All data exchange uses Pydantic models (`src/utils/models.py`)
20
+ - Models are frozen (`model_config = {"frozen": True}`) for immutability
21
+ - Use `Field()` with descriptions for all model fields
22
+ - Validate with `ge=`, `le=`, `min_length=`, `max_length=` constraints
23
+
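
A minimal sketch of a model that follows these conventions (field names are illustrative):

```python
from pydantic import BaseModel, Field


class ExampleRecord(BaseModel):
    """Illustrative model following the conventions above."""

    model_config = {"frozen": True}  # immutable after construction

    title: str = Field(description="Human-readable title", min_length=1)
    score: float = Field(description="Normalized relevance score", ge=0.0, le=1.0)
    tags: list[str] = Field(default_factory=list, description="Optional labels")
```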
24
+ ## Async Patterns
25
+
26
+ - **ALL** I/O operations must be async (`async def`, `await`)
27
+ - Use `asyncio.gather()` for parallel operations
28
+ - CPU-bound work (embeddings, parsing) must use `run_in_executor()`:
29
+
30
+ ```python
31
+ loop = asyncio.get_running_loop()
32
+ result = await loop.run_in_executor(None, cpu_bound_function, args)
33
+ ```
34
+
35
+ - Never block the event loop with synchronous I/O
36
+
37
+ ## Common Pitfalls
38
+
39
+ 1. **Blocking the event loop**: Never use sync I/O in async functions
40
+ 2. **Missing type hints**: All functions must have complete type annotations
41
+ 3. **Global mutable state**: Use ContextVar or pass via parameters
42
+ 4. **Import errors**: Lazy-load optional dependencies (magentic, modal, embeddings)
43
+
44
+ ## See Also
45
+
46
+ - [Error Handling](error-handling.md) - Error handling guidelines
47
+ - [Implementation Patterns](implementation-patterns.md) - Common patterns
48
+
49
+
50
+
51
+
52
+
53
+
54
+
55
+
56
+
57
+
58
+
59
+
60
+
61
+
docs/contributing/error-handling.md ADDED
@@ -0,0 +1,69 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Error Handling & Logging
2
+
3
+ This document outlines error handling and logging conventions for DeepCritical.
4
+
5
+ ## Exception Hierarchy
6
+
7
+ Use custom exception hierarchy (`src/utils/exceptions.py`):
8
+
9
+ - `DeepCriticalError` (base)
10
+ - `SearchError` → `RateLimitError`
11
+ - `JudgeError`
12
+ - `ConfigurationError`
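+
+ A minimal sketch of that hierarchy (the authoritative definitions live in `src/utils/exceptions.py`; docstrings here are illustrative):
+
+ ```python
+ class DeepCriticalError(Exception):
+     """Base class for all DeepCritical errors."""
+
+
+ class SearchError(DeepCriticalError):
+     """A search tool failed to return usable results."""
+
+
+ class RateLimitError(SearchError):
+     """An upstream API rejected the request due to rate limiting."""
+
+
+ class JudgeError(DeepCriticalError):
+     """Evidence assessment failed."""
+
+
+ class ConfigurationError(DeepCriticalError):
+     """Required configuration is missing or invalid."""
+ ```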
13
+
14
+ ## Error Handling Rules
15
+
16
+ - Always chain exceptions: `raise SearchError(...) from e`
17
+ - Log errors with context using `structlog`:
18
+
19
+ ```python
20
+ logger.error("Operation failed", error=str(e), context=value)
21
+ ```
22
+
23
+ - Never silently swallow exceptions
24
+ - Provide actionable error messages
25
+
26
+ ## Logging
27
+
28
+ - Use `structlog` for all logging (NOT `print` or `logging`)
29
+ - Import: `import structlog; logger = structlog.get_logger()`
30
+ - Log with structured data: `logger.info("event", key=value)`
31
+ - Use appropriate levels: DEBUG, INFO, WARNING, ERROR
32
+
33
+ ## Logging Examples
34
+
35
+ ```python
36
+ logger.info("Starting search", query=query, tools=[t.name for t in tools])
37
+ logger.warning("Search tool failed", tool=tool.name, error=str(result))
38
+ logger.error("Assessment failed", error=str(e))
39
+ ```
40
+
41
+ ## Error Chaining
42
+
43
+ Always preserve exception context:
44
+
45
+ ```python
46
+ try:
47
+     result = await api_call()
48
+ except httpx.HTTPError as e:
49
+     raise SearchError(f"API call failed: {e}") from e
50
+ ```
51
+
52
+ ## See Also
53
+
54
+ - [Code Style](code-style.md) - Code style guidelines
55
+ - [Testing](testing.md) - Testing guidelines
56
+
57
+
58
+
59
+
60
+
61
+
62
+
63
+
64
+
65
+
66
+
67
+
68
+
69
+
docs/contributing/implementation-patterns.md ADDED
@@ -0,0 +1,84 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Implementation Patterns
2
+
3
+ This document outlines common implementation patterns used in DeepCritical.
4
+
5
+ ## Search Tools
6
+
7
+ All tools implement `SearchTool` protocol (`src/tools/base.py`):
8
+
9
+ - Must have `name` property
10
+ - Must implement `async def search(query, max_results) -> list[Evidence]`
11
+ - Use `@retry` decorator from tenacity for resilience
12
+ - Rate limiting: Implement `_rate_limit()` for APIs with limits (e.g., PubMed)
13
+ - Error handling: Raise `SearchError` or `RateLimitError` on failures
14
+
15
+ Example pattern:
16
+
17
+ ```python
18
+ class MySearchTool:
19
+     @property
20
+     def name(self) -> str:
21
+         return "mytool"
22
+
23
+     @retry(stop=stop_after_attempt(3), wait=wait_exponential(...))
24
+     async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
25
+         # Implementation
26
+         return evidence_list
27
+ ```
28
+
29
+ ## Judge Handlers
30
+
31
+ - Implement `JudgeHandlerProtocol` (`async def assess(question, evidence) -> JudgeAssessment`)
32
+ - Use pydantic-ai `Agent` with `output_type=JudgeAssessment`
33
+ - System prompts in `src/prompts/judge.py`
34
+ - Support fallback handlers: `MockJudgeHandler`, `HFInferenceJudgeHandler`
35
+ - Always return valid `JudgeAssessment` (never raise exceptions)
36
+
37
+ ## Agent Factory Pattern
38
+
39
+ - Use factory functions for creating agents (`src/agent_factory/`)
40
+ - Lazy initialization for optional dependencies (e.g., embeddings, Modal)
41
+ - Check requirements before initialization:
42
+
43
+ ```python
44
+ def check_magentic_requirements() -> None:
45
+     if not settings.has_openai_key:
46
+         raise ConfigurationError("Magentic requires OpenAI")
47
+ ```
48
+
49
+ ## State Management
50
+
51
+ - **Magentic Mode**: Use `ContextVar` for thread-safe state (`src/agents/state.py`)
52
+ - **Simple Mode**: Pass state via function parameters
53
+ - Never use global mutable state (except singletons via `@lru_cache`)
54
+
55
+ ## Singleton Pattern
56
+
57
+ Use `@lru_cache(maxsize=1)` for singletons:
58
+
59
+ ```python
60
+ @lru_cache(maxsize=1)
61
+ def get_embedding_service() -> EmbeddingService:
62
+     return EmbeddingService()
63
+ ```
64
+
65
+ - Lazy initialization to avoid requiring dependencies at import time
66
+
67
+ ## See Also
68
+
69
+ - [Code Style](code-style.md) - Code style guidelines
70
+ - [Error Handling](error-handling.md) - Error handling guidelines
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+
79
+
80
+
81
+
82
+
83
+
84
+
docs/contributing/index.md ADDED
@@ -0,0 +1,163 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Contributing to DeepCritical
2
+
3
+ Thank you for your interest in contributing to DeepCritical! This guide will help you get started.
4
+
5
+ ## Git Workflow
6
+
7
+ - `main`: Production-ready (GitHub)
8
+ - `dev`: Development integration (GitHub)
9
+ - Use feature branches: `yourname-dev`
10
+ - **NEVER** push directly to `main` or `dev` on HuggingFace
11
+ - GitHub is source of truth; HuggingFace is for deployment
12
+
13
+ ## Development Commands
14
+
15
+ ```bash
16
+ make install # Install dependencies + pre-commit
17
+ make check # Lint + typecheck + test (MUST PASS)
18
+ make test # Run unit tests
19
+ make lint # Run ruff
20
+ make format # Format with ruff
21
+ make typecheck # Run mypy
22
+ make test-cov # Test with coverage
23
+ ```
24
+
25
+ ## Getting Started
26
+
27
+ 1. **Fork the repository** on GitHub
28
+ 2. **Clone your fork**:
29
+ ```bash
30
+ git clone https://github.com/yourusername/GradioDemo.git
31
+ cd GradioDemo
32
+ ```
33
+ 3. **Install dependencies**:
34
+ ```bash
35
+ make install
36
+ ```
37
+ 4. **Create a feature branch**:
38
+ ```bash
39
+ git checkout -b yourname-feature-name
40
+ ```
41
+ 5. **Make your changes** following the guidelines below
42
+ 6. **Run checks**:
43
+ ```bash
44
+ make check
45
+ ```
46
+ 7. **Commit and push**:
47
+ ```bash
48
+ git commit -m "Description of changes"
49
+ git push origin yourname-feature-name
50
+ ```
51
+ 8. **Create a pull request** on GitHub
52
+
53
+ ## Development Guidelines
54
+
55
+ ### Code Style
56
+
57
+ - Follow [Code Style Guidelines](code-style.md)
58
+ - All code must pass `mypy --strict`
59
+ - Use `ruff` for linting and formatting
60
+ - Line length: 100 characters
61
+
62
+ ### Error Handling
63
+
64
+ - Follow [Error Handling Guidelines](error-handling.md)
65
+ - Always chain exceptions: `raise SearchError(...) from e`
66
+ - Use structured logging with `structlog`
67
+ - Never silently swallow exceptions
68
+
69
+ ### Testing
70
+
71
+ - Follow [Testing Guidelines](testing.md)
72
+ - Write tests before implementation (TDD)
73
+ - Aim for >80% coverage on critical paths
74
+ - Use markers: `unit`, `integration`, `slow`
75
+
76
+ ### Implementation Patterns
77
+
78
+ - Follow [Implementation Patterns](implementation-patterns.md)
79
+ - Use factory functions for agent/tool creation
80
+ - Implement protocols for extensibility
81
+ - Use singleton pattern with `@lru_cache(maxsize=1)`
82
+
83
+ ### Prompt Engineering
84
+
85
+ - Follow [Prompt Engineering Guidelines](prompt-engineering.md)
86
+ - Always validate citations
87
+ - Use diverse evidence selection
88
+ - Never trust LLM-generated citations without validation
89
+
90
+ ### Code Quality
91
+
92
+ - Follow [Code Quality Guidelines](code-quality.md)
93
+ - Google-style docstrings for all public functions
94
+ - Explain WHY, not WHAT in comments
95
+ - Mark critical sections: `# CRITICAL: ...`
96
+
97
+ ## MCP Integration
98
+
99
+ ### MCP Tools
100
+
101
+ - Functions in `src/mcp_tools.py` for Claude Desktop
102
+ - Full type hints required
103
+ - Google-style docstrings with Args/Returns sections
104
+ - Formatted string returns (markdown)
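+
+ A sketch of the expected shape (the body and the `Evidence` attributes used for formatting are assumptions, not the actual implementation):
+
+ ```python
+ async def search_pubmed(query: str, max_results: int = 10) -> str:
+     """Search PubMed and return results as markdown.
+
+     Args:
+         query: Search query string
+         max_results: Maximum number of results to return
+
+     Returns:
+         Markdown-formatted list of matching records
+     """
+     results = await PubMedTool().search(query, max_results=max_results)
+     return "\n".join(f"- [{item.title}]({item.url})" for item in results)
+ ```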
105
+
106
+ ### Gradio MCP Server
107
+
108
+ - Enable with `mcp_server=True` in `demo.launch()`
109
+ - Endpoint: `/gradio_api/mcp/`
110
+ - Use `ssr_mode=False` to fix hydration issues in HF Spaces
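+
+ Putting those flags together (a sketch; other launch arguments omitted):
+
+ ```python
+ from src.app import create_demo
+
+ demo = create_demo()
+ demo.launch(
+     mcp_server=True,  # exposes the /gradio_api/mcp/ endpoint
+     ssr_mode=False,   # avoids SSR hydration issues on HF Spaces
+ )
+ ```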
111
+
112
+ ## Common Pitfalls
113
+
114
+ 1. **Blocking the event loop**: Never use sync I/O in async functions
115
+ 2. **Missing type hints**: All functions must have complete type annotations
116
+ 3. **Hallucinated citations**: Always validate references
117
+ 4. **Global mutable state**: Use ContextVar or pass via parameters
118
+ 5. **Import errors**: Lazy-load optional dependencies (magentic, modal, embeddings)
119
+ 6. **Rate limiting**: Always implement for external APIs
120
+ 7. **Error chaining**: Always use `from e` when raising exceptions
121
+
122
+ ## Key Principles
123
+
124
+ 1. **Type Safety First**: All code must pass `mypy --strict`
125
+ 2. **Async Everything**: All I/O must be async
126
+ 3. **Test-Driven**: Write tests before implementation
127
+ 4. **No Hallucinations**: Validate all citations
128
+ 5. **Graceful Degradation**: Support free tier (HF Inference) when no API keys
129
+ 6. **Lazy Loading**: Don't require optional dependencies at import time
130
+ 7. **Structured Logging**: Use structlog, never print()
131
+ 8. **Error Chaining**: Always preserve exception context
132
+
133
+ ## Pull Request Process
134
+
135
+ 1. Ensure all checks pass: `make check`
136
+ 2. Update documentation if needed
137
+ 3. Add tests for new features
138
+ 4. Update CHANGELOG if applicable
139
+ 5. Request review from maintainers
140
+ 6. Address review feedback
141
+ 7. Wait for approval before merging
142
+
143
+ ## Questions?
144
+
145
+ - Open an issue on GitHub
146
+ - Check existing documentation
147
+ - Review code examples in the codebase
148
+
149
+ Thank you for contributing to DeepCritical!
150
+
151
+
152
+
153
+
154
+
155
+
156
+
157
+
158
+
159
+
160
+
161
+
162
+
163
+
docs/contributing/prompt-engineering.md ADDED
@@ -0,0 +1,69 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Prompt Engineering & Citation Validation
2
+
3
+ This document outlines prompt engineering guidelines and citation validation rules.
4
+
5
+ ## Judge Prompts
6
+
7
+ - System prompt in `src/prompts/judge.py`
8
+ - Format evidence with truncation (1500 chars per item)
9
+ - Handle empty evidence case separately
10
+ - Always request structured JSON output
11
+ - Use `format_user_prompt()` and `format_empty_evidence_prompt()` helpers
12
+
13
+ ## Hypothesis Prompts
14
+
15
+ - Use diverse evidence selection (MMR algorithm)
16
+ - Sentence-aware truncation (`truncate_at_sentence()`)
17
+ - Format: Drug → Target → Pathway → Effect
18
+ - System prompt emphasizes mechanistic reasoning
19
+ - Use `format_hypothesis_prompt()` with embeddings for diversity
20
+
21
+ ## Report Prompts
22
+
23
+ - Include full citation details for validation
24
+ - Use diverse evidence selection (n=20)
25
+ - **CRITICAL**: Emphasize citation validation rules
26
+ - Format hypotheses with support/contradiction counts
27
+ - System prompt includes explicit JSON structure requirements
28
+
29
+ ## Citation Validation
30
+
31
+ - **ALWAYS** validate references before returning reports
32
+ - Use `validate_references()` from `src/utils/citation_validator.py`
33
+ - Remove hallucinated citations (URLs not in evidence)
34
+ - Log warnings for removed citations
35
+ - Never trust LLM-generated citations without validation
36
+
37
+ ## Citation Validation Rules
38
+
39
+ 1. Every reference URL must EXACTLY match a provided evidence URL
40
+ 2. Do NOT invent, fabricate, or hallucinate any references
41
+ 3. Do NOT modify paper titles, authors, dates, or URLs
42
+ 4. If unsure about a citation, OMIT it rather than guess
43
+ 5. Copy URLs exactly as provided - do not create similar-looking URLs
44
+
45
+ ## Evidence Selection
46
+
47
+ - Use `select_diverse_evidence()` for MMR-based selection
48
+ - Balance relevance vs diversity (lambda=0.7 default)
49
+ - Sentence-aware truncation preserves meaning
50
+ - Limit evidence per prompt to avoid context overflow
51
+
52
+ ## See Also
53
+
54
+ - [Code Quality](code-quality.md) - Code quality guidelines
55
+ - [Error Handling](error-handling.md) - Error handling guidelines
56
+
57
+
58
+
59
+
60
+
61
+
62
+
63
+
64
+
65
+
66
+
67
+
68
+
69
+
docs/contributing/testing.md ADDED
@@ -0,0 +1,65 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Testing Requirements
2
+
3
+ This document outlines testing requirements and guidelines for DeepCritical.
4
+
5
+ ## Test Structure
6
+
7
+ - Unit tests in `tests/unit/` (mocked, fast)
8
+ - Integration tests in `tests/integration/` (real APIs, marked `@pytest.mark.integration`)
9
+ - Use markers: `unit`, `integration`, `slow`
10
+
11
+ ## Mocking
12
+
13
+ - Use `respx` for httpx mocking
14
+ - Use `pytest-mock` for general mocking
15
+ - Mock LLM calls in unit tests (use `MockJudgeHandler`)
16
+ - Fixtures in `tests/conftest.py`: `mock_httpx_client`, `mock_llm_response`
17
+
18
+ ## TDD Workflow
19
+
20
+ 1. Write failing test in `tests/unit/`
21
+ 2. Implement in `src/`
22
+ 3. Ensure test passes
23
+ 4. Run `make check` (lint + typecheck + test)
24
+
25
+ ## Test Examples
26
+
27
+ ```python
28
+ @pytest.mark.unit
29
+ async def test_pubmed_search(mock_httpx_client):
30
+     tool = PubMedTool()
31
+     results = await tool.search("metformin", max_results=5)
32
+     assert len(results) > 0
33
+     assert all(isinstance(r, Evidence) for r in results)
34
+
35
+ @pytest.mark.integration
36
+ async def test_real_pubmed_search():
37
+     tool = PubMedTool()
38
+     results = await tool.search("metformin", max_results=3)
39
+     assert len(results) <= 3
40
+ ```
41
+
42
+ ## Test Coverage
43
+
44
+ - Run `make test-cov` for coverage report
45
+ - Aim for >80% coverage on critical paths
46
+ - Exclude: `__init__.py`, `TYPE_CHECKING` blocks
47
+
48
+ ## See Also
49
+
50
+ - [Code Style](code-style.md) - Code style guidelines
51
+ - [Implementation Patterns](implementation-patterns.md) - Common patterns
52
+
53
+
54
+
55
+
56
+
57
+
58
+
59
+
60
+
61
+
62
+
63
+
64
+
65
+
docs/getting-started/examples.md ADDED
@@ -0,0 +1,209 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Examples
2
+
3
+ This page provides examples of using DeepCritical for various research tasks.
4
+
5
+ ## Basic Research Query
6
+
7
+ ### Example 1: Drug Information
8
+
9
+ **Query**:
10
+ ```
11
+ What are the latest treatments for Alzheimer's disease?
12
+ ```
13
+
14
+ **What DeepCritical Does**:
15
+ 1. Searches PubMed for recent papers
16
+ 2. Searches ClinicalTrials.gov for active trials
17
+ 3. Evaluates evidence quality
18
+ 4. Synthesizes findings into a comprehensive report
19
+
20
+ ### Example 2: Clinical Trial Search
21
+
22
+ **Query**:
23
+ ```
24
+ What clinical trials are investigating metformin for cancer prevention?
25
+ ```
26
+
27
+ **What DeepCritical Does**:
28
+ 1. Searches ClinicalTrials.gov for relevant trials
29
+ 2. Searches PubMed for supporting literature
30
+ 3. Provides trial details and status
31
+ 4. Summarizes findings
32
+
33
+ ## Advanced Research Queries
34
+
35
+ ### Example 3: Comprehensive Review
36
+
37
+ **Query**:
38
+ ```
39
+ Review the evidence for using metformin as an anti-aging intervention,
40
+ including clinical trials, mechanisms of action, and safety profile.
41
+ ```
42
+
43
+ **What DeepCritical Does**:
44
+ 1. Uses deep research mode (multi-section)
45
+ 2. Searches multiple sources in parallel
46
+ 3. Generates sections on:
47
+ - Clinical trials
48
+ - Mechanisms of action
49
+ - Safety profile
50
+ 4. Synthesizes comprehensive report
51
+
52
+ ### Example 4: Hypothesis Testing
53
+
54
+ **Query**:
55
+ ```
56
+ Test the hypothesis that regular exercise reduces Alzheimer's disease risk.
57
+ ```
58
+
59
+ **What DeepCritical Does**:
60
+ 1. Generates testable hypotheses
61
+ 2. Searches for supporting/contradicting evidence
62
+ 3. Performs statistical analysis (if Modal configured)
63
+ 4. Provides verdict: SUPPORTED, REFUTED, or INCONCLUSIVE
64
+
65
+ ## MCP Tool Examples
66
+
67
+ ### Using search_pubmed
68
+
69
+ ```
70
+ Search PubMed for "CRISPR gene editing cancer therapy"
71
+ ```
72
+
73
+ ### Using search_clinical_trials
74
+
75
+ ```
76
+ Find active clinical trials for "diabetes type 2 treatment"
77
+ ```
78
+
79
+ ### Using search_all
80
+
81
+ ```
82
+ Search all sources for "COVID-19 vaccine side effects"
83
+ ```
84
+
85
+ ### Using analyze_hypothesis
86
+
87
+ ```
88
+ Analyze whether vitamin D supplementation reduces COVID-19 severity
89
+ ```
90
+
91
+ ## Code Examples
92
+
93
+ ### Python API Usage
94
+
95
+ ```python
96
+ from src.orchestrator_factory import create_orchestrator
97
+ from src.tools.search_handler import SearchHandler
98
+ from src.agent_factory.judges import create_judge_handler
99
+
100
+ # Create orchestrator
101
+ search_handler = SearchHandler()
102
+ judge_handler = create_judge_handler()
103
+ orchestrator = create_orchestrator(
104
+     search_handler=search_handler,
105
+     judge_handler=judge_handler,
106
+     config={},
107
+     mode="advanced"
108
+ )
109
+
110
+ # Run research query
111
+ query = "What are the latest treatments for Alzheimer's disease?"
112
+ async for event in orchestrator.run(query):
113
+     print(f"Event: {event.type} - {event.data}")
114
+ ```
115
+
116
+ ### Gradio UI Integration
117
+
118
+ ```python
119
+ import gradio as gr
120
+ from src.app import create_research_interface
121
+
122
+ # Create interface
123
+ interface = create_research_interface()
124
+
125
+ # Launch
126
+ interface.launch(server_name="0.0.0.0", server_port=7860)
127
+ ```
128
+
129
+ ## Research Patterns
130
+
131
+ ### Iterative Research
132
+
133
+ Single-loop research with search-judge-synthesize cycles:
134
+
135
+ ```python
136
+ from src.orchestrator.research_flow import IterativeResearchFlow
137
+
138
+ flow = IterativeResearchFlow(
139
+     search_handler=search_handler,
140
+     judge_handler=judge_handler,
141
+     use_graph=False
142
+ )
143
+
144
+ async for event in flow.run(query):
145
+     # Handle events
146
+     pass
147
+ ```
148
+
149
+ ### Deep Research
150
+
151
+ Multi-section parallel research:
152
+
153
+ ```python
154
+ from src.orchestrator.research_flow import DeepResearchFlow
155
+
156
+ flow = DeepResearchFlow(
157
+     search_handler=search_handler,
158
+     judge_handler=judge_handler,
159
+     use_graph=True
160
+ )
161
+
162
+ async for event in flow.run(query):
163
+     # Handle events
164
+     pass
165
+ ```
166
+
167
+ ## Configuration Examples
168
+
169
+ ### Basic Configuration
170
+
171
+ ```bash
172
+ # .env file
173
+ LLM_PROVIDER=openai
174
+ OPENAI_API_KEY=your_key_here
175
+ MAX_ITERATIONS=10
176
+ ```
177
+
178
+ ### Advanced Configuration
179
+
180
+ ```bash
181
+ # .env file
182
+ LLM_PROVIDER=anthropic
183
+ ANTHROPIC_API_KEY=your_key_here
184
+ EMBEDDING_PROVIDER=local
185
+ WEB_SEARCH_PROVIDER=duckduckgo
186
+ MAX_ITERATIONS=20
187
+ DEFAULT_TOKEN_LIMIT=200000
188
+ USE_GRAPH_EXECUTION=true
189
+ ```
190
+
191
+ ## Next Steps
192
+
193
+ - Read the [Configuration Guide](../configuration/index.md) for all options
194
+ - Explore the [Architecture Documentation](../architecture/graph-orchestration.md)
195
+ - Check out the [API Reference](../api/agents.md) for programmatic usage
196
+
197
+
198
+
199
+
200
+
201
+
202
+
203
+
204
+
205
+
206
+
207
+
208
+
209
+
docs/getting-started/installation.md ADDED
@@ -0,0 +1,148 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Installation
2
+
3
+ This guide will help you install and set up DeepCritical on your system.
4
+
5
+ ## Prerequisites
6
+
7
+ - Python 3.11 or higher
8
+ - `uv` package manager (recommended) or `pip`
9
+ - At least one LLM API key (OpenAI, Anthropic, or HuggingFace)
10
+
11
+ ## Installation Steps
12
+
13
+ ### 1. Install uv (Recommended)
14
+
15
+ `uv` is a fast Python package installer and resolver. Install it with:
16
+
17
+ ```bash
18
+ pip install uv
19
+ ```
20
+
21
+ ### 2. Clone the Repository
22
+
23
+ ```bash
24
+ git clone https://github.com/DeepCritical/GradioDemo.git
25
+ cd GradioDemo
26
+ ```
27
+
28
+ ### 3. Install Dependencies
29
+
30
+ Using `uv` (recommended):
31
+
32
+ ```bash
33
+ uv sync
34
+ ```
35
+
36
+ Using `pip`:
37
+
38
+ ```bash
39
+ pip install -e .
40
+ ```
41
+
42
+ ### 4. Install Optional Dependencies
43
+
44
+ For embeddings support (local sentence-transformers):
45
+
46
+ ```bash
47
+ uv sync --extra embeddings
48
+ ```
49
+
50
+ For Modal sandbox execution:
51
+
52
+ ```bash
53
+ uv sync --extra modal
54
+ ```
55
+
56
+ For Magentic orchestration:
57
+
58
+ ```bash
59
+ uv sync --extra magentic
60
+ ```
61
+
62
+ Install all extras:
63
+
64
+ ```bash
65
+ uv sync --all-extras
66
+ ```
67
+
68
+ ### 5. Configure Environment Variables
69
+
70
+ Create a `.env` file in the project root:
71
+
72
+ ```bash
73
+ # Required: At least one LLM provider
74
+ LLM_PROVIDER=openai # or "anthropic" or "huggingface"
75
+ OPENAI_API_KEY=your_openai_api_key_here
76
+
77
+ # Optional: Other services
78
+ NCBI_API_KEY=your_ncbi_api_key_here # For higher PubMed rate limits
79
+ MODAL_TOKEN_ID=your_modal_token_id
80
+ MODAL_TOKEN_SECRET=your_modal_token_secret
81
+ ```
82
+
83
+ See the [Configuration Guide](../configuration/index.md) for all available options.
84
+
85
+ ### 6. Verify Installation
86
+
87
+ Run the application:
88
+
89
+ ```bash
90
+ uv run gradio run src/app.py
91
+ ```
92
+
93
+ Open your browser to `http://localhost:7860` to verify the installation.
94
+
95
+ ## Development Setup
96
+
97
+ For development, install dev dependencies:
98
+
99
+ ```bash
100
+ uv sync --all-extras --dev
101
+ ```
102
+
103
+ Install pre-commit hooks:
104
+
105
+ ```bash
106
+ uv run pre-commit install
107
+ ```
108
+
109
+ ## Troubleshooting
110
+
111
+ ### Common Issues
112
+
113
+ **Import Errors**:
114
+ - Ensure you've installed all required dependencies
115
+ - Check that Python 3.11+ is being used
116
+
117
+ **API Key Errors**:
118
+ - Verify your `.env` file is in the project root
119
+ - Check that API keys are correctly formatted
120
+ - Ensure at least one LLM provider is configured
121
+
122
+ **Module Not Found**:
123
+ - Run `uv sync` or `pip install -e .` again
124
+ - Check that you're in the correct virtual environment
125
+
126
+ **Port Already in Use**:
127
+ - Change the port in `src/app.py` or set it via an environment variable
128
+ - Kill the process using port 7860
129
+
130
+ ## Next Steps
131
+
132
+ - Read the [Quick Start Guide](quick-start.md)
133
+ - Learn about [MCP Integration](mcp-integration.md)
134
+ - Explore [Examples](examples.md)
135
+
136
+
137
+
138
+
139
+
140
+
141
+
142
+
143
+
144
+
145
+
146
+
147
+
148
+
docs/getting-started/mcp-integration.md ADDED
@@ -0,0 +1,215 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # MCP Integration
2
+
3
+ DeepCritical exposes a Model Context Protocol (MCP) server, allowing you to use its search tools directly from Claude Desktop or other MCP clients.
4
+
5
+ ## What is MCP?
6
+
7
+ The Model Context Protocol (MCP) is a standard for connecting AI assistants to external tools and data sources. DeepCritical implements an MCP server that exposes its search capabilities as MCP tools.
8
+
9
+ ## MCP Server URL
10
+
11
+ When running locally:
12
+
13
+ ```
14
+ http://localhost:7860/gradio_api/mcp/
15
+ ```
16
+
17
+ ## Claude Desktop Configuration
18
+
19
+ ### 1. Locate Configuration File
20
+
21
+ **macOS**:
22
+ ```
23
+ ~/Library/Application Support/Claude/claude_desktop_config.json
24
+ ```
25
+
26
+ **Windows**:
27
+ ```
28
+ %APPDATA%\Claude\claude_desktop_config.json
29
+ ```
30
+
31
+ **Linux**:
32
+ ```
33
+ ~/.config/Claude/claude_desktop_config.json
34
+ ```
35
+
36
+ ### 2. Add DeepCritical Server
37
+
38
+ Edit `claude_desktop_config.json` and add:
39
+
40
+ ```json
41
+ {
42
+   "mcpServers": {
43
+     "deepcritical": {
44
+       "url": "http://localhost:7860/gradio_api/mcp/"
45
+     }
46
+   }
47
+ }
48
+ ```
49
+
50
+ ### 3. Restart Claude Desktop
51
+
52
+ Close and restart Claude Desktop for changes to take effect.
53
+
54
+ ### 4. Verify Connection
55
+
56
+ In Claude Desktop, you should see DeepCritical tools available:
57
+ - `search_pubmed`
58
+ - `search_clinical_trials`
59
+ - `search_biorxiv`
60
+ - `search_all`
61
+ - `analyze_hypothesis`
62
+
63
+ ## Available Tools
64
+
65
+ ### search_pubmed
66
+
67
+ Search peer-reviewed biomedical literature from PubMed.
68
+
69
+ **Parameters**:
70
+ - `query` (string): Search query
71
+ - `max_results` (integer, optional): Maximum number of results (default: 10)
72
+
73
+ **Example**:
74
+ ```
75
+ Search PubMed for "metformin diabetes"
76
+ ```
77
+
78
+ ### search_clinical_trials
79
+
80
+ Search ClinicalTrials.gov for interventional studies.
81
+
82
+ **Parameters**:
83
+ - `query` (string): Search query
84
+ - `max_results` (integer, optional): Maximum number of results (default: 10)
85
+
86
+ **Example**:
87
+ ```
88
+ Search clinical trials for "Alzheimer's disease treatment"
89
+ ```
90
+
91
+ ### search_biorxiv
92
+
93
+ Search bioRxiv/medRxiv preprints via Europe PMC.
94
+
95
+ **Parameters**:
96
+ - `query` (string): Search query
97
+ - `max_results` (integer, optional): Maximum number of results (default: 10)
98
+
99
+ **Example**:
100
+ ```
101
+ Search bioRxiv for "CRISPR gene editing"
102
+ ```
103
+
104
+ ### search_all
105
+
106
+ Search all sources simultaneously (PubMed, ClinicalTrials.gov, Europe PMC).
107
+
108
+ **Parameters**:
109
+ - `query` (string): Search query
110
+ - `max_results` (integer, optional): Maximum number of results per source (default: 10)
111
+
112
+ **Example**:
113
+ ```
114
+ Search all sources for "COVID-19 vaccine efficacy"
115
+ ```
116
+
117
+ ### analyze_hypothesis
118
+
119
+ Perform secure statistical analysis using Modal sandboxes.
120
+
121
+ **Parameters**:
122
+ - `hypothesis` (string): Hypothesis to analyze
123
+ - `data` (string, optional): Data description or code
124
+
125
+ **Example**:
126
+ ```
127
+ Analyze the hypothesis that metformin reduces cancer risk
128
+ ```
129
+
130
+ ## Using Tools in Claude Desktop
131
+
132
+ Once configured, you can ask Claude to use DeepCritical tools:
133
+
134
+ ```
135
+ Use DeepCritical to search PubMed for recent papers on Alzheimer's disease treatments.
136
+ ```
137
+
138
+ Claude will automatically:
139
+ 1. Call the appropriate DeepCritical tool
140
+ 2. Retrieve results
141
+ 3. Use the results in its response
142
+
143
+ ## Troubleshooting
144
+
145
+ ### Connection Issues
146
+
147
+ **Server Not Found**:
148
+ - Ensure DeepCritical is running (`uv run gradio run src/app.py`)
149
+ - Verify the URL in `claude_desktop_config.json` is correct
150
+ - Check that port 7860 is not blocked by firewall
151
+
152
+ **Tools Not Appearing**:
153
+ - Restart Claude Desktop after configuration changes
154
+ - Check Claude Desktop logs for errors
155
+ - Verify MCP server is accessible at the configured URL
156
+
157
+ ### Authentication
158
+
159
+ If DeepCritical requires authentication:
160
+ - Configure API keys in DeepCritical settings
161
+ - Use HuggingFace OAuth login
162
+ - Ensure API keys are valid
163
+
164
+ ## Advanced Configuration
165
+
166
+ ### Custom Port
167
+
168
+ If running on a different port, update the URL:
169
+
170
+ ```json
171
+ {
172
+   "mcpServers": {
173
+     "deepcritical": {
174
+       "url": "http://localhost:8080/gradio_api/mcp/"
175
+     }
176
+   }
177
+ }
178
+ ```
179
+
180
+ ### Multiple Instances
181
+
182
+ You can configure multiple DeepCritical instances:
183
+
184
+ ```json
185
+ {
186
+ "mcpServers": {
187
+ "deepcritical-local": {
188
+ "url": "http://localhost:7860/gradio_api/mcp/"
189
+ },
190
+ "deepcritical-remote": {
191
+ "url": "https://your-server.com/gradio_api/mcp/"
192
+ }
193
+ }
194
+ }
195
+ ```
196
+
197
+ ## Next Steps
198
+
199
+ - Learn about [Configuration](../configuration/index.md) for advanced settings
200
+ - Explore [Examples](examples.md) for use cases
201
+ - Read the [Architecture Documentation](../architecture/graph-orchestration.md)
202
+
203
+
204
+
205
+
206
+
207
+
208
+
209
+
210
+
211
+
212
+
213
+
214
+
215
+
docs/getting-started/quick-start.md ADDED
@@ -0,0 +1,119 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Quick Start Guide
2
+
3
+ Get up and running with DeepCritical in minutes.
4
+
5
+ ## Start the Application
6
+
7
+ ```bash
8
+ uv run gradio run src/app.py
9
+ ```
10
+
11
+ Open your browser to `http://localhost:7860`.
12
+
13
+ ## First Research Query
14
+
15
+ 1. **Enter a Research Question**
16
+
17
+ Type your research question in the chat interface, for example:
18
+ - "What are the latest treatments for Alzheimer's disease?"
19
+ - "Review the evidence for metformin in cancer prevention"
20
+ - "What clinical trials are investigating COVID-19 vaccines?"
21
+
22
+ 2. **Submit the Query**
23
+
24
+ Click "Submit" or press Enter. The system will:
25
+ - Generate observations about your query
26
+ - Identify knowledge gaps
27
+ - Search multiple sources (PubMed, ClinicalTrials.gov, Europe PMC)
28
+ - Evaluate evidence quality
29
+ - Synthesize findings into a report
30
+
31
+ 3. **Review Results**
32
+
33
+ Watch the real-time progress in the chat interface:
34
+ - Search operations and results
35
+ - Evidence evaluation
36
+ - Report generation
37
+ - Final research report with citations
38
+
39
+ ## Authentication
40
+
41
+ ### HuggingFace OAuth (Recommended)
42
+
43
+ 1. Click "Sign in with HuggingFace" at the top of the app
44
+ 2. Authorize the application
45
+ 3. Your HuggingFace API token will be automatically used
46
+ 4. No need to manually enter API keys
47
+
48
+ ### Manual API Key
49
+
50
+ 1. Open the Settings accordion
51
+ 2. Enter your API key:
52
+ - OpenAI API key
53
+ - Anthropic API key
54
+ - HuggingFace API key
55
+ 3. Click "Save Settings"
56
+ 4. Manual keys take priority over OAuth tokens
57
+
58
+ ## Understanding the Interface
59
+
60
+ ### Chat Interface
61
+
62
+ - **Input**: Enter your research questions here
63
+ - **Messages**: View conversation history and research progress
64
+ - **Streaming**: Real-time updates as research progresses
65
+
66
+ ### Status Indicators
67
+
68
+ - **Searching**: Active search operations
69
+ - **Evaluating**: Evidence quality assessment
70
+ - **Synthesizing**: Report generation
71
+ - **Complete**: Research finished
72
+
73
+ ### Settings
74
+
75
+ - **API Keys**: Configure LLM providers
76
+ - **Research Mode**: Choose iterative or deep research
77
+ - **Budget Limits**: Set token, time, and iteration limits
78
+
79
+ ## Example Queries
80
+
81
+ ### Simple Query
82
+
83
+ ```
84
+ What are the side effects of metformin?
85
+ ```
86
+
87
+ ### Complex Query
88
+
89
+ ```
90
+ Review the evidence for using metformin as an anti-aging intervention,
91
+ including clinical trials, mechanisms of action, and safety profile.
92
+ ```
93
+
94
+ ### Clinical Trial Query
95
+
96
+ ```
97
+ What are the active clinical trials investigating Alzheimer's disease treatments?
98
+ ```
99
+
100
+ ## Next Steps
101
+
102
+ - Learn about [MCP Integration](mcp-integration.md) to use DeepCritical from Claude Desktop
103
+ - Explore [Examples](examples.md) for more use cases
104
+ - Read the [Configuration Guide](../configuration/index.md) for advanced settings
105
+ - Check out the [Architecture Documentation](../architecture/graph-orchestration.md) to understand how it works
106
+
107
+
108
+
109
+
110
+
111
+
112
+
113
+
114
+
115
+
116
+
117
+
118
+
119
+
docs/license.md ADDED
@@ -0,0 +1,39 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # License
2
+
3
+ DeepCritical is licensed under the MIT License.
4
+
5
+ ## MIT License
6
+
7
+ Copyright (c) 2024 DeepCritical Team
8
+
9
+ Permission is hereby granted, free of charge, to any person obtaining a copy
10
+ of this software and associated documentation files (the "Software"), to deal
11
+ in the Software without restriction, including without limitation the rights
12
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
13
+ copies of the Software, and to permit persons to whom the Software is
14
+ furnished to do so, subject to the following conditions:
15
+
16
+ The above copyright notice and this permission notice shall be included in all
17
+ copies or substantial portions of the Software.
18
+
19
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
21
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
22
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
23
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
24
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
25
+ SOFTWARE.
26
+
27
+
28
+
29
+
30
+
31
+
32
+
33
+
34
+
35
+
36
+
37
+
38
+
39
+
docs/overview/architecture.md ADDED
@@ -0,0 +1,196 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Architecture Overview
2
+
3
+ DeepCritical is a deep research agent system that uses iterative search-and-judge loops to comprehensively answer research questions. The system supports multiple orchestration patterns, graph-based execution, parallel research workflows, and long-running task management with real-time streaming.
4
+
5
+ ## Core Architecture
6
+
7
+ ### Orchestration Patterns
8
+
9
+ 1. **Graph Orchestrator** (`src/orchestrator/graph_orchestrator.py`):
10
+ - Graph-based execution using Pydantic AI agents as nodes
11
+ - Supports both iterative and deep research patterns
12
+ - Node types: Agent, State, Decision, Parallel
13
+ - Edge types: Sequential, Conditional, Parallel
14
+ - Conditional routing based on knowledge gaps, budget, and iterations
15
+ - Parallel execution for concurrent research loops
16
+ - Event streaming via `AsyncGenerator[AgentEvent]` for real-time UI updates
17
+ - Fallback to agent chains when graph execution is disabled
18
+
19
+ 2. **Deep Research Flow** (`src/orchestrator/research_flow.py`):
20
+ - **Pattern**: Planner → Parallel Iterative Loops (one per section) → Synthesis
21
+ - Uses `PlannerAgent` to break query into report sections
22
+ - Runs `IterativeResearchFlow` instances in parallel per section via `WorkflowManager`
23
+ - Synthesizes results using `LongWriterAgent` or `ProofreaderAgent`
24
+ - Supports both graph execution (`use_graph=True`) and agent chains (`use_graph=False`)
25
+ - Budget tracking per section and globally
26
+ - State synchronization across parallel loops
27
+
28
+ 3. **Iterative Research Flow** (`src/orchestrator/research_flow.py`):
29
+ - **Pattern**: Generate observations → Evaluate gaps → Select tools → Execute → Judge → Continue/Complete
30
+ - Uses `KnowledgeGapAgent`, `ToolSelectorAgent`, `ThinkingAgent`, `WriterAgent`
31
+ - `JudgeHandler` assesses evidence sufficiency
32
+ - Iterates until research complete or constraints met (iterations, time, tokens)
33
+ - Supports graph execution and agent chains
34
+
35
+ 4. **Magentic Orchestrator** (`src/orchestrator_magentic.py`):
36
+ - Multi-agent coordination using `agent-framework-core`
37
+ - ChatAgent pattern with internal LLMs per agent
38
+ - Uses `MagenticBuilder` with participants: searcher, hypothesizer, judge, reporter
39
+ - Manager orchestrates agents via `OpenAIChatClient`
40
+ - Requires OpenAI API key (function calling support)
41
+ - Event-driven: converts Magentic events to `AgentEvent` for UI streaming
42
+ - Supports long-running workflows with max rounds and stall/reset handling
43
+
44
+ 5. **Hierarchical Orchestrator** (`src/orchestrator_hierarchical.py`):
45
+ - Uses `SubIterationMiddleware` with `ResearchTeam` and `LLMSubIterationJudge`
46
+ - Adapts Magentic ChatAgent to `SubIterationTeam` protocol
47
+ - Event-driven via `asyncio.Queue` for coordination
48
+ - Supports sub-iteration patterns for complex research tasks
49
+
50
+ 6. **Legacy Simple Mode** (`src/legacy_orchestrator.py`):
51
+ - Linear search-judge-synthesize loop
52
+ - Uses `SearchHandlerProtocol` and `JudgeHandlerProtocol`
53
+ - Generator-based design yielding `AgentEvent` objects
54
+ - Backward compatibility for simple use cases
55
+
56
+ ## Long-Running Task Support
57
+
58
+ The system is designed for long-running research tasks with comprehensive state management and streaming:
59
+
60
+ 1. **Event Streaming**:
61
+ - All orchestrators yield `AgentEvent` objects via `AsyncGenerator`
62
+ - Real-time UI updates through Gradio chat interface
63
+ - Event types: `started`, `searching`, `search_complete`, `judging`, `judge_complete`, `looping`, `synthesizing`, `hypothesizing`, `complete`, `error`
64
+ - Metadata includes iteration numbers, tool names, result counts, durations
65
+
66
+ 2. **Budget Tracking** (`src/middleware/budget_tracker.py`):
67
+ - Per-loop and global budget management
68
+ - Tracks: tokens, time (seconds), iterations
69
+ - Budget enforcement at decision nodes
70
+ - Token estimation (~4 chars per token)
71
+ - Early termination when budgets exceeded
72
+ - Budget summaries for monitoring
73
+
74
+ 3. **Workflow Manager** (`src/middleware/workflow_manager.py`):
75
+ - Coordinates parallel research loops
76
+ - Tracks loop status: `pending`, `running`, `completed`, `failed`, `cancelled`
77
+ - Synchronizes evidence between loops and global state
78
+ - Handles errors per loop (doesn't fail all if one fails)
79
+ - Supports loop cancellation and timeout handling
80
+ - Evidence deduplication across parallel loops
81
+
82
+ 4. **State Management** (`src/middleware/state_machine.py`):
83
+ - Thread-safe isolation using `ContextVar` for concurrent requests
84
+ - `WorkflowState` tracks: evidence, conversation history, embedding service
85
+ - Evidence deduplication by URL
86
+ - Semantic search via embedding service
87
+ - State persistence across long-running workflows
88
+ - Supports both iterative and deep research patterns
89
+
90
+ 5. **Gradio UI** (`src/app.py`):
91
+ - Real-time streaming of research progress
92
+ - Accordion-based UI for pending/done operations
93
+ - OAuth integration (HuggingFace)
94
+ - Multiple backend support (API keys, free tier)
95
+ - Handles long-running tasks with progress indicators
96
+ - Event accumulation for pending operations
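+
+ Consumers drive all of this by simply iterating the event stream. A sketch of the pattern (the function is illustrative; the Gradio handler does the equivalent with markdown accumulation):
+
+ ```python
+ from typing import Any
+
+
+ async def stream_to_console(orchestrator: Any, query: str) -> None:
+     async for event in orchestrator.run(query):
+         if event.type == "complete":
+             print(event.message)          # final report
+         else:
+             print(event.to_markdown())    # intermediate progress update
+ ```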
97
+
98
+ ## Graph Architecture
99
+
100
+ The graph orchestrator (`src/orchestrator/graph_orchestrator.py`) implements a flexible graph-based execution model:
101
+
102
+ **Node Types**:
103
+
104
+ - **Agent Nodes**: Execute Pydantic AI agents (e.g., `KnowledgeGapAgent`, `ToolSelectorAgent`)
105
+ - **State Nodes**: Update or read workflow state (evidence, conversation)
106
+ - **Decision Nodes**: Make routing decisions (research complete?, budget exceeded?)
107
+ - **Parallel Nodes**: Execute multiple nodes concurrently (parallel research loops)
108
+
109
+ **Edge Types**:
110
+
111
+ - **Sequential Edges**: Always traversed (no condition)
112
+ - **Conditional Edges**: Traversed based on condition (e.g., if research complete → writer, else → tool selector)
113
+ - **Parallel Edges**: Used for parallel execution branches
114
+
115
+ **Graph Patterns**:
116
+
117
+ - **Iterative Graph**: `[Input] → [Thinking] → [Knowledge Gap] → [Decision: Complete?] → [Tool Selector] or [Writer]`
118
+ - **Deep Research Graph**: `[Input] → [Planner] → [Parallel Iterative Loops] → [Synthesizer]`
119
+
120
+ **Execution Flow**:
121
+
122
+ 1. Graph construction from nodes and edges
123
+ 2. Graph validation (no cycles, all nodes reachable)
124
+ 3. Graph execution from entry node
125
+ 4. Node execution based on type
126
+ 5. Edge evaluation for next node(s)
127
+ 6. Parallel execution via `asyncio.gather()`
128
+ 7. State updates at state nodes
129
+ 8. Event streaming for UI
130
+
131
+ ## Key Components
132
+
133
+ - **Orchestrators**: Multiple orchestration patterns (`src/orchestrator/`, `src/orchestrator_*.py`)
134
+ - **Research Flows**: Iterative and deep research patterns (`src/orchestrator/research_flow.py`)
135
+ - **Graph Builder**: Graph construction utilities (`src/agent_factory/graph_builder.py`)
136
+ - **Agents**: Pydantic AI agents (`src/agents/`, `src/agent_factory/agents.py`)
137
+ - **Search Tools**: PubMed, ClinicalTrials.gov, Europe PMC, RAG (`src/tools/`)
138
+ - **Judge Handler**: LLM-based evidence assessment (`src/agent_factory/judges.py`)
139
+ - **Embeddings**: Semantic search & deduplication (`src/services/embeddings.py`)
140
+ - **Statistical Analyzer**: Modal sandbox execution (`src/services/statistical_analyzer.py`)
141
+ - **Middleware**: State management, budget tracking, workflow coordination (`src/middleware/`)
142
+ - **MCP Tools**: Claude Desktop integration (`src/mcp_tools.py`)
143
+ - **Gradio UI**: Web interface with MCP server and streaming (`src/app.py`)
144
+
145
+ ## Research Team & Parallel Execution
146
+
147
+ The system supports complex research workflows through:
148
+
149
+ 1. **WorkflowManager**: Coordinates multiple parallel research loops
150
+ - Creates and tracks `ResearchLoop` instances
151
+ - Runs loops in parallel via `asyncio.gather()`
152
+ - Synchronizes evidence to global state
153
+ - Handles loop failures gracefully
154
+
155
+ 2. **Deep Research Pattern**: Breaks complex queries into sections
156
+ - Planner creates report outline with sections
157
+ - Each section runs as independent iterative research loop
158
+ - Loops execute in parallel
159
+ - Evidence shared across loops via global state
160
+ - Final synthesis combines all section results
161
+
162
+ 3. **State Synchronization**: Thread-safe evidence sharing
163
+ - Evidence deduplication by URL
164
+ - Global state accessible to all loops
165
+ - Semantic search across all collected evidence
166
+ - Conversation history tracking per iteration
167
+
168
+ ## Configuration & Modes
169
+
170
+ - **Orchestrator Factory** (`src/orchestrator_factory.py`):
171
+ - Auto-detects mode: "advanced" if OpenAI key available, else "simple"
172
+ - Supports explicit mode selection: "simple", "magentic", "advanced"
173
+ - Lazy imports for optional dependencies
174
+
175
+ - **Research Modes**:
176
+ - `iterative`: Single research loop
177
+ - `deep`: Multi-section parallel research
178
+ - `auto`: Auto-detect based on query complexity
179
+
180
+ - **Execution Modes**:
181
+ - `use_graph=True`: Graph-based execution (parallel, conditional routing)
182
+ - `use_graph=False`: Agent chains (sequential, backward compatible)
183
+
184
+
185
+
186
+
187
+
188
+
189
+
190
+
191
+
192
+
193
+
194
+
195
+
196
+
docs/overview/features.md ADDED
@@ -0,0 +1,148 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Features
2
+
3
+ DeepCritical provides a comprehensive set of features for AI-assisted research:
4
+
5
+ ## Core Features
6
+
7
+ ### Multi-Source Search
8
+
9
+ - **PubMed**: Search peer-reviewed biomedical literature via NCBI E-utilities
10
+ - **ClinicalTrials.gov**: Search interventional clinical trials
11
+ - **Europe PMC**: Search preprints and peer-reviewed articles (includes bioRxiv/medRxiv)
12
+ - **RAG**: Semantic search within collected evidence using LlamaIndex
13
+
14
+ ### MCP Integration
15
+
16
+ - **Model Context Protocol**: Expose search tools via MCP server
17
+ - **Claude Desktop**: Use DeepCritical tools directly from Claude Desktop
18
+ - **MCP Clients**: Compatible with any MCP-compatible client
19
+
20
+ ### Authentication
21
+
22
+ - **HuggingFace OAuth**: Sign in with HuggingFace account for automatic API token usage
23
+ - **Manual API Keys**: Support for OpenAI, Anthropic, and HuggingFace API keys
24
+ - **Free Tier Support**: Automatic fallback to HuggingFace Inference API
25
+
26
+ ### Secure Code Execution
27
+
28
+ - **Modal Sandbox**: Secure execution of AI-generated statistical code
29
+ - **Isolated Environment**: Network isolation and package version pinning
30
+ - **Safe Execution**: Prevents malicious code execution
31
+
32
+ ### Semantic Search & RAG
33
+
34
+ - **LlamaIndex Integration**: Advanced RAG capabilities
35
+ - **Vector Storage**: ChromaDB for embedding storage
36
+ - **Semantic Deduplication**: Automatic detection of similar evidence
37
+ - **Embedding Service**: Local sentence-transformers (no API key required)
38
+
39
+ ### Orchestration Patterns
40
+
41
+ - **Graph-Based Execution**: Flexible graph orchestration with conditional routing
42
+ - **Parallel Research Loops**: Run multiple research tasks concurrently
43
+ - **Iterative Research**: Single-loop research with search-judge-synthesize cycles
44
+ - **Deep Research**: Multi-section parallel research with planning and synthesis
45
+ - **Magentic Orchestration**: Multi-agent coordination using Microsoft Agent Framework
46
+
47
+ ### Real-Time Streaming
48
+
49
+ - **Event Streaming**: Real-time updates via `AsyncGenerator[AgentEvent]`
50
+ - **Progress Tracking**: Monitor research progress with detailed event metadata
51
+ - **UI Integration**: Seamless integration with Gradio chat interface
52
+
53
+ ### Budget Management
54
+
55
+ - **Token Budget**: Track and limit LLM token usage
56
+ - **Time Budget**: Enforce time limits per research loop
57
+ - **Iteration Budget**: Limit maximum iterations
58
+ - **Per-Loop Budgets**: Independent budgets for parallel research loops
59
+
60
+ ### State Management
61
+
62
+ - **Thread-Safe Isolation**: ContextVar-based state management
63
+ - **Evidence Deduplication**: Automatic URL-based deduplication
64
+ - **Conversation History**: Track iteration history and agent interactions
65
+ - **State Synchronization**: Share evidence across parallel loops
66
+
67
+ ## Advanced Features
68
+
69
+ ### Agent System
70
+
71
+ - **Pydantic AI Agents**: Type-safe agent implementation
72
+ - **Structured Output**: Pydantic models for agent responses
73
+ - **Agent Factory**: Centralized agent creation with fallback support
74
+ - **Specialized Agents**: Knowledge gap, tool selector, writer, proofreader, and more
75
+
76
+ ### Search Tools
77
+
78
+ - **Rate Limiting**: Built-in rate limiting for external APIs
79
+ - **Retry Logic**: Automatic retry with exponential backoff
80
+ - **Query Preprocessing**: Automatic query enhancement and synonym expansion
81
+ - **Evidence Conversion**: Automatic conversion to structured Evidence objects
82
+
83
+ ### Error Handling
84
+
85
+ - **Custom Exceptions**: Hierarchical exception system
86
+ - **Error Chaining**: Preserve exception context
87
+ - **Structured Logging**: Comprehensive logging with structlog
88
+ - **Graceful Degradation**: Fallback handlers for missing dependencies
89
+
90
+ ### Configuration
91
+
92
+ - **Pydantic Settings**: Type-safe configuration management
93
+ - **Environment Variables**: Support for `.env` files
94
+ - **Validation**: Automatic configuration validation
95
+ - **Flexible Providers**: Support for multiple LLM and embedding providers
96
+
97
+ ### Testing
98
+
99
+ - **Unit Tests**: Comprehensive unit test coverage
100
+ - **Integration Tests**: Real API integration tests
101
+ - **Mock Support**: Extensive mocking utilities
102
+ - **Coverage Reports**: Code coverage tracking
103
+
104
+ ## UI Features
105
+
106
+ ### Gradio Interface
107
+
108
+ - **Real-Time Chat**: Interactive chat interface
109
+ - **Streaming Updates**: Live progress updates
110
+ - **Accordion UI**: Organized display of pending/done operations
111
+ - **OAuth Integration**: Seamless HuggingFace authentication
112
+
113
+ ### MCP Server
114
+
115
+ - **RESTful API**: HTTP-based MCP server
116
+ - **Tool Discovery**: Automatic tool registration
117
+ - **Request Handling**: Async request processing
118
+ - **Error Responses**: Structured error responses
119
+
120
+ ## Development Features
121
+
122
+ ### Code Quality
123
+
124
+ - **Type Safety**: Full type hints with mypy strict mode
125
+ - **Linting**: Ruff for code quality
126
+ - **Formatting**: Automatic code formatting
127
+ - **Pre-commit Hooks**: Automated quality checks
128
+
129
+ ### Documentation
130
+
131
+ - **Comprehensive Docs**: Detailed documentation for all components
132
+ - **Code Examples**: Extensive code examples
133
+ - **Architecture Diagrams**: Visual architecture documentation
134
+ - **API Reference**: Complete API documentation
135
+
136
+
137
+
138
+
139
+
140
+
141
+
142
+
143
+
144
+
145
+
146
+
147
+
148
+
docs/team.md ADDED
@@ -0,0 +1,44 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Team
2
+
3
+ DeepCritical is developed by a team of researchers and developers working on AI-assisted research.
4
+
5
+ ## Team Members
6
+
7
+ ### The-Obstacle-Is-The-Way
8
+
9
+ - GitHub: [The-Obstacle-Is-The-Way](https://github.com/The-Obstacle-Is-The-Way)
10
+
11
+ ### MarioAderman
12
+
13
+ - GitHub: [MarioAderman](https://github.com/MarioAderman)
14
+
15
+ ### Josephrp
16
+
17
+ - GitHub: [Josephrp](https://github.com/Josephrp)
18
+
19
+ ## About
20
+
21
+ The DeepCritical team met online in the Alzheimer's Critical Literature Review Group of the Hugging Science initiative. We're building the agent framework we want to use for AI-assisted research, helping turn vast amounts of clinical data into cures.
22
+
23
+ ## Contributing
24
+
25
+ We welcome contributions! See the [Contributing Guide](contributing/index.md) for details.
26
+
27
+ ## Links
28
+
29
+ - [GitHub Repository](https://github.com/DeepCritical/GradioDemo)
30
+ - [HuggingFace Space](https://huggingface.co/spaces/DataQuests/DeepCritical)
31
+
32
+
33
+
34
+
35
+
36
+
37
+
38
+
39
+
40
+
41
+
42
+
43
+
44
+
src/app.py CHANGED
@@ -5,12 +5,24 @@ from collections.abc import AsyncGenerator
5
  from typing import Any
6
 
7
  import gradio as gr
8
- from pydantic_ai.models.anthropic import AnthropicModel
9
- from pydantic_ai.models.huggingface import HuggingFaceModel
10
- from pydantic_ai.models.openai import OpenAIChatModel as OpenAIModel
11
- from pydantic_ai.providers.anthropic import AnthropicProvider
12
- from pydantic_ai.providers.huggingface import HuggingFaceProvider
13
- from pydantic_ai.providers.openai import OpenAIProvider
 
 
 
 
 
 
 
 
 
 
 
 
14
 
15
  from src.agent_factory.judges import HFInferenceJudgeHandler, JudgeHandler, MockJudgeHandler
16
  from src.orchestrator_factory import create_orchestrator
@@ -19,14 +31,15 @@ from src.tools.europepmc import EuropePMCTool
19
  from src.tools.pubmed import PubMedTool
20
  from src.tools.search_handler import SearchHandler
21
  from src.utils.config import settings
22
- from src.utils.models import OrchestratorConfig
23
 
24
 
25
  def configure_orchestrator(
26
  use_mock: bool = False,
27
  mode: str = "simple",
28
- user_api_key: str | None = None,
29
- api_provider: str = "huggingface",
 
30
  ) -> tuple[Any, str]:
31
  """
32
  Create an orchestrator instance.
@@ -34,8 +47,9 @@ def configure_orchestrator(
34
  Args:
35
  use_mock: If True, use MockJudgeHandler (no API key needed)
36
  mode: Orchestrator mode ("simple" or "advanced")
37
- user_api_key: Optional user-provided API key (BYOK)
38
- api_provider: API provider ("huggingface", "openai", or "anthropic")
 
39
 
40
  Returns:
41
  Tuple of (Orchestrator instance, backend_name)
@@ -61,46 +75,52 @@ def configure_orchestrator(
61
  judge_handler = MockJudgeHandler()
62
  backend_info = "Mock (Testing)"
63
 
64
- # 2. API Key (User provided or Env) - HuggingFace, OpenAI, or Anthropic
65
- elif (
66
- user_api_key
67
- or (
68
- api_provider == "huggingface"
69
- and (os.getenv("HF_TOKEN") or os.getenv("HUGGINGFACE_API_KEY"))
 
 
 
 
 
 
 
 
 
70
  )
71
- or (api_provider == "openai" and os.getenv("OPENAI_API_KEY"))
72
- or (api_provider == "anthropic" and os.getenv("ANTHROPIC_API_KEY"))
73
- ):
74
- model: AnthropicModel | HuggingFaceModel | OpenAIModel | None = None
75
- if user_api_key:
76
- # Validate key/provider match to prevent silent auth failures
77
- if api_provider == "openai" and user_api_key.startswith("sk-ant-"):
78
- raise ValueError("Anthropic key provided but OpenAI provider selected")
79
- is_openai_key = user_api_key.startswith("sk-") and not user_api_key.startswith(
80
- "sk-ant-"
81
  )
82
- if api_provider == "anthropic" and is_openai_key:
83
- raise ValueError("OpenAI key provided but Anthropic provider selected")
84
- if api_provider == "huggingface":
85
- model_name = settings.huggingface_model or "meta-llama/Llama-3.1-8B-Instruct"
86
- hf_provider = HuggingFaceProvider(api_key=user_api_key)
87
- model = HuggingFaceModel(model_name, provider=hf_provider)
88
- elif api_provider == "anthropic":
89
- anthropic_provider = AnthropicProvider(api_key=user_api_key)
90
- model = AnthropicModel(settings.anthropic_model, provider=anthropic_provider)
91
- elif api_provider == "openai":
92
- openai_provider = OpenAIProvider(api_key=user_api_key)
93
- model = OpenAIModel(settings.openai_model, provider=openai_provider)
94
- backend_info = f"API ({api_provider.upper()})"
95
- else:
96
- backend_info = "API (Env Config)"
97
 
98
  judge_handler = JudgeHandler(model=model)
99
 
100
- # 3. Free Tier (HuggingFace Inference)
101
  else:
102
- judge_handler = HFInferenceJudgeHandler()
103
- backend_info = "Free Tier (Llama 3.1 / Mistral)"
 
 
 
 
 
 
 
 
 
104
 
105
  orchestrator = create_orchestrator(
106
  search_handler=search_handler,
@@ -112,13 +132,289 @@ def configure_orchestrator(
112
  return orchestrator, backend_info
113
 
114
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 async def research_agent(
     message: str,
     history: list[dict[str, Any]],
     mode: str = "simple",
-     api_key: str = "",
-     api_provider: str = "huggingface",
- ) -> AsyncGenerator[str, None]:
     """
     Gradio chat function that runs the research agent.

@@ -126,142 +422,205 @@ async def research_agent(
         message: User's research question
         history: Chat history (Gradio format)
         mode: Orchestrator mode ("simple" or "advanced")
-         api_key: Optional user-provided API key (BYOK - Bring Your Own Key)
-         api_provider: API provider ("huggingface", "openai", or "anthropic")

     Yields:
-         Markdown-formatted responses for streaming
     """
     if not message.strip():
-         yield "Please enter a research question."
         return

-     # Clean user-provided API key
-     user_api_key = api_key.strip() if api_key else None
-
-     # Check available keys
-     has_huggingface = bool(os.getenv("HF_TOKEN") or os.getenv("HUGGINGFACE_API_KEY"))
-     has_openai = bool(os.getenv("OPENAI_API_KEY"))
-     has_anthropic = bool(os.getenv("ANTHROPIC_API_KEY"))
-     has_user_key = bool(user_api_key)
-     has_paid_key = has_openai or has_anthropic or has_user_key
-
-     # Advanced mode requires OpenAI specifically (due to agent-framework binding)
-     if mode == "advanced" and not (has_openai or (has_user_key and api_provider == "openai")):
-         yield (
-             "⚠️ **Warning**: Advanced mode currently requires OpenAI API key. "
-             "Falling back to simple mode.\n\n"
-         )
-         mode = "simple"

-     # Inform user about their key being used
-     if has_user_key:
-         yield (
-             f"🔑 **Using your {api_provider.upper()} API key** - "
-             "Your key is used only for this session and is never stored.\n\n"
-         )
-     elif not has_paid_key and not has_huggingface:
-         # No keys at all - will use FREE HuggingFace Inference (public models)
-         yield (
-             "🤗 **Free Tier**: Using HuggingFace Inference (Llama 3.1 / Mistral) for AI analysis.\n"
-             "For premium models or higher rate limits, enter a HuggingFace, OpenAI, or Anthropic API key below.\n\n"
-         )

-     # Run the agent and stream events
-     response_parts: list[str] = []

     try:
         # use_mock=False - let configure_orchestrator decide based on available keys
-         # It will use: Paid API > HF Inference (free tier)
         orchestrator, backend_name = configure_orchestrator(
             use_mock=False,  # Never use mock in production - HF Inference is the free fallback
-             mode=mode,
-             user_api_key=user_api_key,
-             api_provider=api_provider,
         )

-         yield f"🧠 **Backend**: {backend_name}\n\n"
-
-         async for event in orchestrator.run(message):
-             # Format event as markdown
-             event_md = event.to_markdown()
-             response_parts.append(event_md)

-             # If complete, show full response
-             if event.type == "complete":
-                 yield event.message
-             else:
-                 # Show progress
-                 yield "\n\n".join(response_parts)

     except Exception as e:
-         yield f"❌ **Error**: {e!s}"
-
-
- def create_demo() -> gr.ChatInterface:
     """
-     Create the Gradio demo interface with MCP support.

     Returns:
-         Configured Gradio Blocks interface with MCP server enabled
     """
-     # 1. Unwrapped ChatInterface (Fixes Accordion Bug)
-     demo = gr.ChatInterface(
-         fn=research_agent,
-         title="🧬 DeepCritical",
-         description=(
-             "*AI-Powered Drug Repurposing Agent — searches PubMed, "
-             "ClinicalTrials.gov & Europe PMC*\n\n"
-             "---\n"
-             "*Research tool only — not for medical advice.* \n"
-             "**MCP Server Active**: Connect Claude Desktop to `/gradio_api/mcp/`"
-         ),
-         examples=[
-             [
-                 "What drugs could be repurposed for Alzheimer's disease?",
-                 "simple",
-                 "",
-                 "openai",
-             ],
-             [
-                 "Is metformin effective for treating cancer?",
-                 "simple",
-                 "",
-                 "openai",
-             ],
-             [
-                 "What medications show promise for Long COVID treatment?",
-                 "simple",
-                 "",
-                 "openai",
-             ],
-         ],
-         additional_inputs_accordion=gr.Accordion(label="⚙️ Settings", open=False),
-         additional_inputs=[
-             gr.Radio(
                 choices=["simple", "advanced"],
                 value="simple",
                 label="Orchestrator Mode",
-                 info=(
-                     "Simple: Linear (Free Tier Friendly) | Advanced: Multi-Agent (Requires OpenAI)"
-                 ),
-             ),
-             gr.Textbox(
-                 label="🔑 API Key (Optional - BYOK)",
-                 placeholder="sk-... or sk-ant-...",
-                 type="password",
-                 info="Enter your own API key. Never stored.",
-             ),
-             gr.Radio(
-                 choices=["huggingface", "openai", "anthropic"],
-                 value="huggingface",
-                 label="API Provider",
-                 info="Select the provider for your API key (HuggingFace is default and free)",
             ),
-         ],
-     )

-     return demo


 def main() -> None:
 from typing import Any

 import gradio as gr
+
+ # Try to import HuggingFace support (may not be available in all pydantic-ai versions)
+ # According to https://ai.pydantic.dev/models/huggingface/, HuggingFace support requires
+ # pydantic-ai with the huggingface extra or pydantic-ai-slim[huggingface].
+ # There are two ways to use HuggingFace:
+ # 1. Inference API: HuggingFaceModel with HuggingFaceProvider (uses AsyncInferenceClient internally)
+ # 2. Local models: would use transformers directly (not via pydantic-ai)
+ try:
+     from huggingface_hub import AsyncInferenceClient
+     from pydantic_ai.models.huggingface import HuggingFaceModel
+     from pydantic_ai.providers.huggingface import HuggingFaceProvider
+
+     _HUGGINGFACE_AVAILABLE = True
+ except ImportError:
+     HuggingFaceModel = None  # type: ignore[assignment, misc]
+     HuggingFaceProvider = None  # type: ignore[assignment, misc]
+     AsyncInferenceClient = None  # type: ignore[assignment, misc]
+     _HUGGINGFACE_AVAILABLE = False

 from src.agent_factory.judges import HFInferenceJudgeHandler, JudgeHandler, MockJudgeHandler
 from src.orchestrator_factory import create_orchestrator

 from src.tools.pubmed import PubMedTool
 from src.tools.search_handler import SearchHandler
 from src.utils.config import settings
+ from src.utils.models import AgentEvent, OrchestratorConfig

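The Inference API route described in the comments above follows the pattern documented at ai.pydantic.dev. Here is a minimal standalone sketch (not part of the commit), assuming the `huggingface` extra is installed and `HF_TOKEN` is set; the model ID and prompt are placeholders, not values the app relies on:

```python
import os

from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.providers.huggingface import HuggingFaceProvider

# Wire a pydantic-ai Agent to the HuggingFace Inference API (illustrative only).
provider = HuggingFaceProvider(api_key=os.environ["HF_TOKEN"])
model = HuggingFaceModel("meta-llama/Llama-3.1-8B-Instruct", provider=provider)
agent = Agent(model)

result = agent.run_sync("Summarize what drug repurposing means in one sentence.")
print(result.output)  # older pydantic-ai releases expose this as result.data
```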
 def configure_orchestrator(
     use_mock: bool = False,
     mode: str = "simple",
+     oauth_token: str | None = None,
+     hf_model: str | None = None,
+     hf_provider: str | None = None,
 ) -> tuple[Any, str]:
     """
     Create an orchestrator instance.

     Args:
         use_mock: If True, use MockJudgeHandler (no API key needed)
         mode: Orchestrator mode ("simple" or "advanced")
+         oauth_token: Optional OAuth token from HuggingFace login
+         hf_model: Selected HuggingFace model ID
+         hf_provider: Selected inference provider

     Returns:
         Tuple of (Orchestrator instance, backend_name)

         judge_handler = MockJudgeHandler()
         backend_info = "Mock (Testing)"

+     # 2. API Key (OAuth or Env) - HuggingFace only (OAuth provides an HF token)
+     # Priority: oauth_token > env vars
+     # On HuggingFace Spaces, the OAuth token is available via request.oauth_token
+     effective_api_key = oauth_token or os.getenv("HF_TOKEN") or os.getenv("HUGGINGFACE_API_KEY")
+
+     if effective_api_key:
+         # We have an API key (OAuth or env) - use pydantic-ai with JudgeHandler.
+         # This uses HuggingFace's own inference API, not third-party providers.
+         model: Any | None = None
+         # Use the selected model or fall back to env var/settings
+         model_name = (
+             hf_model
+             or os.getenv("HF_MODEL")
+             or settings.huggingface_model
+             or "Qwen/Qwen3-Next-80B-A3B-Thinking"
         )
+         if not _HUGGINGFACE_AVAILABLE:
+             raise ImportError(
+                 "HuggingFace models are not available in this version of pydantic-ai. "
+                 "Please install with: uv add 'pydantic-ai[huggingface]' or use 'openai'/'anthropic' as the LLM provider."
             )
+         # Inference API - uses the HuggingFace Inference API via AsyncInferenceClient.
+         # Per https://ai.pydantic.dev/models/huggingface/#configure-the-provider,
+         # AsyncInferenceClient accepts a 'token' parameter for the API key.
+         hf_client = AsyncInferenceClient(token=effective_api_key)  # type: ignore[misc]
+         # Pass the client to HuggingFaceProvider for inference API usage
+         provider = HuggingFaceProvider(hf_client=hf_client)  # type: ignore[misc]
+         model = HuggingFaceModel(model_name, provider=provider)  # type: ignore[misc]
+         backend_info = "API (HuggingFace OAuth)" if oauth_token else "API (Env Config)"

         judge_handler = JudgeHandler(model=model)

+     # 3. Free Tier (HuggingFace Inference) - NO API KEY AVAILABLE
     else:
+         # No API key available - use HFInferenceJudgeHandler with public models.
+         # Don't use third-party providers (novita, groq, etc.) as they require their own API keys;
+         # use HuggingFace's own inference API with public/ungated models.
+         # Pass an empty provider to use HuggingFace's default (not third-party providers).
+         judge_handler = HFInferenceJudgeHandler(
+             model_id=hf_model,
+             api_key=None,  # No API key - will use public models only
+             provider=None,  # Don't specify a provider - use HuggingFace's default
+         )
+         model_display = hf_model.split("/")[-1] if hf_model else "Default (Public Models)"
+         backend_info = f"Free Tier ({model_display} - Public Models Only)"

     orchestrator = create_orchestrator(
         search_handler=search_handler,

     return orchestrator, backend_info

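As a rough orientation to the branch logic above, a minimal sketch (not part of the commit) of how the three configuration paths could be exercised; the token and model ID are placeholders, and the asserts simply mirror the `backend_info` strings set above:

```python
# 1. Mock branch: no key needed, used for testing.
orchestrator, backend = configure_orchestrator(use_mock=True)
assert backend == "Mock (Testing)"

# 2. OAuth/env-key branch: requires the pydantic-ai huggingface extra,
#    otherwise this path raises ImportError as coded above.
orchestrator, backend = configure_orchestrator(oauth_token="hf_xxx")  # placeholder token
assert backend == "API (HuggingFace OAuth)"

# 3. Free tier: with no oauth_token and no HF_TOKEN/HUGGINGFACE_API_KEY in the
#    environment, the public-model fallback is selected.
orchestrator, backend = configure_orchestrator(hf_model="meta-llama/Llama-3.1-8B-Instruct")
assert backend.startswith("Free Tier")
```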
+ def event_to_chat_message(event: AgentEvent) -> dict[str, Any]:
+     """
+     Convert an AgentEvent into a gr.ChatMessage-style dict with metadata for accordion display.
+
+     Args:
+         event: The AgentEvent to convert
+
+     Returns:
+         ChatMessage dict with metadata for a collapsible accordion
+     """
+     # Map event types to accordion titles and determine whether they are pending
+     event_configs: dict[str, dict[str, Any]] = {
+         "started": {"title": "🚀 Starting Research", "status": "done", "icon": "🚀"},
+         "searching": {"title": "🔍 Searching Literature", "status": "pending", "icon": "🔍"},
+         "search_complete": {"title": "📚 Search Results", "status": "done", "icon": "📚"},
+         "judging": {"title": "🧠 Evaluating Evidence", "status": "pending", "icon": "🧠"},
+         "judge_complete": {"title": "✅ Evidence Assessment", "status": "done", "icon": "✅"},
+         "looping": {"title": "🔄 Research Iteration", "status": "pending", "icon": "🔄"},
+         "synthesizing": {"title": "📝 Synthesizing Report", "status": "pending", "icon": "📝"},
+         "hypothesizing": {"title": "🔬 Generating Hypothesis", "status": "pending", "icon": "🔬"},
+         "analyzing": {"title": "📊 Statistical Analysis", "status": "pending", "icon": "📊"},
+         "analysis_complete": {"title": "📈 Analysis Results", "status": "done", "icon": "📈"},
+         "streaming": {"title": "📡 Processing", "status": "pending", "icon": "📡"},
+         "complete": {"title": None, "status": "done", "icon": "🎉"},  # Main response, no accordion
+         "error": {"title": "❌ Error", "status": "done", "icon": "❌"},
+     }
+
+     config = event_configs.get(
+         event.type, {"title": f"• {event.type}", "status": "done", "icon": "•"}
+     )
+
+     # For complete events, return the main response without an accordion
+     if event.type == "complete":
+         # Return as a dict for Gradio Chatbot compatibility
+         return {
+             "role": "assistant",
+             "content": event.message,
+         }
+
+     # Build metadata for the accordion according to the Gradio ChatMessage spec.
+     # Valid metadata keys: title (str), status ("pending"|"done"), log (str), duration (float).
+     # See: https://www.gradio.app/guides/agents-and-tool-usage
+     metadata: dict[str, Any] = {}
+
+     # Title is required for accordion display - must be a string
+     if config["title"]:
+         metadata["title"] = str(config["title"])
+
+     # Set status (pending shows a spinner, done is collapsed).
+     # Must be exactly "pending" or "done" per the Gradio spec.
+     if config["status"] == "pending":
+         metadata["status"] = "pending"
+     elif config["status"] == "done":
+         metadata["status"] = "done"
+
+     # Add duration if available in data (must be a float)
+     if event.data and isinstance(event.data, dict) and "duration" in event.data:
+         duration = event.data["duration"]
+         if isinstance(duration, int | float):
+             metadata["duration"] = float(duration)
+
+     # Add log info (iteration number, tool, result count) - must be a string
+     log_parts: list[str] = []
+     if event.iteration > 0:
+         log_parts.append(f"Iteration {event.iteration}")
+     if event.data and isinstance(event.data, dict):
+         if "tool" in event.data:
+             log_parts.append(f"Tool: {event.data['tool']}")
+         if "results_count" in event.data:
+             log_parts.append(f"Results: {event.data['results_count']}")
+     if log_parts:
+         metadata["log"] = " | ".join(log_parts)
+
+     # Per https://www.gradio.app/guides/agents-and-tool-usage, the ChatMessage format is
+     # {"role": "assistant", "content": "...", "metadata": {...}}, and metadata must have a
+     # "title" key for the accordion to render.
+     result: dict[str, Any] = {
+         "role": "assistant",
+         "content": event.message,
+     }
+     # Only attach metadata if it has a title, and make sure values match Gradio's expected types
+     if metadata and metadata.get("title"):
+         # Ensure status is valid if present
+         if "status" in metadata:
+             status = metadata["status"]
+             if status not in ("pending", "done"):
+                 metadata["status"] = "done"  # Default to "done" if invalid
+         result["metadata"] = metadata
+     return result
+
+
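For reference, the dict this helper produces for a pending search event would look roughly like the following (values are illustrative, not taken from a real run):

```python
# Shape of event_to_chat_message() output for a "searching" event on iteration 2
# whose data carries a tool name (all values made up for illustration).
msg = {
    "role": "assistant",
    "content": "Searching PubMed for 'metformin cancer'...",
    "metadata": {
        "title": "🔍 Searching Literature",
        "status": "pending",              # renders a spinner in the Gradio Chatbot
        "log": "Iteration 2 | Tool: pubmed",
    },
}
```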
+ def extract_oauth_info(request: gr.Request | None) -> tuple[str | None, str | None]:
+     """
+     Extract the OAuth token and username from a Gradio request.
+
+     Args:
+         request: Gradio request object containing OAuth information
+
+     Returns:
+         Tuple of (oauth_token, oauth_username)
+     """
+     oauth_token: str | None = None
+     oauth_username: str | None = None
+
+     if request is None:
+         return oauth_token, oauth_username
+
+     # Try multiple ways to access the OAuth token (the Gradio API may vary)
+     # Pattern 1: request.oauth_token.token
+     if hasattr(request, "oauth_token") and request.oauth_token is not None:
+         if hasattr(request.oauth_token, "token"):
+             oauth_token = request.oauth_token.token
+         elif isinstance(request.oauth_token, str):
+             oauth_token = request.oauth_token
+     # Pattern 2: request.headers (fallback)
+     elif hasattr(request, "headers"):
+         # The OAuth token might be in the headers
+         auth_header = request.headers.get("authorization") or request.headers.get("Authorization")
+         if auth_header and auth_header.startswith("Bearer "):
+             oauth_token = auth_header.replace("Bearer ", "")
+
+     # Access the username from the request
+     if hasattr(request, "username") and request.username:
+         oauth_username = request.username
+     # Also try accessing it via oauth_profile if available
+     elif hasattr(request, "oauth_profile") and request.oauth_profile is not None:
+         if hasattr(request.oauth_profile, "username"):
+             oauth_username = request.oauth_profile.username
+         elif hasattr(request.oauth_profile, "name"):
+             oauth_username = request.oauth_profile.name
+
+     return oauth_token, oauth_username
+
+
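A hypothetical usage sketch: Gradio injects a `gr.Request` into any event handler that declares a parameter of that type, which is where `extract_oauth_info()` would be called. The `whoami` handler and component names below are made up for illustration:

```python
import gradio as gr

def whoami(request: gr.Request) -> str:
    # Pull OAuth details (if any) out of the incoming request.
    token, username = extract_oauth_info(request)
    if token is None:
        return "Not signed in."
    return f"Signed in as {username or 'unknown user'}."

with gr.Blocks() as login_probe:
    status = gr.Textbox(label="Login status")
    gr.Button("Check login").click(whoami, inputs=None, outputs=status)
```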
+ async def yield_auth_messages(
+     oauth_username: str | None,
+     oauth_token: str | None,
+     has_huggingface: bool,
+     mode: str,
+ ) -> AsyncGenerator[dict[str, Any], None]:
+     """
+     Yield authentication and mode status messages.
+
+     Args:
+         oauth_username: OAuth username if available
+         oauth_token: OAuth token if available
+         has_huggingface: Whether HuggingFace credentials are available
+         mode: Orchestrator mode
+
+     Yields:
+         ChatMessage dicts with authentication status
+     """
+     # Show a user greeting if logged in via OAuth
+     if oauth_username:
+         yield {
+             "role": "assistant",
+             "content": f"👋 **Welcome, {oauth_username}!** Using your HuggingFace account.\n\n",
+         }
+
+     # Advanced mode is not supported without OpenAI (which requires manual setup);
+     # for now, only simple mode with HuggingFace is supported.
+     if mode == "advanced":
+         yield {
+             "role": "assistant",
+             "content": (
+                 "⚠️ **Warning**: Advanced mode requires OpenAI API key configuration. "
+                 "Falling back to simple mode.\n\n"
+             ),
+         }
+
+     # Inform the user about their authentication status
+     if oauth_token:
+         yield {
+             "role": "assistant",
+             "content": (
+                 "🔐 **Using HuggingFace OAuth token** - "
+                 "Authenticated via your HuggingFace account.\n\n"
+             ),
+         }
+     elif not has_huggingface:
+         # No keys at all - will use the FREE HuggingFace Inference tier (public models)
+         yield {
+             "role": "assistant",
+             "content": (
+                 "🤗 **Free Tier**: Using HuggingFace Inference (Llama 3.1 / Mistral) for AI analysis.\n"
+                 "For premium models or higher rate limits, sign in with HuggingFace above.\n\n"
+             ),
+         }
+
+
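Illustratively, a logged-in OAuth user who selected advanced mode would see three status messages before the run starts; a small sketch (username and token are placeholders) with the expected first lines:

```python
import asyncio

async def show_status() -> None:
    async for msg in yield_auth_messages("jane-doe", "hf_xxx", has_huggingface=True, mode="advanced"):
        print(msg["content"].splitlines()[0])

asyncio.run(show_status())
# 👋 **Welcome, jane-doe!** Using your HuggingFace account.
# ⚠️ **Warning**: Advanced mode requires OpenAI API key configuration. Falling back to simple mode.
# 🔐 **Using HuggingFace OAuth token** - Authenticated via your HuggingFace account.
```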
+ async def handle_orchestrator_events(
+     orchestrator: Any,
+     message: str,
+ ) -> AsyncGenerator[dict[str, Any], None]:
+     """
+     Handle orchestrator events and yield ChatMessages.
+
+     Args:
+         orchestrator: The orchestrator instance
+         message: The research question
+
+     Yields:
+         ChatMessage dicts built from orchestrator events
+     """
+     # Track pending accordions for real-time updates
+     pending_accordions: dict[str, str] = {}  # title -> accumulated content
+
+     async for event in orchestrator.run(message):
+         # Convert the event to a ChatMessage dict with metadata
+         # (event_to_chat_message always returns a dict)
+         chat_msg = event_to_chat_message(event)
+
+         # Handle complete events (main response)
+         if event.type == "complete":
+             # Close any pending accordions first
+             if pending_accordions:
+                 for title, content in pending_accordions.items():
+                     yield {
+                         "role": "assistant",
+                         "content": content.strip(),
+                         "metadata": {"title": title, "status": "done"},
+                     }
+                 pending_accordions.clear()
+
+             # Yield the final response (no accordion for the main response)
+             yield chat_msg
+             continue
+
+         # Handle events with metadata (accordions)
+         metadata: dict[str, Any] = chat_msg.get("metadata", {})
+         if metadata:
+             msg_title: str | None = metadata.get("title")
+             msg_status: str | None = metadata.get("status")
+
+             if msg_title:
+                 # For pending operations, accumulate content and show a spinner
+                 if msg_status == "pending":
+                     if msg_title not in pending_accordions:
+                         pending_accordions[msg_title] = ""
+                     content = chat_msg.get("content", "")
+                     pending_accordions[msg_title] += content + "\n"
+                     # Yield the updated accordion with accumulated content
+                     yield {
+                         "role": "assistant",
+                         "content": pending_accordions[msg_title].strip(),
+                         "metadata": chat_msg.get("metadata", {}),
+                     }
+                 elif msg_title in pending_accordions:
+                     # Combine the pending content with the final content
+                     content = chat_msg.get("content", "")
+                     final_content = pending_accordions[msg_title] + content
+                     del pending_accordions[msg_title]
+                     yield {
+                         "role": "assistant",
+                         "content": final_content.strip(),
+                         "metadata": {"title": msg_title, "status": "done"},
+                     }
+                 else:
+                     # New done accordion (no pending state)
+                     yield chat_msg
+             else:
+                 # No title, yield as-is
+                 yield chat_msg
+         else:
+             # No metadata, yield as a plain message
+             yield chat_msg
+
+
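To see the accordion bookkeeping in action without a real orchestrator, here is a hedged sketch using a stand-in event class that carries only the attributes the helpers above read (`type`, `message`, `iteration`, `data`); the real `AgentEvent` lives in `src.utils.models` and may carry more fields:

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any

@dataclass
class FakeEvent:
    # Minimal stand-in for AgentEvent, for demonstration only.
    type: str
    message: str
    iteration: int = 0
    data: dict[str, Any] = field(default_factory=dict)

class FakeOrchestrator:
    async def run(self, message: str):
        yield FakeEvent("searching", "Querying PubMed...", iteration=1, data={"tool": "pubmed"})
        yield FakeEvent("search_complete", "Found 12 records.", iteration=1)
        yield FakeEvent("complete", "## Report\nFinal synthesized answer goes here.")

async def demo_events() -> None:
    async for msg in handle_orchestrator_events(FakeOrchestrator(), "metformin and cancer"):
        title = msg.get("metadata", {}).get("title")
        print(msg["role"], title, "->", msg["content"][:40])

asyncio.run(demo_events())
```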
 async def research_agent(
     message: str,
     history: list[dict[str, Any]],
     mode: str = "simple",
+     hf_model: str | None = None,
+     hf_provider: str | None = None,
+     oauth_token: gr.OAuthToken | None = None,
+     oauth_profile: gr.OAuthProfile | None = None,
+ ) -> AsyncGenerator[dict[str, Any] | list[dict[str, Any]], None]:
     """
     Gradio chat function that runs the research agent.

         message: User's research question
         history: Chat history (Gradio format)
         mode: Orchestrator mode ("simple" or "advanced")
+         hf_model: Selected HuggingFace model ID (from dropdown)
+         hf_provider: Selected inference provider (from dropdown)
+         oauth_token: Gradio OAuth token (None if user not logged in)
+         oauth_profile: Gradio OAuth profile (None if user not logged in)

     Yields:
+         ChatMessage dicts with metadata for accordion display
     """
+     # REQUIRE LOGIN BEFORE USE
+     # Extract the OAuth token and username using Gradio's OAuth types.
+     # Per the Gradio docs, OAuthToken and OAuthProfile are None if the user is not logged in.
+     token_value: str | None = None
+     username: str | None = None
+
+     if oauth_token is not None:
+         # OAuthToken has a .token attribute containing the access token
+         token_value = oauth_token.token if hasattr(oauth_token, "token") else None
+
+     if oauth_profile is not None:
+         # OAuthProfile has .username, .name, .profile_image attributes
+         username = (
+             oauth_profile.username
+             if hasattr(oauth_profile, "username") and oauth_profile.username
+             else (oauth_profile.name if hasattr(oauth_profile, "name") and oauth_profile.name else None)
+         )
+
+     # Check whether the user is logged in (OAuth token or env var);
+     # fall back to env vars for local development or Spaces with an HF_TOKEN secret.
+     has_authentication = bool(
+         token_value
+         or os.getenv("HF_TOKEN")
+         or os.getenv("HUGGINGFACE_API_KEY")
+     )
+
+     if not has_authentication:
+         yield {
+             "role": "assistant",
+             "content": (
+                 "🔐 **Authentication Required**\n\n"
+                 "Please **sign in with HuggingFace** using the login button at the top of the page "
+                 "before using this application.\n\n"
+                 "The login button is required to access the AI models and research tools."
+             ),
+         }
+         return
+
     if not message.strip():
+         yield {
+             "role": "assistant",
+             "content": "Please enter a research question.",
+         }
         return

+     # Check available keys (use token_value instead of oauth_token)
+     has_huggingface = bool(os.getenv("HF_TOKEN") or os.getenv("HUGGINGFACE_API_KEY") or token_value)

+     # Adjust the mode if needed
+     effective_mode = mode
+     if mode == "advanced":
+         effective_mode = "simple"

+     # Yield authentication and mode status messages
+     async for msg in yield_auth_messages(username, token_value, has_huggingface, mode):
+         yield msg

+     # Run the agent and stream events
     try:
         # use_mock=False - let configure_orchestrator decide based on available keys
+         # It will use: OAuth token > env vars > HF Inference (free tier).
+         # Convert empty strings from the Textbox components to None so defaults apply.
+         model_id = hf_model if hf_model and hf_model.strip() else None
+         provider_name = hf_provider if hf_provider and hf_provider.strip() else None
+
         orchestrator, backend_name = configure_orchestrator(
             use_mock=False,  # Never use mock in production - HF Inference is the free fallback
+             mode=effective_mode,
+             oauth_token=token_value,  # Use the extracted token value
+             hf_model=model_id,  # None will use defaults in configure_orchestrator
+             hf_provider=provider_name,  # None will use defaults in configure_orchestrator
         )

+         yield {
+             "role": "assistant",
+             "content": f"🧠 **Backend**: {backend_name}\n\n",
+         }

+         # Handle orchestrator events
+         async for msg in handle_orchestrator_events(orchestrator, message):
+             yield msg

     except Exception as e:
+         # Return the error message without metadata to avoid issues during example caching
+         # (metadata can cause validation errors when Gradio caches examples).
+         # Gradio Chatbot requires plain text here - strip markdown and special characters.
+         error_msg = str(e).replace("**", "").replace("*", "").replace("`", "")
+         # Ensure the content is a simple string without any special formatting
+         yield {
+             "role": "assistant",
+             "content": f"Error: {error_msg}. Please check your configuration and try again.",
+         }
+
+
+ def create_demo() -> gr.Blocks:
528
  """
529
+ Create the Gradio demo interface with MCP support and OAuth login.
530
 
531
  Returns:
532
+ Configured Gradio Blocks interface with MCP server and OAuth enabled
533
  """
534
+ with gr.Blocks(title="🧬 DeepCritical") as demo:
535
+ # Add login button at the top in a visible Row container
536
+ # LoginButton must be visible and properly configured for OAuth to work
537
+ # Using a Row with scale ensures the button is displayed prominently at the top
538
+ with gr.Row(equal_height=False):
539
+ with gr.Column(scale=1, min_width=200):
540
+ login_btn = gr.LoginButton(
541
+ value="Sign in with Hugging Face",
542
+ variant="huggingface",
543
+ size="lg",
544
+ )
545
+
546
+ # Create settings components (hidden - used only for additional_inputs)
547
+ # Model/provider selection removed to avoid dropdown value mismatch errors
548
+ # Settings will use defaults from configure_orchestrator
549
+ with gr.Row(visible=False):
550
+ mode_radio = gr.Radio(
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
551
  choices=["simple", "advanced"],
552
  value="simple",
553
  label="Orchestrator Mode",
554
+ info="Simple: Linear | Advanced: Multi-Agent (Requires OpenAI)",
555
+ )
556
+
557
+ # Hidden text components for model/provider (not dropdowns to avoid value mismatch)
558
+ # These will be empty by default and use defaults in configure_orchestrator
559
+ hf_model_dropdown = gr.Textbox(
560
+ value="", # Empty string - will be converted to None in research_agent
561
+ label="🤖 Reasoning Model",
562
+ visible=False, # Hidden from UI
563
+ )
564
+
565
+ hf_provider_dropdown = gr.Textbox(
566
+ value="", # Empty string - will be converted to None in research_agent
567
+ label=" Inference Provider",
568
+ visible=False, # Hidden from UI
569
+ )
570
+
571
+ # Chat interface with model/provider selection
572
+ # Examples are provided but will NOT run at startup (cache_examples=False)
573
+ # Users must log in first before using examples or submitting queries
574
+ gr.ChatInterface(
575
+ fn=research_agent,
576
+ title="🧬 DeepCritical",
577
+ description=(
578
+ "*AI-Powered Drug Repurposing Agent — searches PubMed, "
579
+ "ClinicalTrials.gov & Europe PMC*\n\n"
580
+ "---\n"
581
+ "*Research tool only — not for medical advice.* \n"
582
+ "**MCP Server Active**: Connect Claude Desktop to `/gradio_api/mcp/`\n\n"
583
+ "**⚠️ Authentication Required**: Please **sign in with HuggingFace** above before using this application."
584
  ),
585
+ examples=[
586
+ # When additional_inputs are provided, examples must be lists of lists
587
+ # Each inner list: [message, mode, hf_model, hf_provider]
588
+ # Using actual model IDs and provider names from inference_models.py
589
+ # Note: Provider is optional - if empty, HF will auto-select
590
+ # These examples will NOT run at startup - users must click them after logging in
591
+ [
592
+ "What drugs could be repurposed for Alzheimer's disease?",
593
+ "simple",
594
+ "Qwen/Qwen3-Next-80B-A3B-Thinking",
595
+ "",
596
+ ],
597
+ [
598
+ "Is metformin effective for treating cancer?",
599
+ "simple",
600
+ "Qwen/Qwen3-235B-A22B-Instruct-2507",
601
+ "",
602
+ ],
603
+ [
604
+ "What medications show promise for Long COVID treatment?",
605
+ "simple",
606
+ "zai-org/GLM-4.5-Air",
607
+ "nebius",
608
+ ],
609
+
610
+ ],
611
+ cache_examples=False, # CRITICAL: Disable example caching to prevent examples from running at startup
612
+ # Examples will only run when user explicitly clicks them (after login)
613
+ additional_inputs_accordion=gr.Accordion(label="⚙️ Settings", open=True, visible=True),
614
+ additional_inputs=[
615
+ mode_radio,
616
+ hf_model_dropdown,
617
+ hf_provider_dropdown,
618
+ # Note: gr.OAuthToken and gr.OAuthProfile are automatically passed as function parameters
619
+ # when user is logged in - they should NOT be added to additional_inputs
620
+ ],
621
+ )
622
 
623
+ return demo # type: ignore[no-any-return]
624
 
625
 
626
  def main() -> None:
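A sketch of how the demo would typically be served; this mirrors what `main()` presumably does, and the launch arguments (including `mcp_server=True`) assume a Gradio version with built-in MCP support, matching the `/gradio_api/mcp/` endpoint advertised in the description:

```python
if __name__ == "__main__":
    # Build the Blocks app and serve it with the MCP server enabled
    # so Claude Desktop can connect to /gradio_api/mcp/ (illustrative values).
    demo = create_demo()
    demo.launch(mcp_server=True, server_name="0.0.0.0", server_port=7860)
```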
src/middleware/state_machine.py CHANGED
@@ -127,3 +127,7 @@ def get_workflow_state() -> WorkflowState:
         logger.debug("Workflow state not found, auto-initializing")
         return init_workflow_state()
     return state
+
+
+
+
src/tools/crawl_adapter.py CHANGED
@@ -56,3 +56,7 @@ async def crawl_website(starting_url: str) -> str:
     except Exception as e:
         logger.error("Crawl failed", error=str(e), url=starting_url)
         return f"Error crawling website: {e!s}"
+
+
+
+
src/tools/web_search_adapter.py CHANGED
@@ -61,3 +61,7 @@ async def web_search(query: str) -> str:
     except Exception as e:
         logger.error("Web search failed", error=str(e), query=query)
         return f"Error performing web search: {e!s}"
+
+
+
+
tests/unit/middleware/__init__.py CHANGED
@@ -1 +1,15 @@
 """Unit tests for middleware components."""
+
+
+
+
+
+
+
+
+
+
+
+
+
+
tests/unit/middleware/test_budget_tracker_phase7.py CHANGED
@@ -157,3 +157,17 @@ class TestIterationTokenTracking:
     assert budget2 is not None
     assert budget1.iteration_tokens[1] == 100
     assert budget2.iteration_tokens[1] == 200
+
+
+
+
+
+
+
+
+
+
+
+
+
+
tests/unit/middleware/test_state_machine.py CHANGED
@@ -354,3 +354,17 @@ class TestContextVarIsolation:
     assert len(state2.evidence) == 1
     assert state1.evidence[0].citation.url == "https://example.com/1"
     assert state2.evidence[0].citation.url == "https://example.com/2"
+
+
+
+
+
+
+
+
+
+
+
+
+
+
tests/unit/middleware/test_workflow_manager.py CHANGED
@@ -284,3 +284,17 @@ class TestWorkflowManager:

     assert len(shared) == 1
     assert shared[0].content == "Shared"
+
+
+
+
+
+
+
+
+
+
+
+
+
+
tests/unit/orchestrator/__init__.py CHANGED
@@ -1 +1,15 @@
 """Unit tests for orchestrator module."""
+
+
+
+
+
+
+
+
+
+
+
+
+
+