# Tools Architecture
DeepCritical implements a protocol-based search tool system for retrieving evidence from multiple sources.
## SearchTool Protocol
All tools implement the `SearchTool` protocol from `src/tools/base.py`:
```python
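from typing import Protocol

# Evidence is the project's evidence model (see "Evidence Conversion").
class SearchTool(Protocol):
    @property
    def name(self) -> str: ...

    async def search(
        self,
        query: str,
        max_results: int = 10
    ) -> list[Evidence]: ...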
```
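Because `SearchTool` is a `Protocol`, a tool only needs matching members and does not have to inherit from it. The following is a minimal sketch under stated assumptions: `EchoTool` and the one-field `Evidence` stand-in are hypothetical, and `@runtime_checkable` is added here only so the conformance check can be shown with `isinstance`.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class Evidence:
    """Stand-in for the project's Evidence model (fields assumed)."""
    content: str


@runtime_checkable  # assumption: added for the isinstance() demo only
class SearchTool(Protocol):
    @property
    def name(self) -> str: ...

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]: ...


class EchoTool:
    """Hypothetical tool: no inheritance, only structurally matching members."""

    @property
    def name(self) -> str:
        return "echo"

    async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
        # Return the query itself as a single piece of evidence.
        return [Evidence(content=query)][:max_results]


tool: SearchTool = EchoTool()
results = asyncio.run(tool.search("metformin", max_results=1))
```

A static type checker accepts the `tool: SearchTool` annotation for the same structural reason.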
## Rate Limiting
All tools wrap `search()` in the `@retry` decorator from tenacity, so a failing call is attempted up to three times with exponential backoff:
```python
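from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),  # give up after the third attempt
    wait=wait_exponential(...)  # backoff parameters are elided in the source
)
async def search(self, query: str, max_results: int = 10) -> list[Evidence]:
    # Implementation
    ...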
```
Tools whose APIs enforce rate limits implement a `_rate_limit()` method and use the shared rate limiters from `src/tools/rate_limiter.py`.
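The contents of `src/tools/rate_limiter.py` are not shown here, so the following is only a sketch of one common way to enforce a minimum gap between requests. The class name `IntervalRateLimiter` and its `acquire()` method are hypothetical, not the project's API.

```python
import asyncio
import time


class IntervalRateLimiter:
    """Hypothetical limiter that enforces a minimum gap between calls."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._lock = asyncio.Lock()
        self._last_call = float("-inf")  # the first call never waits

    async def acquire(self) -> None:
        # The lock serializes callers so concurrent tasks queue up in turn.
        async with self._lock:
            wait = self._last_call + self.min_interval - time.monotonic()
            if wait > 0:
                await asyncio.sleep(wait)
            self._last_call = time.monotonic()


async def demo() -> float:
    limiter = IntervalRateLimiter(min_interval=0.05)
    start = time.monotonic()
    for _ in range(3):
        await limiter.acquire()  # a tool's _rate_limit() could delegate here
    return time.monotonic() - start


elapsed = asyncio.run(demo())
```

Three calls with a 0.05 s interval take roughly 0.1 s in total, because only the second and third calls have to wait.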
## Error Handling
Tools raise custom exceptions:
- `SearchError`: General search failures
- `RateLimitError`: Rate limit exceeded
Tools catch HTTP failures (429, 500, and timeouts). Non-critical errors are logged as warnings and produce an empty result list instead of an exception.
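The policy above could be sketched as follows. This is an illustration under assumptions: the helper `handle_status` is hypothetical, `RateLimitError` subclassing `SearchError` is a guess, and treating 5xx responses as non-critical is one possible reading of the rule, not a confirmed detail of the implementation.

```python
import logging

logger = logging.getLogger("tools")


class SearchError(Exception):
    """General search failure."""


class RateLimitError(SearchError):
    """Rate limit exceeded (subclassing SearchError is an assumption)."""


def handle_status(status: int, payload: list[str]) -> list[str]:
    """Hypothetical helper that maps an HTTP status to an outcome."""
    if status == 429:
        raise RateLimitError("rate limit exceeded")
    if status >= 500:
        # Assumed non-critical: warn and return no results.
        logger.warning("upstream error %s, returning no results", status)
        return []
    if status >= 400:
        raise SearchError(f"search failed with HTTP {status}")
    return payload
```

Raising `RateLimitError` instead of swallowing it lets a retry decorator back off and try again.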
## Query Preprocessing
Tools use `preprocess_query()` from `src/tools/query_utils.py` to:
- Remove noise from queries
- Expand synonyms
- Normalize query format
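
As a rough illustration of these three steps, here is a minimal sketch. The word list, the synonym table, and the `OR` expansion syntax are invented for the example, and the function is a stand-in, not the code in `src/tools/query_utils.py`.

```python
import re

# Illustrative tables only; the project's real lists are not shown here.
NOISE_WORDS = {"what", "is", "the", "of", "for", "a", "an"}
SYNONYMS = {"heart attack": "myocardial infarction"}


def preprocess_query(query: str) -> str:
    """Stand-in sketch: normalize, drop noise words, expand synonyms."""
    # Normalize: lowercase, replace punctuation with spaces.
    text = re.sub(r"[^\w\s-]", " ", query.lower())
    # Remove noise words; split/join also collapses repeated whitespace.
    text = " ".join(w for w in text.split() if w not in NOISE_WORDS)
    # Expand synonyms by OR-ing the alternative term into the query.
    for term, synonym in SYNONYMS.items():
        if term in text:
            text = text.replace(term, f"({term} OR {synonym})")
    return text


cleaned = preprocess_query("What is the treatment for heart attack?")
# cleaned == "treatment (heart attack OR myocardial infarction)"
```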
## Evidence Conversion
All tools convert API responses to `Evidence` objects with:
- `Citation`: Title, URL, date, authors
- `content`: Evidence text
- `relevance_score`: 0.0-1.0 relevance score
- `metadata`: Additional metadata
Missing fields fall back to default values instead of raising errors.
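A conversion with defaults might look like the sketch below. The dataclasses are stand-ins for the project's models, and the raw record keys (`title`, `abstract`, `score`, `source`) as well as the default values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Citation:
    title: str
    url: str
    date: str
    authors: list[str] = field(default_factory=list)


@dataclass
class Evidence:
    citation: Citation
    content: str
    relevance_score: float = 0.0
    metadata: dict[str, Any] = field(default_factory=dict)


def to_evidence(record: dict[str, Any]) -> Evidence:
    """Convert a raw API record (hypothetical keys), defaulting missing fields."""
    score = float(record.get("score", 0.0))
    return Evidence(
        citation=Citation(
            title=record.get("title") or "Untitled",
            url=record.get("url") or "",
            date=record.get("date") or "Unknown",
            authors=list(record.get("authors") or []),
        ),
        content=record.get("abstract") or "",
        relevance_score=min(1.0, max(0.0, score)),  # clamp into 0.0-1.0
        metadata={"source": record.get("source", "unknown")},
    )


evidence = to_evidence({"title": "Metformin and aging", "score": 1.7})
```

Here the out-of-range score is clamped to 1.0, and the absent authors, date, and abstract receive their defaults.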
## Tool Implementations
### PubMed Tool
**File**: `src/tools/pubmed.py`
**API**: NCBI E-utilities (ESearch → EFetch)
**Rate Limiting**:
- 0.34s between requests (3 req/sec without API key)
- 0.1s between requests (10 req/sec with NCBI API key)
**Features**:
- XML parsing with `xmltodict`
- Handles both single-article and multi-article responses (`xmltodict` yields a dict for one element and a list for several)
- Query preprocessing
- Evidence conversion with metadata extraction
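
The single-versus-multiple case arises because `xmltodict` parses one child element into a dict and repeated children into a list. The sketch below normalizes both shapes; the helper names are hypothetical, and the sample dicts are simplified imitations of parsed EFetch output, not real responses.

```python
from typing import Any


def ensure_list(value: Any) -> list[Any]:
    """Normalize parsed XML children: None -> [], item -> [item], list -> list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


# Simplified shapes: one result parses to a dict, several parse to a list.
single = {"PubmedArticleSet": {"PubmedArticle": {"MedlineCitation": {"PMID": "111"}}}}
multiple = {"PubmedArticleSet": {"PubmedArticle": [
    {"MedlineCitation": {"PMID": "111"}},
    {"MedlineCitation": {"PMID": "222"}},
]}}


def pmids(parsed: dict[str, Any]) -> list[str]:
    """Extract PMIDs regardless of how many articles were returned."""
    article_set = parsed.get("PubmedArticleSet") or {}
    articles = ensure_list(article_set.get("PubmedArticle"))
    return [a["MedlineCitation"]["PMID"] for a in articles]
```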
### ClinicalTrials Tool
**File**: `src/tools/clinicaltrials.py`
**API**: ClinicalTrials.gov API v2
**Important**: Uses the `requests` library (NOT httpx) because the site's WAF blocks the httpx TLS fingerprint.
**Execution**: Because `requests` is blocking, each call runs in a thread pool: `await asyncio.to_thread(requests.get, ...)`
**Filtering**:
- Only interventional studies
- Status: `COMPLETED`, `ACTIVE_NOT_RECRUITING`, `RECRUITING`, `ENROLLING_BY_INVITATION`
**Features**:
- Parses nested JSON structure
- Extracts trial metadata
- Evidence conversion
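
The thread-pool pattern can be sketched without touching the network. In this example `blocking_fetch` is a hypothetical stand-in for a blocking `requests.get(...)` call, and the URL and `query.term` parameter are assumptions included for illustration only; no request is actually sent.

```python
import asyncio
import time


def blocking_fetch(url: str, params: dict[str, str]) -> dict[str, object]:
    """Stand-in for a blocking call such as requests.get(...).json()."""
    time.sleep(0.05)  # simulate network latency
    return {"url": url, "params": params, "studies": []}


async def search_trials(term: str) -> dict[str, object]:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(
        blocking_fetch,
        "https://clinicaltrials.gov/api/v2/studies",  # assumed endpoint
        {"query.term": term},  # assumed parameter name
    )


async def main() -> list[dict[str, object]]:
    # The two blocking calls overlap instead of running back to back.
    return await asyncio.gather(search_trials("metformin"), search_trials("aspirin"))


responses = asyncio.run(main())
```

`asyncio.to_thread` requires Python 3.9 or later.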
### Europe PMC Tool
**File**: `src/tools/europepmc.py`
**API**: Europe PMC REST API
**Features**:
- Handles preprint markers: `[PREPRINT - Not peer-reviewed]`
- Builds URLs from DOI or PMID
- Checks `pubTypeList` for preprint detection
- Includes both preprints and peer-reviewed articles
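
Preprint detection and URL building could be sketched as below. The helper names, the `abstractText` key, and the exact URL patterns are assumptions made for the example; only the marker string and the `pubTypeList` check come from the description above.

```python
from typing import Any

PREPRINT_MARKER = "[PREPRINT - Not peer-reviewed]"


def is_preprint(result: dict[str, Any]) -> bool:
    """Check pubTypeList for a preprint publication type."""
    pub_types = (result.get("pubTypeList") or {}).get("pubType") or []
    if isinstance(pub_types, str):  # tolerate a single value
        pub_types = [pub_types]
    return any("preprint" in str(p).lower() for p in pub_types)


def build_url(result: dict[str, Any]) -> str:
    # Prefer the DOI, fall back to the PMID; URL patterns are assumptions.
    if result.get("doi"):
        return f"https://doi.org/{result['doi']}"
    if result.get("pmid"):
        return f"https://pubmed.ncbi.nlm.nih.gov/{result['pmid']}/"
    return ""


def label_content(result: dict[str, Any]) -> str:
    """Prefix preprint text with the marker; leave reviewed articles as-is."""
    text = result.get("abstractText") or ""
    return f"{PREPRINT_MARKER} {text}" if is_preprint(result) else text
```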
### RAG Tool
**File**: `src/tools/rag_tool.py`
**Purpose**: Semantic search within collected evidence
**Implementation**: Wraps `LlamaIndexRAGService`
**Features**:
- Returns Evidence from RAG results
- Handles evidence ingestion
- Semantic similarity search
- Metadata preservation
### Search Handler
**File**: `src/tools/search_handler.py`
**Purpose**: Orchestrates parallel searches across multiple tools
**Features**:
- Uses `asyncio.gather()` with `return_exceptions=True`
- Aggregates results into `SearchResult`
- Handles tool failures gracefully
- Deduplicates results by URL
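
The orchestration pattern can be illustrated with plain coroutines. This is a sketch under assumptions: the three tool functions, the two-field `Evidence`, and the `evidence`/`errors` fields of `SearchResult` are hypothetical stand-ins, not the project's classes.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Evidence:
    url: str
    content: str


@dataclass
class SearchResult:
    evidence: list[Evidence] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)


async def good_tool(query: str) -> list[Evidence]:
    return [Evidence("https://example.org/a", query), Evidence("https://example.org/b", query)]


async def overlapping_tool(query: str) -> list[Evidence]:
    return [Evidence("https://example.org/a", query)]  # duplicate URL


async def failing_tool(query: str) -> list[Evidence]:
    raise RuntimeError("upstream unavailable")


async def run_search(query: str) -> SearchResult:
    tools = [good_tool, overlapping_tool, failing_tool]
    # return_exceptions=True keeps one failure from cancelling the others.
    outcomes = await asyncio.gather(*(t(query) for t in tools), return_exceptions=True)
    result, seen = SearchResult(), set()
    for tool, outcome in zip(tools, outcomes):
        if isinstance(outcome, BaseException):
            result.errors.append(f"{tool.__name__}: {outcome}")
            continue
        for item in outcome:
            if item.url not in seen:  # deduplicate by URL, first hit wins
                seen.add(item.url)
                result.evidence.append(item)
    return result


result = asyncio.run(run_search("metformin"))
```

The failing tool contributes only an error entry, while the other two tools still deliver their deduplicated evidence.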
## Tool Registration
Tools are registered in the search handler:
```python
from src.tools.pubmed import PubMedTool
from src.tools.clinicaltrials import ClinicalTrialsTool
from src.tools.europepmc import EuropePMCTool
from src.tools.search_handler import SearchHandler

# Each tool instance is searched in parallel by the handler.
search_handler = SearchHandler(
    tools=[
        PubMedTool(),
        ClinicalTrialsTool(),
        EuropePMCTool(),
    ]
)
```
## See Also
- [Services](services.md) - RAG and embedding services
- [API Reference - Tools](../api/tools.md) - API documentation
- [Contributing - Implementation Patterns](../contributing/implementation-patterns.md) - Development guidelines