# Services Architecture
DeepCritical provides several services for embeddings, RAG, and statistical analysis.
## Embedding Service
**File**: `src/services/embeddings.py`
**Purpose**: Local sentence-transformers for semantic search and deduplication
**Features**:
- **No API Key Required**: Uses local sentence-transformers models
- **Async-Safe**: All operations use `run_in_executor()` to avoid blocking the event loop (see the sketch below)
- **ChromaDB Storage**: In-memory vector storage for embeddings
- **Deduplication**: Removes near-duplicate evidence above a configurable similarity threshold (default 0.9, i.e. 90% similar counts as a duplicate)
**Model**: Configurable via `settings.local_embedding_model` (default: `all-MiniLM-L6-v2`)
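The async-safe pattern is worth spelling out. Below is a minimal sketch, not the actual implementation: it assumes the service wraps a `sentence_transformers.SentenceTransformer` and offloads the CPU-bound encode step to a thread pool:

```python
import asyncio
from sentence_transformers import SentenceTransformer

class _SketchEmbeddingService:
    """Illustrative only -- the real service lives in src/services/embeddings.py."""

    def __init__(self, model_name: str = "all-MiniLM-L6-v2") -> None:
        self._model = SentenceTransformer(model_name)

    async def embed(self, text: str) -> list[float]:
        # Offload the CPU-bound encode() call so the event loop stays responsive.
        loop = asyncio.get_running_loop()
        vector = await loop.run_in_executor(None, self._model.encode, text)
        return vector.tolist()
```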
**Methods**:
- `async def embed(text: str) -> list[float]`: Generate embeddings (async-safe via `run_in_executor()`)
- `async def embed_batch(texts: list[str]) -> list[list[float]]`: Batch embedding (more efficient)
- `async def add_evidence(evidence_id: str, content: str, metadata: dict[str, Any]) -> None`: Add evidence to vector store
- `async def search_similar(query: str, n_results: int = 5) -> list[dict[str, Any]]`: Find semantically similar evidence
- `async def deduplicate(new_evidence: list[Evidence], threshold: float = 0.9) -> list[Evidence]`: Remove semantically duplicate evidence
**Usage**:
```python
from src.services.embeddings import get_embedding_service
service = get_embedding_service()
embedding = await service.embed("text to embed")
```
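The remaining methods follow the same awaitable pattern. A short continuation of the example, assuming `evidence_list` is a `list[Evidence]` already in scope:

```python
# Store evidence with metadata, then query by semantic similarity.
await service.add_evidence("ev-1", "Metformin lowered HbA1c in trial X", {"source": "pubmed"})
hits = await service.search_similar("drugs that reduce HbA1c", n_results=3)

# Drop new items that are >= 90% similar to evidence already stored.
unique = await service.deduplicate(new_evidence=evidence_list, threshold=0.9)
```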
## LlamaIndex RAG Service
**File**: `src/services/llamaindex_rag.py`
**Purpose**: Retrieval-Augmented Generation using LlamaIndex
**Features**:
- **Multiple Embedding Providers**: OpenAI embeddings (requires `OPENAI_API_KEY`) or local sentence-transformers (no API key)
- **Multiple LLM Providers**: HuggingFace LLM (preferred) or OpenAI LLM (fallback) for query synthesis
- **ChromaDB Storage**: Vector database for document storage (supports in-memory mode)
- **Metadata Preservation**: Preserves source, title, URL, date, authors
- **Lazy Initialization**: Graceful fallback if dependencies are not available
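The auto-detect behavior of `use_openai_embeddings=None` is not spelled out here; a plausible sketch, assuming it keys off the `settings.has_openai_key` flag shown under Service Availability below:

```python
def _resolve_use_openai(use_openai_embeddings: bool | None) -> bool:
    """Illustrative only -- the real logic lives in src/services/llamaindex_rag.py."""
    if use_openai_embeddings is not None:
        return use_openai_embeddings  # explicit caller choice wins
    from src.utils.config import settings
    return settings.has_openai_key  # otherwise fall back to key availability
```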
**Initialization Parameters**:
- `use_openai_embeddings: bool | None`: Force OpenAI embeddings (None = auto-detect)
- `use_in_memory: bool`: Use in-memory ChromaDB client (useful for tests)
- `oauth_token: str | None`: Optional OAuth token from HuggingFace login (takes priority over env vars)
**Methods**:
- `async def ingest_evidence(evidence: list[Evidence]) -> None`: Ingest evidence into RAG
- `async def retrieve(query: str, top_k: int = 5) -> list[Document]`: Retrieve relevant documents
- `async def query(query: str, top_k: int = 5) -> str`: Query with RAG
**Usage**:
```python
from src.services.llamaindex_rag import get_rag_service
service = get_rag_service(
    use_openai_embeddings=False,  # Use local embeddings
    use_in_memory=True,           # Use in-memory ChromaDB
    oauth_token=token,            # Optional HuggingFace token
)
if service:
    documents = await service.retrieve("query", top_k=5)
```
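Ingestion and synthesis use the other two methods; a hedged continuation of the example, again assuming `evidence_list` is a `list[Evidence]`:

```python
if service:
    await service.ingest_evidence(evidence_list)
    answer = await service.query("What does the evidence say about metformin?", top_k=5)
    print(answer)
```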
## Statistical Analyzer
**File**: `src/services/statistical_analyzer.py`
**Purpose**: Secure execution of AI-generated statistical code
**Features**:
- **Modal Sandbox**: Secure, isolated execution environment
- **Code Generation**: Generates Python code via LLM
- **Library Pinning**: Version-pinned libraries in `SANDBOX_LIBRARIES`
- **Network Isolation**: `block_network=True` by default
**Libraries Available**:
- pandas, numpy, scipy
- matplotlib, scikit-learn
- statsmodels
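The sandbox wiring can be sketched end to end. Everything below is illustrative: the version pins are placeholders (the real list is `SANDBOX_LIBRARIES` in the module), `generated_code` stands in for the LLM-generated script, and the snippet assumes Modal's `Sandbox.create` API with its `block_network` flag:

```python
import modal

# Placeholder pins -- see SANDBOX_LIBRARIES in src/services/statistical_analyzer.py.
PINNED = ["pandas==2.2.*", "numpy==1.26.*", "scipy==1.13.*", "statsmodels==0.14.*"]

app = modal.App.lookup("statistical-analyzer", create_if_missing=True)
image = modal.Image.debian_slim().pip_install(*PINNED)

generated_code = "print('analysis goes here')"  # stands in for the LLM output

# block_network=True keeps the generated code from reaching the internet.
sb = modal.Sandbox.create("python", "-c", generated_code, app=app, image=image, block_network=True)
sb.wait()
print(sb.stdout.read())
```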
**Output**: `AnalysisResult` with:
- `verdict`: SUPPORTED, REFUTED, or INCONCLUSIVE
- `code`: Generated analysis code
- `output`: Execution output
- `error`: Error message if execution failed
**Usage**:
```python
from src.services.statistical_analyzer import StatisticalAnalyzer
analyzer = StatisticalAnalyzer()
result = await analyzer.analyze(
    hypothesis="Metformin reduces cancer risk",
    evidence=evidence_list,
)
```
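Consuming the result mirrors the fields listed above:

```python
if result.error:
    print(f"Sandbox execution failed: {result.error}")
else:
    print(f"Verdict: {result.verdict}")  # SUPPORTED, REFUTED, or INCONCLUSIVE
    print(result.output)                 # stdout captured from the sandbox run
```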
## Singleton Pattern
Services are initialized lazily; the embedding service is additionally cached as a process-wide singleton:
**EmbeddingService**: Uses a global variable pattern:
<!--codeinclude-->
[EmbeddingService Singleton](../src/services/embeddings.py) start_line:164 end_line:172
<!--/codeinclude-->
**LlamaIndexRAGService**: Direct instantiation (no caching):
<!--codeinclude-->
[LlamaIndexRAGService Factory](../src/services/llamaindex_rag.py) start_line:440 end_line:466
<!--/codeinclude-->
The global-variable pattern gives the `EmbeddingService`:
- A single instance per process
- Lazy initialization
- No dependencies required at import time

`get_rag_service()`, by contrast, returns a fresh `LlamaIndexRAGService` on every call, so hold on to the returned instance if you want to reuse it.
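For reference, the global-variable pattern boils down to a few lines; a generic sketch rather than the verbatim code included above:

```python
class EmbeddingService:  # placeholder for the real class in src/services/embeddings.py
    ...

_embedding_service: EmbeddingService | None = None

def get_embedding_service() -> EmbeddingService:
    global _embedding_service
    if _embedding_service is None:
        # Constructed on first call, so importing this module pulls in no heavy deps.
        _embedding_service = EmbeddingService()
    return _embedding_service
```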
## Service Availability
Services check availability before use:
```python
from src.utils.config import settings
if settings.modal_available:
# Use Modal sandbox
pass
if settings.has_openai_key:
# Use OpenAI embeddings for RAG
pass
```
## See Also
- [Tools](tools.md) - How services are used by search tools
- [API Reference - Services](../api/services.md) - API documentation
- [Configuration](../configuration/index.md) - Service configuration