Spaces:
Running
Architecture Specification: Dual-Mode Agent System
Date: November 27, 2025 Status: SPECIFICATION Goal: Graceful degradation from full multi-agent orchestration to simple single-agent mode
1. Core Concept: Two Operating Modes
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β USER REQUEST β
β β β
β βΌ β
β βββββββββββββββββββ β
β β Mode Selection β β
β β (Auto-detect) β β
β ββββββββββ¬βββββββββ β
β β β
β βββββββββββββββββ΄ββββββββββββββββ β
β β β β
β βΌ βΌ β
β βββββββββββββββββββ βββββββββββββββββββ β
β β SIMPLE MODE β β ADVANCED MODE β β
β β (Free Tier) β β (Paid Tier) β β
β β β β β β
β β pydantic-ai β β MS Agent Fwk β β
β β single-agent β β + pydantic-ai β β
β β loop β β multi-agent β β
β βββββββββββββββββββ βββββββββββββββββββ β
β β β β
β βββββββββββββββββ¬ββββββββββββββββ β
β βΌ β
β βββββββββββββββββββ β
β β Research Report β β
β β with Citations β β
β βββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
2. Mode Comparison
| Aspect | Simple Mode | Advanced Mode |
|---|---|---|
| Trigger | No API key OR LLM_PROVIDER=huggingface |
OpenAI API key present (currently OpenAI only) |
| Framework | pydantic-ai only | Microsoft Agent Framework + pydantic-ai |
| Architecture | Single orchestrator loop | Multi-agent coordination |
| Agents | One agent does SearchβJudgeβReport | SearchAgent, JudgeAgent, ReportAgent, AnalysisAgent |
| State Management | Simple dict | Thread-safe MagenticState with context vars |
| Quality | Good (functional) | Better (specialized agents, coordination) |
| Cost | Free (HuggingFace Inference) | Paid (OpenAI/Anthropic) |
| Use Case | Demos, hackathon, budget-constrained | Production, research quality |
3. Simple Mode Architecture (pydantic-ai Only)
βββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Orchestrator β
β β
β while not sufficient and iteration < max: β
β 1. SearchHandler.execute(query) β
β 2. JudgeHandler.assess(evidence) βββ pydantic-ai Agent β
β 3. if sufficient: break β
β 4. query = judge.next_queries β
β β
β return ReportGenerator.generate(evidence) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Components:
src/orchestrator.py- Simple loop orchestratorsrc/agent_factory/judges.py- JudgeHandler with pydantic-aisrc/tools/search_handler.py- Scatter-gather searchsrc/tools/pubmed.py,clinicaltrials.py,europepmc.py- Search tools
4. Advanced Mode Architecture (MS Agent Framework + pydantic-ai)
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Microsoft Agent Framework Orchestrator β
β β
β βββββββββββββββ βββββββββββββββ βββββββββββββββ β
β β SearchAgent βββββΆβ JudgeAgent βββββΆβ ReportAgent β β
β β (BaseAgent) β β (BaseAgent) β β (BaseAgent) β β
β ββββββββ¬βββββββ ββββββββ¬βββββββ ββββββββ¬βββββββ β
β β β β β
β βΌ βΌ βΌ β
β βββββββββββββββ βββββββββββββββ βββββββββββββββ β
β β pydantic-ai β β pydantic-ai β β pydantic-ai β β
β β Agent() β β Agent() β β Agent() β β
β β output_type=β β output_type=β β output_type=β β
β β SearchResultβ β JudgeAssess β β Report β β
β βββββββββββββββ βββββββββββββββ βββββββββββββββ β
β β
β Shared State: MagenticState (thread-safe via contextvars) β
β - evidence: list[Evidence] β
β - embedding_service: EmbeddingService β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Components:
src/orchestrator_magentic.py- Multi-agent orchestratorsrc/agents/search_agent.py- SearchAgent (BaseAgent)src/agents/judge_agent.py- JudgeAgent (BaseAgent)src/agents/report_agent.py- ReportAgent (BaseAgent)src/agents/analysis_agent.py- AnalysisAgent (BaseAgent)src/agents/state.py- Thread-safe state managementsrc/agents/tools.py- @ai_function decorated tools
5. Mode Selection Logic
# src/orchestrator_factory.py (actual implementation)
def create_orchestrator(
search_handler: SearchHandlerProtocol | None = None,
judge_handler: JudgeHandlerProtocol | None = None,
config: OrchestratorConfig | None = None,
mode: Literal["simple", "magentic", "advanced"] | None = None,
) -> Any:
"""
Auto-select orchestrator based on available credentials.
Priority:
1. If mode explicitly set, use that
2. If OpenAI key available -> Advanced Mode (currently OpenAI only)
3. Otherwise -> Simple Mode (HuggingFace free tier)
"""
effective_mode = _determine_mode(mode)
if effective_mode == "advanced":
orchestrator_cls = _get_magentic_orchestrator_class()
return orchestrator_cls(max_rounds=config.max_iterations if config else 10)
# Simple mode requires handlers
if search_handler is None or judge_handler is None:
raise ValueError("Simple mode requires search_handler and judge_handler")
return Orchestrator(
search_handler=search_handler,
judge_handler=judge_handler,
config=config,
)
6. Shared Components (Both Modes Use)
These components work in both modes:
| Component | Purpose |
|---|---|
src/tools/pubmed.py |
PubMed search |
src/tools/clinicaltrials.py |
ClinicalTrials.gov search |
src/tools/europepmc.py |
Europe PMC search |
src/tools/search_handler.py |
Scatter-gather orchestration |
src/tools/rate_limiter.py |
Rate limiting |
src/utils/models.py |
Evidence, Citation, JudgeAssessment |
src/utils/config.py |
Settings |
src/services/embeddings.py |
Vector search (optional) |
7. pydantic-ai Integration Points
Both modes use pydantic-ai for structured LLM outputs:
# In JudgeHandler (both modes)
from pydantic_ai import Agent
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.models.anthropic import AnthropicModel
class JudgeHandler:
def __init__(self, model: Any = None):
self.model = model or get_model() # Auto-selects based on config
self.agent = Agent(
model=self.model,
output_type=JudgeAssessment, # Structured output!
system_prompt=SYSTEM_PROMPT,
)
async def assess(self, question: str, evidence: list[Evidence]) -> JudgeAssessment:
result = await self.agent.run(format_prompt(question, evidence))
return result.output # Guaranteed to be JudgeAssessment
8. Microsoft Agent Framework Integration Points
Advanced mode wraps pydantic-ai agents in BaseAgent:
# In JudgeAgent (advanced mode only)
from agent_framework import BaseAgent, AgentRunResponse, ChatMessage, Role
class JudgeAgent(BaseAgent):
def __init__(self, judge_handler: JudgeHandlerProtocol):
super().__init__(
name="JudgeAgent",
description="Evaluates evidence quality",
)
self._handler = judge_handler # Uses pydantic-ai internally
async def run(self, messages, **kwargs) -> AgentRunResponse:
question = extract_question(messages)
evidence = self._evidence_store.get("current", [])
# Delegate to pydantic-ai powered handler
assessment = await self._handler.assess(question, evidence)
return AgentRunResponse(
messages=[ChatMessage(role=Role.ASSISTANT, text=format_response(assessment))],
additional_properties={"assessment": assessment.model_dump()},
)
9. Benefits of This Architecture
- Graceful Degradation: Works without API keys (free tier)
- Progressive Enhancement: Better with API keys (orchestration)
- Code Reuse: pydantic-ai handlers shared between modes
- Hackathon Ready: Demo works without requiring paid keys
- Production Ready: Full orchestration available when needed
- Future Proof: Can add more agents to advanced mode
- Testable: Simple mode is easier to unit test
10. Known Risks and Mitigations
From Senior Agent Review
10.1 Bridge Complexity (MEDIUM)
Risk: In Advanced Mode, agents (Agent Framework) wrap handlers (pydantic-ai). Both are async. Context variables (MagenticState) must propagate correctly through the pydantic-ai call stack.
Mitigation:
- pydantic-ai uses standard Python
contextvars, which naturally propagate throughawaitchains - Test context propagation explicitly in integration tests
- If issues arise, pass state explicitly rather than via context vars
10.2 Integration Drift (MEDIUM)
Risk: Simple Mode and Advanced Mode might diverge in behavior over time (e.g., Simple Mode uses logic A, Advanced Mode uses logic B).
Mitigation:
- Both modes MUST call the exact same underlying Tools (
src/tools/*) and Handlers (src/agent_factory/*) - Handlers are the single source of truth for business logic
- Agents are thin wrappers that delegate to handlers
10.3 Testing Burden (LOW-MEDIUM)
Risk: Two distinct orchestrators (src/orchestrator.py and src/orchestrator_magentic.py) doubles integration testing surface area.
Mitigation:
- Unit test handlers independently (shared code)
- Integration tests for each mode separately
- End-to-end tests verify same output for same input (determinism permitting)
10.4 Dependency Conflicts (LOW)
Risk: agent-framework-core might conflict with pydantic-ai's dependencies (e.g., different pydantic versions).
Status: Both use pydantic>=2.x. Should be compatible.
11. Naming Clarification
See
00_SITUATION_AND_PLAN.mdSection 4 for full details.
Important: The codebase uses "magentic" in file names (orchestrator_magentic.py, magentic_agents.py) but this refers to our internal naming for Microsoft Agent Framework integration, NOT the magentic PyPI package.
Future action: Rename to orchestrator_advanced.py to eliminate confusion.