Spaces:
Running
Running
| # Phase 1 Implementation Spec: Foundation & Tooling | |
| **Goal**: Establish a "Gucci Banger" development environment using 2025 best practices. | |
| **Philosophy**: "If the build isn't solid, the agent won't be." | |
| --- | |
| ## 1. Prerequisites | |
| Before starting, ensure these are installed: | |
| ```bash | |
| # Install uv (Rust-based package manager) | |
| curl -LsSf https://astral.sh/uv/install.sh | sh | |
| # Verify | |
| uv --version # Should be >= 0.4.0 | |
| ``` | |
| --- | |
| ## 2. Project Initialization | |
| ```bash | |
| # From project root | |
| uv init --name deepcritical | |
| uv python install 3.11 # Pin Python version | |
| ``` | |
| --- | |
| ## 3. The Tooling Stack (Exact Dependencies) | |
| ### `pyproject.toml` (Complete, Copy-Paste Ready) | |
| ```toml | |
| [project] | |
| name = "deepcritical" | |
| version = "0.1.0" | |
| description = "AI-Native Drug Repurposing Research Agent" | |
| readme = "README.md" | |
| requires-python = ">=3.11" | |
| dependencies = [ | |
| # Core | |
| "pydantic>=2.7", | |
| "pydantic-settings>=2.2", # For BaseSettings (config) | |
| "pydantic-ai>=0.0.16", # Agent framework | |
| # HTTP & Parsing | |
| "httpx>=0.27", # Async HTTP client | |
| "beautifulsoup4>=4.12", # HTML parsing | |
| "xmltodict>=0.13", # PubMed XML -> dict | |
| # Search | |
| "duckduckgo-search>=6.0", # Free web search | |
| # UI | |
| "gradio>=5.0", # Chat interface | |
| # Utils | |
| "python-dotenv>=1.0", # .env loading | |
| "tenacity>=8.2", # Retry logic | |
| "structlog>=24.1", # Structured logging | |
| ] | |
| [project.optional-dependencies] | |
| dev = [ | |
| # Testing | |
| "pytest>=8.0", | |
| "pytest-asyncio>=0.23", | |
| "pytest-sugar>=1.0", | |
| "pytest-cov>=5.0", | |
| "pytest-mock>=3.12", | |
| "respx>=0.21", # Mock httpx requests | |
| # Quality | |
| "ruff>=0.4.0", | |
| "mypy>=1.10", | |
| "pre-commit>=3.7", | |
| ] | |
| [build-system] | |
| requires = ["hatchling"] | |
| build-backend = "hatchling.build" | |
| [tool.hatch.build.targets.wheel] | |
| packages = ["src"] | |
| # ============== RUFF CONFIG ============== | |
| [tool.ruff] | |
| line-length = 100 | |
| target-version = "py311" | |
| src = ["src", "tests"] | |
| [tool.ruff.lint] | |
| select = [ | |
| "E", # pycodestyle errors | |
| "F", # pyflakes | |
| "B", # flake8-bugbear | |
| "I", # isort | |
| "N", # pep8-naming | |
| "UP", # pyupgrade | |
| "PL", # pylint | |
| "RUF", # ruff-specific | |
| ] | |
| ignore = [ | |
| "PLR0913", # Too many arguments (agents need many params) | |
| ] | |
| [tool.ruff.lint.isort] | |
| known-first-party = ["src"] | |
| # ============== MYPY CONFIG ============== | |
| [tool.mypy] | |
| python_version = "3.11" | |
| strict = true | |
| ignore_missing_imports = true | |
| disallow_untyped_defs = true | |
| warn_return_any = true | |
| warn_unused_ignores = true | |
| # ============== PYTEST CONFIG ============== | |
| [tool.pytest.ini_options] | |
| testpaths = ["tests"] | |
| asyncio_mode = "auto" | |
| addopts = [ | |
| "-v", | |
| "--tb=short", | |
| "--strict-markers", | |
| ] | |
| markers = [ | |
| "unit: Unit tests (mocked)", | |
| "integration: Integration tests (real APIs)", | |
| "slow: Slow tests", | |
| ] | |
| # ============== COVERAGE CONFIG ============== | |
| [tool.coverage.run] | |
| source = ["src"] | |
| omit = ["*/__init__.py"] | |
| [tool.coverage.report] | |
| exclude_lines = [ | |
| "pragma: no cover", | |
| "if TYPE_CHECKING:", | |
| "raise NotImplementedError", | |
| ] | |
| ``` | |
| --- | |
| ## 4. Directory Structure (Maintainer's Structure) | |
| ```bash | |
| # Execute these commands to create the directory structure | |
| mkdir -p src/utils | |
| mkdir -p src/tools | |
| mkdir -p src/prompts | |
| mkdir -p src/agent_factory | |
| mkdir -p src/middleware | |
| mkdir -p src/database_services | |
| mkdir -p src/retrieval_factory | |
| mkdir -p tests/unit/tools | |
| mkdir -p tests/unit/agent_factory | |
| mkdir -p tests/unit/utils | |
| mkdir -p tests/integration | |
| # Create __init__.py files (required for imports) | |
| touch src/__init__.py | |
| touch src/utils/__init__.py | |
| touch src/tools/__init__.py | |
| touch src/prompts/__init__.py | |
| touch src/agent_factory/__init__.py | |
| touch tests/__init__.py | |
| touch tests/unit/__init__.py | |
| touch tests/unit/tools/__init__.py | |
| touch tests/unit/agent_factory/__init__.py | |
| touch tests/unit/utils/__init__.py | |
| touch tests/integration/__init__.py | |
| ``` | |
| ### Final Structure: | |
| ``` | |
| src/ | |
| ├── __init__.py | |
| ├── app.py # Entry point (Gradio UI) | |
| ├── orchestrator.py # Agent loop | |
| ├── agent_factory/ # Agent creation and judges | |
| │ ├── __init__.py | |
| │ ├── agents.py | |
| │ └── judges.py | |
| ├── tools/ # Search tools | |
| │ ├── __init__.py | |
| │ ├── pubmed.py | |
| │ ├── websearch.py | |
| │ └── search_handler.py | |
| ├── prompts/ # Prompt templates | |
| │ ├── __init__.py | |
| │ └── judge.py | |
| ├── utils/ # Shared utilities | |
| │ ├── __init__.py | |
| │ ├── config.py | |
| │ ├── exceptions.py | |
| │ ├── models.py | |
| │ ├── dataloaders.py | |
| │ └── parsers.py | |
| ├── middleware/ # (Future) | |
| ├── database_services/ # (Future) | |
| └── retrieval_factory/ # (Future) | |
| tests/ | |
| ├── __init__.py | |
| ├── conftest.py | |
| ├── unit/ | |
| │ ├── __init__.py | |
| │ ├── tools/ | |
| │ │ ├── __init__.py | |
| │ │ ├── test_pubmed.py | |
| │ │ ├── test_websearch.py | |
| │ │ └── test_search_handler.py | |
| │ ├── agent_factory/ | |
| │ │ ├── __init__.py | |
| │ │ └── test_judges.py | |
| │ ├── utils/ | |
| │ │ ├── __init__.py | |
| │ │ └── test_config.py | |
| │ └── test_orchestrator.py | |
| └── integration/ | |
| ├── __init__.py | |
| └── test_pubmed_live.py | |
| ``` | |
| --- | |
| ## 5. Configuration Files | |
| ### `.env.example` (Copy to `.env` and fill) | |
| ```bash | |
| # LLM Provider (choose one) | |
| OPENAI_API_KEY=sk-your-key-here | |
| ANTHROPIC_API_KEY=sk-ant-your-key-here | |
| # Optional: PubMed API key (higher rate limits) | |
| NCBI_API_KEY=your-ncbi-key-here | |
| # Optional: For HuggingFace deployment | |
| HF_TOKEN=hf_your-token-here | |
| # Agent Config | |
| MAX_ITERATIONS=10 | |
| LOG_LEVEL=INFO | |
| ``` | |
| ### `.pre-commit-config.yaml` | |
| ```yaml | |
| repos: | |
| - repo: https://github.com/astral-sh/ruff-pre-commit | |
| rev: v0.4.4 | |
| hooks: | |
| - id: ruff | |
| args: [--fix] | |
| - id: ruff-format | |
| - repo: https://github.com/pre-commit/mirrors-mypy | |
| rev: v1.10.0 | |
| hooks: | |
| - id: mypy | |
| additional_dependencies: | |
| - pydantic>=2.7 | |
| - pydantic-settings>=2.2 | |
| args: [--ignore-missing-imports] | |
| ``` | |
| ### `tests/conftest.py` (Pytest Fixtures) | |
| ```python | |
| """Shared pytest fixtures for all tests.""" | |
| import pytest | |
| from unittest.mock import AsyncMock | |
| @pytest.fixture | |
| def mock_httpx_client(mocker): | |
| """Mock httpx.AsyncClient for API tests.""" | |
| mock = mocker.patch("httpx.AsyncClient") | |
| mock.return_value.__aenter__ = AsyncMock(return_value=mock.return_value) | |
| mock.return_value.__aexit__ = AsyncMock(return_value=None) | |
| return mock | |
| @pytest.fixture | |
| def mock_llm_response(): | |
| """Factory fixture for mocking LLM responses.""" | |
| def _mock(content: str): | |
| return AsyncMock(return_value=content) | |
| return _mock | |
| @pytest.fixture | |
| def sample_evidence(): | |
| """Sample Evidence objects for testing.""" | |
| from src.utils.models import Evidence, Citation | |
| return [ | |
| Evidence( | |
| content="Metformin shows promise in Alzheimer's...", | |
| citation=Citation( | |
| source="pubmed", | |
| title="Metformin and Alzheimer's Disease", | |
| url="https://pubmed.ncbi.nlm.nih.gov/12345678/", | |
| date="2024-01-15" | |
| ), | |
| relevance=0.85 | |
| ) | |
| ] | |
| ``` | |
| --- | |
| ## 6. Core Utilities Implementation | |
| ### `src/utils/config.py` | |
| ```python | |
| """Application configuration using Pydantic Settings.""" | |
| from pydantic_settings import BaseSettings, SettingsConfigDict | |
| from pydantic import Field | |
| from typing import Literal | |
| import structlog | |
| class Settings(BaseSettings): | |
| """Strongly-typed application settings.""" | |
| model_config = SettingsConfigDict( | |
| env_file=".env", | |
| env_file_encoding="utf-8", | |
| case_sensitive=False, | |
| extra="ignore", | |
| ) | |
| # LLM Configuration | |
| openai_api_key: str | None = Field(default=None, description="OpenAI API key") | |
| anthropic_api_key: str | None = Field(default=None, description="Anthropic API key") | |
| llm_provider: Literal["openai", "anthropic"] = Field( | |
| default="openai", | |
| description="Which LLM provider to use" | |
| ) | |
| openai_model: str = Field(default="gpt-4o", description="OpenAI model name") | |
| anthropic_model: str = Field(default="claude-3-5-sonnet-20241022", description="Anthropic model") | |
| # PubMed Configuration | |
| ncbi_api_key: str | None = Field(default=None, description="NCBI API key for higher rate limits") | |
| # Agent Configuration | |
| max_iterations: int = Field(default=10, ge=1, le=50) | |
| search_timeout: int = Field(default=30, description="Seconds to wait for search") | |
| # Logging | |
| log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR"] = "INFO" | |
| def get_api_key(self) -> str: | |
| """Get the API key for the configured provider.""" | |
| if self.llm_provider == "openai": | |
| if not self.openai_api_key: | |
| raise ValueError("OPENAI_API_KEY not set") | |
| return self.openai_api_key | |
| else: | |
| if not self.anthropic_api_key: | |
| raise ValueError("ANTHROPIC_API_KEY not set") | |
| return self.anthropic_api_key | |
| def get_settings() -> Settings: | |
| """Factory function to get settings (allows mocking in tests).""" | |
| return Settings() | |
| def configure_logging(settings: Settings) -> None: | |
| """Configure structured logging.""" | |
| structlog.configure( | |
| processors=[ | |
| structlog.stdlib.filter_by_level, | |
| structlog.stdlib.add_logger_name, | |
| structlog.stdlib.add_log_level, | |
| structlog.processors.TimeStamper(fmt="iso"), | |
| structlog.processors.JSONRenderer(), | |
| ], | |
| wrapper_class=structlog.stdlib.BoundLogger, | |
| context_class=dict, | |
| logger_factory=structlog.stdlib.LoggerFactory(), | |
| ) | |
| # Singleton for easy import | |
| settings = get_settings() | |
| ``` | |
| ### `src/utils/exceptions.py` | |
| ```python | |
| """Custom exceptions for DeepCritical.""" | |
| class DeepCriticalError(Exception): | |
| """Base exception for all DeepCritical errors.""" | |
| pass | |
| class SearchError(DeepCriticalError): | |
| """Raised when a search operation fails.""" | |
| pass | |
| class JudgeError(DeepCriticalError): | |
| """Raised when the judge fails to assess evidence.""" | |
| pass | |
| class ConfigurationError(DeepCriticalError): | |
| """Raised when configuration is invalid.""" | |
| pass | |
| class RateLimitError(SearchError): | |
| """Raised when we hit API rate limits.""" | |
| pass | |
| ``` | |
| --- | |
| ## 7. TDD Workflow: First Test | |
| ### `tests/unit/utils/test_config.py` | |
| ```python | |
| """Unit tests for configuration loading.""" | |
| import pytest | |
| from unittest.mock import patch | |
| import os | |
| class TestSettings: | |
| """Tests for Settings class.""" | |
| def test_default_max_iterations(self): | |
| """Settings should have default max_iterations of 10.""" | |
| from src.utils.config import Settings | |
| # Clear any env vars | |
| with patch.dict(os.environ, {}, clear=True): | |
| settings = Settings() | |
| assert settings.max_iterations == 10 | |
| def test_max_iterations_from_env(self): | |
| """Settings should read MAX_ITERATIONS from env.""" | |
| from src.utils.config import Settings | |
| with patch.dict(os.environ, {"MAX_ITERATIONS": "25"}): | |
| settings = Settings() | |
| assert settings.max_iterations == 25 | |
| def test_invalid_max_iterations_raises(self): | |
| """Settings should reject invalid max_iterations.""" | |
| from src.utils.config import Settings | |
| from pydantic import ValidationError | |
| with patch.dict(os.environ, {"MAX_ITERATIONS": "100"}): | |
| with pytest.raises(ValidationError): | |
| Settings() # 100 > 50 (max) | |
| def test_get_api_key_openai(self): | |
| """get_api_key should return OpenAI key when provider is openai.""" | |
| from src.utils.config import Settings | |
| with patch.dict(os.environ, { | |
| "LLM_PROVIDER": "openai", | |
| "OPENAI_API_KEY": "sk-test-key" | |
| }): | |
| settings = Settings() | |
| assert settings.get_api_key() == "sk-test-key" | |
| def test_get_api_key_missing_raises(self): | |
| """get_api_key should raise when key is not set.""" | |
| from src.utils.config import Settings | |
| with patch.dict(os.environ, {"LLM_PROVIDER": "openai"}, clear=True): | |
| settings = Settings() | |
| with pytest.raises(ValueError, match="OPENAI_API_KEY not set"): | |
| settings.get_api_key() | |
| ``` | |
| --- | |
| ## 8. Makefile (Developer Experience) | |
| Create a `Makefile` for standard devex commands: | |
| ```makefile | |
| .PHONY: install test lint format typecheck check clean | |
| install: | |
| uv sync --all-extras | |
| uv run pre-commit install | |
| test: | |
| uv run pytest tests/unit/ -v | |
| test-cov: | |
| uv run pytest --cov=src --cov-report=term-missing | |
| lint: | |
| uv run ruff check src tests | |
| format: | |
| uv run ruff format src tests | |
| typecheck: | |
| uv run mypy src | |
| check: lint typecheck test | |
| @echo "All checks passed!" | |
| clean: | |
| rm -rf .pytest_cache .mypy_cache .ruff_cache __pycache__ .coverage | |
| find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true | |
| ``` | |
| --- | |
| ## 9. Execution Commands | |
| ```bash | |
| # Install all dependencies | |
| uv sync --all-extras | |
| # Run tests (should pass after implementing config.py) | |
| uv run pytest tests/unit/utils/test_config.py -v | |
| # Run full test suite with coverage | |
| uv run pytest --cov=src --cov-report=term-missing | |
| # Run linting | |
| uv run ruff check src tests | |
| uv run ruff format src tests | |
| # Run type checking | |
| uv run mypy src | |
| # Set up pre-commit hooks | |
| uv run pre-commit install | |
| ``` | |
| --- | |
| ## 10. Implementation Checklist | |
| - [ ] Install `uv` and verify version | |
| - [ ] Run `uv init --name deepcritical` | |
| - [ ] Create `pyproject.toml` (copy from above) | |
| - [ ] Create directory structure (run mkdir commands) | |
| - [ ] Create `.env.example` and `.env` | |
| - [ ] Create `.pre-commit-config.yaml` | |
| - [ ] Create `Makefile` (copy from above) | |
| - [ ] Create `tests/conftest.py` | |
| - [ ] Implement `src/utils/config.py` | |
| - [ ] Implement `src/utils/exceptions.py` | |
| - [ ] Write tests in `tests/unit/utils/test_config.py` | |
| - [ ] Run `make install` | |
| - [ ] Run `make check` — **ALL CHECKS MUST PASS** | |
| - [ ] Commit: `git commit -m "feat: phase 1 foundation complete"` | |
| --- | |
| ## 11. Definition of Done | |
| Phase 1 is **COMPLETE** when: | |
| 1. `uv run pytest` passes with 100% of tests green | |
| 2. `uv run ruff check src tests` has 0 errors | |
| 3. `uv run mypy src` has 0 errors | |
| 4. Pre-commit hooks are installed and working | |
| 5. `from src.utils.config import settings` works in Python REPL | |
| **Proceed to Phase 2 ONLY after all checkboxes are complete.** | |