# GitHub Actions CI/CD Setup

This directory contains GitHub Actions workflows for automated testing of the CLI redaction application.

## Workflows Overview

### 1. **Simple Test Run** (`.github/workflows/simple-test.yml`)

- **Purpose**: Basic test execution
- **Triggers**: Push to main/dev, Pull requests
- **OS**: Ubuntu Latest
- **Python**: 3.11
- **Features**:
  - Installs system dependencies
  - Sets up test data
  - Runs CLI tests
  - Runs pytest

### 2. **Comprehensive CI/CD** (`.github/workflows/ci.yml`)

- **Purpose**: Full CI/CD pipeline
- **Features**:
  - Linting (Ruff, Black)
  - Unit tests (Python 3.10, 3.11, 3.12)
  - Integration tests
  - Security scanning (Safety, Bandit)
  - Coverage reporting
  - Package building (on main branch)

### 3. **Multi-OS Testing** (`.github/workflows/multi-os-test.yml`)

- **Purpose**: Cross-platform testing
- **OS**: Ubuntu, macOS (Windows not included currently but may be reintroduced)
- **Python**: 3.10, 3.11, 3.12
- **Features**: Tests compatibility across different operating systems

### 4. **Basic Test Suite** (`.github/workflows/test.yml`)

- **Purpose**: Original test workflow
- **Features**:
  - Multiple Python versions
  - System dependency installation
  - Test data creation
  - Coverage reporting

## Setup Scripts

### Test Data Setup (`.github/scripts/setup_test_data.py`)

Creates dummy test files when example data is not available:

- PDF documents
- CSV files
- Word documents
- Images
- Allow/deny lists
- OCR output files

## Usage

### Running Tests Locally

```bash
# Install dependencies
pip install -r requirements.txt
pip install pytest pytest-cov

# Setup test data
python .github/scripts/setup_test_data.py

# Run tests
cd test
python test.py
```

### GitHub Actions Triggers

1. **Push to main/dev**: Runs all tests
2. **Pull Request**: Runs tests and linting
3. **Daily Schedule**: Runs tests at 2 AM UTC
4. **Manual Trigger**: Can be triggered manually from GitHub

## Configuration

### Environment Variables

- `PYTHON_VERSION`: Default Python version (3.11)
- `PYTHONPATH`: Set automatically for test discovery

### Caching

- Pip dependencies are cached for faster builds
- Cache key based on requirements.txt hash

### Artifacts

- Test results (JUnit XML)
- Coverage reports (HTML, XML)
- Security reports
- Build artifacts (on main branch)

## Test Data

The workflows automatically create test data when example files are missing:

### Required Files Created:

- `example_data/example_of_emails_sent_to_a_professor_before_applying.pdf`
- `example_data/combined_case_notes.csv`
- `example_data/Bold minimalist professional cover letter.docx`
- `example_data/example_complaint_letter.jpg`
- `example_data/test_allow_list_*.csv`
- `example_data/partnership_toolkit_redact_*.csv`
- `example_data/example_outputs/doubled_output_joined.pdf_ocr_output.csv`

### Dependencies Installed:

- **System**: tesseract-ocr, poppler-utils, OpenGL libraries
- **Python**: All requirements.txt packages + pytest, reportlab, pillow

## Workflow Status

### Success Criteria:

- ✅ All tests pass
- ✅ No linting errors
- ✅ Security checks pass
- ✅ Coverage meets threshold (if configured)

### Failure Handling:

- Tests are designed to skip gracefully if files are missing
- AWS tests are expected to fail without credentials
- System dependency failures are handled with fallbacks

## Customization

### Adding New Tests:

1. Add test methods to `test/test.py`
2. Update test data in `setup_test_data.py` if needed
3. Tests will automatically run in all workflows

### Modifying Workflows:

1. Edit the appropriate `.yml` file
2. Test locally first
3. Push to trigger the workflow

### Environment-Specific Settings:

- **Ubuntu**: Full system dependencies
- **Windows**: Python packages only
- **macOS**: Homebrew dependencies

## Troubleshooting

### Common Issues:

1. **Missing Dependencies**:
   - Check system dependency installation
   - Verify Python package versions
2. **Test Failures**:
   - Check test data creation
   - Verify file paths
   - Review test output logs
3. **AWS Test Failures**:
   - Expected without credentials
   - Tests are designed to handle this gracefully
4. **System Dependency Issues**:
   - Different OS have different requirements
   - Check the specific OS section in workflows

### Debug Mode:

Add `--verbose` or `-v` flags to pytest commands for more detailed output.

## Security

- Dependencies are scanned with Safety
- Code is scanned with Bandit
- No secrets are exposed in logs
- Test data is temporary and cleaned up

## Performance

- Tests run in parallel where possible
- Dependencies are cached
- Only necessary system packages are installed
- Test data is created efficiently

## Monitoring

- Workflow status is visible in GitHub Actions tab
- Coverage reports are uploaded to Codecov
- Test results are available as artifacts
- Security reports are generated and stored
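
## Example: Skipping a Test When a File Is Missing

The Failure Handling section notes that tests are designed to skip gracefully if files are missing. The sketch below shows one way such a test could be written with the standard `unittest` module. It is an illustration only: the class name, the helper `run_example_tests`, and the path `example_data/missing_example.pdf` are hypothetical and are not taken from `test/test.py`, which may handle missing files differently.

```python
import os
import unittest

# Hypothetical path, for illustration only; the real tests in test/test.py
# may locate their input files differently.
EXAMPLE_PDF = os.path.join("example_data", "missing_example.pdf")


class ExampleFileTests(unittest.TestCase):
    def test_requires_example_file(self):
        # Report the test as skipped (not failed) when the input is absent.
        if not os.path.isfile(EXAMPLE_PDF):
            self.skipTest(f"{EXAMPLE_PDF} not found; skipping")
        # Only reached when the file exists: check that it is not empty.
        self.assertGreater(os.path.getsize(EXAMPLE_PDF), 0)


def run_example_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleFileTests)
    # Send the runner's report to os.devnull so only the result is inspected.
    with open(os.devnull, "w") as devnull:
        runner = unittest.TextTestRunner(stream=devnull)
        return runner.run(suite)
```

Calling `run_example_tests()` in a checkout without that file should yield a result whose `skipped` list has one entry and whose `wasSuccessful()` is true, so a missing file does not turn the workflow red. The same pattern works under pytest, which collects `unittest.TestCase` classes.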