# CUA2 - Computer Use Agent 2
An AI-powered automation interface featuring real-time agent task processing, VNC streaming, and step-by-step execution visualization.
## Overview
CUA2 is a full-stack application that provides a modern web interface for AI agents to perform automated computer tasks. The system features real-time WebSocket communication between a FastAPI backend and React frontend, allowing users to monitor agent execution, view screenshots, track token usage, and stream VNC sessions.
## Architecture

## Tech Stack
### Backend (`cua2-core`)
- **FastAPI**
- **Uvicorn**
- **smolagents** - AI agent framework with OpenAI/LiteLLM support
### Frontend (`cua2-front`)
- **React** with TypeScript
- **Vite**
## Prerequisites
- **Python** 3.10 or higher
- **Node.js** 18 or higher
- **npm**
- **uv** - Python package manager
### Installing uv
**macOS/Linux:**
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
For more installation options, visit: https://docs.astral.sh/uv/getting-started/installation/
## Getting Started
### 1. Clone the Repository
```bash
git clone https://github.com/huggingface/CUA2.git
cd CUA2
```
### 2. Install Dependencies
Use the Makefile for quick setup:
```bash
make sync
```
This will:
- Install Python dependencies using `uv`
- Install Node.js dependencies for the frontend
Or install manually:
```bash
# Backend dependencies
cd cua2-core
uv sync --all-extras
# Frontend dependencies
cd ../cua2-front
npm install
```
### 3. Environment Configuration
Copy the example environment file and configure your settings:
```bash
cd cua2-core
cp env.example .env
```
Edit `.env` with your configuration:
- API keys for OpenAI/LiteLLM
- Database connections (if applicable)
- Other service credentials
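Before starting the backend, it can help to confirm that `.env` defines what you expect. The following is a minimal sketch and not part of CUA2: it parses plain `KEY=VALUE` lines with the standard library and reports keys that are missing or empty. The key name `OPENAI_API_KEY` is an assumption used for illustration; `env.example` lists the names the project actually reads.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    # Hypothetical key name; replace with the entries listed in env.example.
    required = ["OPENAI_API_KEY"]
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    print("Missing:", missing_keys(env, required))
```

Run it from `cua2-core` so that the relative `.env` path resolves. The parser deliberately ignores shell features such as `export` prefixes and variable interpolation.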
### 4. Start Development Servers
#### Option 1: Using Makefile (Recommended)
Open two terminal windows:
**Terminal 1 - Backend:**
```bash
make dev-backend
```
**Terminal 2 - Frontend:**
```bash
make dev-frontend
```
#### Option 2: Manual Start
**Terminal 1 - Backend:**
```bash
cd cua2-core
uv run uvicorn cua2_core.main:app --reload --host 0.0.0.0 --port 8000
```
**Terminal 2 - Frontend:**
```bash
cd cua2-front
npm run dev
```
### 5. Access the Application
- **Frontend**: http://localhost:8080
- **Backend API**: http://localhost:8000
- **API Documentation**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
## Project Structure
```
CUA2/
├── cua2-core/                            # Backend application
│   ├── src/
│   │   └── cua2_core/
│   │       ├── app.py                    # FastAPI application setup
│   │       ├── main.py                   # Application entry point
│   │       ├── models/
│   │       │   └── models.py             # Pydantic models
│   │       ├── routes/
│   │       │   ├── routes.py             # REST API endpoints
│   │       │   └── websocket.py          # WebSocket endpoint
│   │       ├── services/
│   │       │   ├── agent_service.py      # Agent task processing
│   │       │   └── simulation_metadata/  # Demo data
│   │       └── websocket/
│   │           └── websocket_manager.py  # WebSocket management
│   ├── pyproject.toml                    # Python dependencies
│   └── env.example                       # Environment variables template
│
├── cua2-front/                           # Frontend application
│   ├── src/
│   │   ├── App.tsx                       # Main application component
│   │   ├── pages/
│   │   │   └── Index.tsx                 # Main page
│   │   ├── components/
│   │   │   └── mock/                     # UI components
│   │   ├── hooks/
│   │   │   └── useWebSocket.ts           # WebSocket hook
│   │   └── types/
│   │       └── agent.ts                  # TypeScript type definitions
│   ├── package.json                      # Node dependencies
│   └── vite.config.ts                    # Vite configuration
│
├── Makefile                              # Development commands
└── README.md                             # This file
```
## API Endpoints
### REST API
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/health` | Health check with WebSocket connection count |
| GET | `/tasks` | Get all active tasks |
| GET | `/tasks/{task_id}` | Get specific task status |
| GET | `/docs` | Interactive API documentation (Swagger) |
| GET | `/redoc` | Alternative API documentation (ReDoc) |
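
The GET endpoints in the table can be queried from a short script. The following is a sketch using only the Python standard library, assuming the backend is reachable at `http://localhost:8000` as described under "Access the Application". The response bodies are not documented here, so the code simply returns whatever JSON the server sends; the helper names (`endpoint`, `task_url`, `get_json`, `print_status`) are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Backend address taken from this README; adjust if you changed the port.
BASE_URL = "http://localhost:8000"


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and a path without doubling or dropping slashes."""
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"


def task_url(task_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /tasks/{task_id}, percent-encoding the id."""
    return endpoint(f"tasks/{quote(task_id, safe='')}", base_url)


def get_json(url: str, timeout: float = 5.0):
    """Issue a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def print_status() -> None:
    """Print the health check and the active task list (requires a running backend)."""
    print(get_json(endpoint("/health")))
    print(get_json(endpoint("/tasks")))
```

With the backend running, calling `print_status()` prints both responses; `get_json(task_url(some_id))` fetches a single task, where `some_id` is an id taken from the `/tasks` response.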
### WebSocket
#### Client → Server Events
- `user_task` - New user task request
#### Server → Client Events
- `agent_start` - Agent begins processing
- `agent_progress` - New step completed with image and metadata
- `agent_complete` - Task finished successfully
- `agent_error` - Error occurred during processing
- `vnc_url_set` - VNC stream URL available
- `vnc_url_unset` - VNC stream ended
- `heartbeat` - Connection keep-alive
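
A client typically routes each incoming message to a handler keyed by event name. The sketch below illustrates that pattern for the events listed above. It is not the project's implementation: it assumes each message is a JSON object whose `type` field carries the event name, and the `instruction` field in `make_user_task` is likewise hypothetical. The actual message shapes are defined in `models.py` and `agent.ts`.

```python
import json
from typing import Callable

# Server-to-client event names listed in this README.
SERVER_EVENTS = {
    "agent_start", "agent_progress", "agent_complete", "agent_error",
    "vnc_url_set", "vnc_url_unset", "heartbeat",
}

Handler = Callable[[dict], None]


class EventDispatcher:
    """Route decoded WebSocket messages to handlers registered per event."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = {}

    def on(self, event: str, handler: Handler) -> None:
        """Register a handler; reject names that are not known server events."""
        if event not in SERVER_EVENTS:
            raise ValueError(f"unknown server event: {event}")
        self._handlers.setdefault(event, []).append(handler)

    def dispatch(self, raw: str) -> bool:
        """Decode one message and return True if at least one handler ran."""
        message = json.loads(raw)
        if not isinstance(message, dict):
            return False
        # ASSUMPTION: the event name is stored under "type".
        handlers = self._handlers.get(message.get("type"), [])
        for handler in handlers:
            handler(message)
        return bool(handlers)


def make_user_task(instruction: str) -> str:
    """Serialize a user_task request (field names are hypothetical)."""
    return json.dumps({"type": "user_task", "instruction": instruction})
```

In use, a WebSocket client would pass each received text frame to `dispatch` and send the string returned by `make_user_task` to start a task. Keeping the routing separate from the transport makes it testable without a running server.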
## Development
### Available Make Commands
```bash
make sync # Sync all dependencies (Python + Node.js)
make dev-backend # Start backend development server
make dev-frontend # Start frontend development server
make pre-commit # Run pre-commit hooks
make clean # Clean build artifacts and caches
```
### Code Quality
```bash
# Backend
make pre-commit
```
**Happy Coding!**