CUA2 - Computer Use Agent 2
An AI-powered automation interface featuring real-time agent task processing, VNC streaming, and step-by-step execution visualization.
Overview
CUA2 is a full-stack application that provides a modern web interface for AI agents to perform automated computer tasks. The system features real-time WebSocket communication between a FastAPI backend and React frontend, allowing users to monitor agent execution, view screenshots, track token usage, and stream VNC sessions.
Architecture
Tech Stack
Backend (cua2-core)
- FastAPI
- Uvicorn
- smolagents - AI agent framework with OpenAI/LiteLLM support
Frontend (cua2-front)
- React (TypeScript)
- Vite
Prerequisites
- Python 3.10 or higher
- Node.js 18 or higher
- npm
- uv - Python package manager
Installing uv
macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
For more installation options, visit: https://docs.astral.sh/uv/getting-started/installation/
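The prerequisites above can be checked with a short script before running the setup steps. This is a minimal sketch: the function name `check_prerequisites` is hypothetical, and it only confirms that `node`, `npm`, and `uv` are on the PATH. It does not verify that Node.js is version 18 or higher.

```python
import shutil
import sys

# Minimum Python version taken from the prerequisites list.
MIN_PYTHON = (3, 10)
# Command-line tools the setup steps rely on.
REQUIRED_TOOLS = ("node", "npm", "uv")


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

The `version_info` and `which` parameters exist only so the check can be exercised without changing the local environment.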
Getting Started
1. Clone the Repository
git clone https://github.com/huggingface/CUA2.git
cd CUA2
2. Install Dependencies
Use the Makefile for quick setup:
make sync
This will:
- Install Python dependencies using uv
- Install Node.js dependencies for the frontend
Or install manually:
# Backend dependencies
cd cua2-core
uv sync --all-extras
# Frontend dependencies
cd ../cua2-front
npm install
3. Environment Configuration
Copy the example environment file and configure your settings:
cd cua2-core
cp env.example .env
Edit .env with your configuration:
- API keys for OpenAI/LiteLLM
- Database connections (if applicable)
- Other service credentials
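Before starting the servers, it can help to confirm that .env actually contains the values you expect. The sketch below is illustrative only: this README does not list the variable names, so `OPENAI_API_KEY` is a hypothetical placeholder. Check env.example for the real keys and adjust `EXPECTED_KEYS` accordingly.

```python
from pathlib import Path

# Hypothetical key names for illustration only; the real names are
# whatever env.example in cua2-core defines.
EXPECTED_KEYS = ("OPENAI_API_KEY",)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in expected if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        print(missing_keys(env_file.read_text()))
```

Run from inside cua2-core, this prints the list of expected keys that are still unset. The parser handles only plain KEY=VALUE lines, not the full dotenv syntax.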
4. Start Development Servers
Option 1: Using Makefile (Recommended)
Open two terminal windows:
Terminal 1 - Backend:
make dev-backend
Terminal 2 - Frontend:
make dev-frontend
Option 2: Manual Start
Terminal 1 - Backend:
cd cua2-core
uv run uvicorn cua2_core.main:app --reload --host 0.0.0.0 --port 8000
Terminal 2 - Frontend:
cd cua2-front
npm run dev
5. Access the Application
- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Project Structure
CUA2/
├── cua2-core/                            # Backend application
│   ├── src/
│   │   └── cua2_core/
│   │       ├── app.py                    # FastAPI application setup
│   │       ├── main.py                   # Application entry point
│   │       ├── models/
│   │       │   └── models.py             # Pydantic models
│   │       ├── routes/
│   │       │   ├── routes.py             # REST API endpoints
│   │       │   └── websocket.py          # WebSocket endpoint
│   │       ├── services/
│   │       │   ├── agent_service.py      # Agent task processing
│   │       │   └── simulation_metadata/  # Demo data
│   │       └── websocket/
│   │           └── websocket_manager.py  # WebSocket management
│   ├── pyproject.toml                    # Python dependencies
│   └── env.example                       # Environment variables template
│
├── cua2-front/                           # Frontend application
│   ├── src/
│   │   ├── App.tsx                       # Main application component
│   │   ├── pages/
│   │   │   └── Index.tsx                 # Main page
│   │   ├── components/
│   │   │   └── mock/                     # UI components
│   │   ├── hooks/
│   │   │   └── useWebSocket.ts           # WebSocket hook
│   │   └── types/
│   │       └── agent.ts                  # TypeScript type definitions
│   ├── package.json                      # Node dependencies
│   └── vite.config.ts                    # Vite configuration
│
├── Makefile                              # Development commands
└── README.md                             # This file
API Endpoints
REST API
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check with WebSocket connection count |
| GET | /tasks | Get all active tasks |
| GET | /tasks/{task_id} | Get specific task status |
| GET | /docs | Interactive API documentation (Swagger) |
| GET | /redoc | Alternative API documentation (ReDoc) |
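The REST endpoints can be queried from a script using only the Python standard library. The following is a rough sketch, not an official client: the helper names (`task_url`, `get_json`, `print_status`) are hypothetical, the response schema is not documented here, and the code assumes the endpoints return JSON bodies.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Default backend address from the "Access the Application" step.
BASE_URL = "http://localhost:8000"


def task_url(task_id, base_url=BASE_URL):
    """Build the URL for GET /tasks/{task_id}, percent-escaping the ID."""
    return f"{base_url.rstrip('/')}/tasks/{quote(str(task_id), safe='')}"


def get_json(url, timeout=5.0):
    """Send a GET request and decode the response body as JSON (assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def print_status(base_url=BASE_URL):
    """Print the health check and the active task list.

    Requires the backend to be running at base_url.
    """
    print(get_json(f"{base_url}/health"))
    print(get_json(f"{base_url}/tasks"))
```

With the backend started as described in step 4, calling `print_status()` prints whatever the two endpoints return, and `get_json(task_url(some_id))` fetches a single task.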
WebSocket
Client → Server Events
- user_task - New user task request
Server → Client Events
- agent_start - Agent begins processing
- agent_progress - New step completed with image and metadata
- agent_complete - Task finished successfully
- agent_error - Error occurred during processing
- vnc_url_set - VNC stream URL available
- vnc_url_unset - VNC stream ended
- heartbeat - Connection keep-alive
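A client consuming these events needs to recognize each event name and decide when a task is finished. The sketch below shows one way to do that in Python. It rests on two assumptions that are not confirmed by this README: that each message is a JSON object with a `type` field naming the event, and that `agent_complete` and `agent_error` end a task (inferred from their descriptions). The function names are hypothetical.

```python
import json

# Event names copied from the WebSocket event lists.
CLIENT_EVENTS = {"user_task"}
SERVER_EVENTS = {
    "agent_start", "agent_progress", "agent_complete", "agent_error",
    "vnc_url_set", "vnc_url_unset", "heartbeat",
}
# Assumption: after these events no further steps arrive for the task.
TERMINAL_EVENTS = {"agent_complete", "agent_error"}


def parse_server_event(raw):
    """Decode a raw message and return (event_name, payload).

    Assumes a JSON object with a "type" field naming the event; this
    envelope is an assumption, not the documented wire format.
    """
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")
    name = message.get("type")
    if name not in SERVER_EVENTS:
        raise ValueError(f"unknown server event: {name!r}")
    return name, message


def is_terminal(name):
    """Return True if the event is treated as ending the current task."""
    return name in TERMINAL_EVENTS
```

Rejecting unknown names early makes protocol drift between backend and client visible instead of silently ignored. The actual message shapes are defined in models/models.py and types/agent.ts.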
Development
Available Make Commands
make sync # Sync all dependencies (Python + Node.js)
make dev-backend # Start backend development server
make dev-frontend # Start frontend development server
make pre-commit # Run pre-commit hooks
make clean # Clean build artifacts and caches
Code Quality
# Backend
make pre-commit
Build for Production
# Frontend
cd cua2-front
npm run build
# The build output will be in cua2-front/dist/
Happy Coding!
