Robin L. M. Cheung, MBA committed on
Commit 01504c4 · 1 Parent(s): 0b57493

feat: Add local CUDA support, MCP server, Spaces GPU selection, and stacking roadmap


- Remove ZeroGPU dependency, optimize for local CUDA (4090/3090/3070ti)
- Add MCP server (mcp_server.py) with sharp_predict, list_outputs tools
- Add hardware_config.py for Spaces GPU selection with persistence
- Add Settings tab in Gradio UI for hardware configuration
- Support all HuggingFace Spaces GPUs (ZeroGPU through A100)
- Enable Gradio API by default (show_api=True)
- Add comprehensive WARP.md with codebase map and documentation
- Complete multi-image stacking roadmap with implementation phases

New files:
- WARP.md: Project guidance for WARP/AI assistants
- mcp_server.py: MCP server for programmatic access
- hardware_config.py: GPU hardware selection module

Environment:
- SHARP_PORT (default: 49200) for Gradio
- SHARP_MCP_PORT (default: 49201) for MCP
- CUDA_VISIBLE_DEVICES for multi-GPU selection

Files changed (8)
  1. .gitignore +1 -0
  2. WARP.md +344 -0
  3. app.py +145 -3
  4. hardware_config.py +252 -0
  5. mcp_server.py +224 -0
  6. model_utils.py +71 -20
  7. pyproject.toml +2 -1
  8. requirements.txt +2 -1
.gitignore CHANGED
@@ -217,3 +217,4 @@ __marimo__/
 
 # Kilo Code
 .kilocode/
+.hardware_config.json
WARP.md ADDED
@@ -0,0 +1,344 @@
# WARP.md

This file provides guidance to WARP (warp.dev) when working with code in this repository.

## Project Overview

SHARP (Single-image 3D Gaussian scene prediction) Gradio demo. Wraps Apple's SHARP model to predict 3D Gaussian scenes from single images, export `.ply` files, and optionally render camera trajectory videos.

Optimized for local CUDA (4090/3090/3070 Ti) or HuggingFace Spaces GPUs. Includes an MCP server for programmatic access.

## Development Commands

```bash
# Install dependencies (uses uv package manager)
uv sync

# Run the Gradio app (port 49200 by default)
uv run python app.py

# Run MCP server (stdio transport)
uv run python mcp_server.py

# Lint with ruff
uv run ruff check .
uv run ruff format .
```

## Codebase Map

```
ml-sharp/
├── app.py                            # Gradio UI (tabs: Run, Examples, About, Settings)
│   ├── build_demo()                  # Main UI builder
│   ├── run_sharp()                   # Inference entrypoint called by UI
│   └── discover_examples()           # Load precompiled examples
├── model_utils.py                    # Core inference + rendering
│   ├── ModelWrapper                  # Checkpoint loading, predictor caching
│   │   ├── predict_to_ply()          # Image → Gaussians → PLY
│   │   └── render_video()            # Gaussians → MP4 trajectory
│   ├── PredictionOutputs             # Dataclass for inference results
│   ├── configure_gpu_mode()          # Switch between local/Spaces GPU
│   └── predict_and_maybe_render_gpu  # Module-level entrypoint
├── hardware_config.py                # GPU hardware selection & persistence
│   ├── HardwareConfig                # Dataclass with mode, hardware, duration
│   ├── get_hardware_choices()        # Dropdown options
│   └── SPACES_HARDWARE_SPECS         # HF Spaces GPU specs & pricing
├── mcp_server.py                     # MCP server for programmatic access
│   ├── sharp_predict                 # Tool: image → PLY + video
│   ├── list_outputs                  # Tool: list generated files
│   └── sharp://info                  # Resource: GPU status, config
├── assets/examples/                  # Precompiled example outputs
├── outputs/                          # Runtime outputs (PLY, MP4)
├── .hardware_config.json             # Persisted hardware settings
├── pyproject.toml                    # Dependencies (uv)
└── WARP.md                           # This file
```

### Data Flow

```
Image → load_rgb() → predict_image() → Gaussians3D → save_ply() → PLY
                                            │
                                            └→ render_video() → MP4
```

## Architecture

### Core Files

- `app.py` — Gradio UI with tabs for Run/Examples/About/Settings. Handles example discovery from `assets/examples/` via `manifest.json` or filename conventions.
- `model_utils.py` — SHARP model wrapper with checkpoint loading (HF Hub → CDN fallback), inference via `predict_to_ply()`, and CUDA video rendering via `render_video()`.
- `hardware_config.py` — GPU hardware selection between local CUDA and HuggingFace Spaces. Persists to `.hardware_config.json`.
- `mcp_server.py` — MCP server exposing the `sharp_predict` tool and the `sharp://info` resource.

### Key Patterns

**Local CUDA mode**: Model kept on GPU by default (`SHARP_KEEP_MODEL_ON_DEVICE=1`) for better performance on dedicated GPUs.

**Spaces GPU mode**: Uses the `@spaces.GPU` decorator for dynamic GPU allocation on HuggingFace Spaces. Configurable via the Settings tab.

**Checkpoint resolution order** (sketched below):
1. `SHARP_CHECKPOINT_PATH` env var
2. HF Hub cache
3. HF Hub download
4. Upstream CDN via `torch.hub`

**Video rendering**: Requires CUDA (gsplat). Falls back gracefully on CPU-only systems by returning `None` for the video path.

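A minimal sketch of that fallback chain. The helper name and CDN URL here are hypothetical; the real logic lives in `model_utils.py` and may differ in details:

```python
# Hypothetical sketch of the checkpoint fallback chain described above.
import os
from pathlib import Path

import torch
from huggingface_hub import hf_hub_download


def _resolve_checkpoint() -> Path:
    override = os.getenv("SHARP_CHECKPOINT_PATH")
    if override and Path(override).exists():
        return Path(override)  # 1. explicit local override

    repo_id = os.getenv("SHARP_HF_REPO_ID", "apple/Sharp")
    filename = os.getenv("SHARP_HF_FILENAME", "sharp_2572gikvuh.pt")
    try:
        # 2./3. HF Hub cache hit, else download into the cache
        return Path(hf_hub_download(repo_id=repo_id, filename=filename))
    except Exception:
        # 4. last resort: upstream CDN via torch.hub (URL is a placeholder)
        dst = Path("checkpoints") / filename
        dst.parent.mkdir(exist_ok=True)
        torch.hub.download_url_to_file(f"https://example.com/{filename}", str(dst))
        return dst
```
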
## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `SHARP_PORT` | `49200` | Gradio server port |
| `SHARP_MCP_PORT` | `49201` | MCP server port |
| `SHARP_CHECKPOINT_PATH` | — | Override local checkpoint path |
| `SHARP_HF_REPO_ID` | `apple/Sharp` | HuggingFace repo |
| `SHARP_HF_FILENAME` | `sharp_2572gikvuh.pt` | Checkpoint filename |
| `SHARP_KEEP_MODEL_ON_DEVICE` | `1` | Keep model on GPU (set `0` to free VRAM) |
| `CUDA_VISIBLE_DEVICES` | — | GPU selection (e.g., `0` or `0,1`) |

## Gradio API

API is enabled by default. Access at `http://localhost:49200/?view=api`.

### Endpoint: `/api/run_sharp`

```python
import requests

response = requests.post(
    "http://localhost:49200/api/run_sharp",
    json={
        "data": [
            "/path/to/image.jpg",  # image_path
            "rotate_forward",      # trajectory_type
            0,                     # output_long_side (0 = match input)
            60,                    # num_frames
            30,                    # fps
            True,                  # render_video
        ]
    },
)
result = response.json()["data"]
video_path, ply_path, status = result
```

## MCP Server

Run the MCP server for integration with AI agents:

```bash
uv run python mcp_server.py
```

### MCP Config (for clients like Warp)

```json
{
  "mcpServers": {
    "sharp": {
      "command": "uv",
      "args": ["run", "python", "mcp_server.py"],
      "cwd": "/home/robin/CascadeProjects/ml-sharp"
    }
  }
}
```

### Tools

- `sharp_predict(image_path, render_video=True, trajectory_type="rotate_forward", ...)` — Run inference
- `list_outputs()` — List generated PLY/MP4 files

### Resources

- `sharp://info` — GPU status, configuration
- `sharp://help` — Usage documentation

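For a quick programmatic smoke test, a sketch using the official `mcp` Python SDK's stdio client (assumes `mcp>=1.0`; the image path is a placeholder):

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Spawn the server over stdio, exactly as an MCP client would
    params = StdioServerParameters(command="uv", args=["run", "python", "mcp_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "sharp_predict",
                {"image_path": "/abs/path/to/image.jpg", "render_video": False},
            )
            print(result.content)


asyncio.run(main())
```
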
## Multi-GPU Configuration

Select a GPU via environment variable:

```bash
# Use GPU 0 (e.g., 4090)
CUDA_VISIBLE_DEVICES=0 uv run python app.py

# Use GPU 1 (e.g., 3090)
CUDA_VISIBLE_DEVICES=1 uv run python app.py
```

## HuggingFace Spaces GPU

The app supports HuggingFace Spaces paid GPUs for faster inference or larger models. Configure via the **Settings** tab.

### Available Hardware

| Hardware | VRAM | Price/hr | Best For |
|----------|------|----------|----------|
| ZeroGPU (H200) | 70GB | Free (PRO) | Demos, dynamic allocation |
| T4 small | 16GB | $0.40 | Light workloads |
| T4 medium | 16GB | $0.60 | Standard workloads |
| L4x1 | 24GB | $0.80 | Standard inference |
| L4x4 | 96GB | $3.80 | Multi-GPU |
| L40Sx1 | 48GB | $1.80 | Large models |
| L40Sx4 | 192GB | $8.30 | Very large models |
| A10G small | 24GB | $1.00 | Balanced |
| A10G large | 24GB | $1.50 | More CPU/RAM |
| A100 large | 80GB | $2.50 | Maximum VRAM |

### Deploying to Spaces

1. Push to a HuggingFace Space
2. Set hardware in the Space settings (or use `suggested_hardware` in README.md)
3. The app auto-detects the Spaces environment via the `SPACE_ID` env var

### README.md Metadata for Spaces

```yaml
---
title: SHARP - 3D Gaussian Scene Prediction
emoji: 🔪
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 6.2.0
python_version: 3.13.11
app_file: app.py
suggested_hardware: l4x1  # or zero-gpu, a100-large, etc.
startup_duration_timeout: 1h
preload_from_hub:
  - apple/Sharp sharp_2572gikvuh.pt
---
```

## Examples System

Place precompiled outputs in `assets/examples/`:
- `<name>.{jpg,png,webp}` + `<name>.mp4` + `<name>.ply`
- Or define `assets/examples/manifest.json` with `{label, image, video, ply}` entries, e.g. the sketch below

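A plausible `manifest.json`, assuming a top-level list of entries; the exact schema is whatever `discover_examples()` in `app.py` accepts:

```json
[
  {
    "label": "Living room",
    "image": "living_room.jpg",
    "video": "living_room.mp4",
    "ply": "living_room.ply"
  }
]
```
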
## Multi-Image Stacking Roadmap

SHARP predicts 3D Gaussians from a single image. To "stack" multiple images into a unified scene:

### Required Components

1. **Pose Estimation** (`multi_view.py`)
   - Estimate relative camera poses between images
   - Options: COLMAP, hloc, or PnP-based
   - Transform each prediction to a common world frame

2. **Gaussian Merging** (`gaussian_merge.py`)
   - Concatenate Gaussian parameters (means, covariances, colors, opacities)
   - Deduplicate overlapping regions via density-based filtering
   - Optional: fine-tune the merged scene with a photometric loss

3. **UI Changes**
   - Multi-upload widget
   - Alignment preview/validation
   - Progress indicator for multi-image processing

### Data Structures

```python
@dataclass
class AlignedGaussians:
    gaussians: Gaussians3D
    world_transform: torch.Tensor  # 4x4 SE(3)
    source_image: Path


def merge_gaussians(aligned: list[AlignedGaussians]) -> Gaussians3D:
    # 1. Transform each Gaussian's means by world_transform
    # 2. Concatenate all parameters
    # 3. Density-based pruning in overlapping regions
    ...
```

### Dependencies to Add

- `pycolmap` or `hloc` for pose estimation
- `open3d` for point cloud operations (optional)

### Implementation Phases

#### Phase 1: Basic Multi-Image Pipeline
- [ ] Add `multi_view.py` with `estimate_relative_pose(img1, img2)` using feature matching
- [ ] Add `gaussian_merge.py` with naive concatenation (no dedup)
- [ ] UI: multi-file upload in a new "Stack" tab
- [ ] Export merged PLY

#### Phase 2: Pose Estimation Options
- [ ] Integrate COLMAP sparse reconstruction for >2 images
- [ ] Add hloc (Hierarchical Localization) as a lightweight alternative
- [ ] Fallback: manual pose input for known camera rigs

#### Phase 3: Gaussian Deduplication
- [ ] Implement KD-tree based nearest-neighbor pruning (see sketch below)
- [ ] Merge overlapping Gaussians by averaging parameters
- [ ] Add confidence weighting based on view angle

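A sketch of how the Phase 3 pruning could start, using `scipy` (assumed greedy keep-first policy and radius; not part of the codebase yet):

```python
import numpy as np
from scipy.spatial import cKDTree


def prune_duplicates(means: np.ndarray, radius: float = 0.01) -> np.ndarray:
    """Greedy radius-based pruning over (N, 3) Gaussian means (in meters).

    Keeps the first Gaussian seen and drops later ones within `radius` of it.
    """
    tree = cKDTree(means)
    keep = np.ones(len(means), dtype=bool)
    for i in range(len(means)):
        if not keep[i]:
            continue
        for j in tree.query_ball_point(means[i], r=radius):
            if j > i:
                keep[j] = False  # duplicate of an earlier, kept Gaussian
    return np.flatnonzero(keep)
```
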
#### Phase 4: Refinement (Optional)
- [ ] Photometric loss optimization on the merged scene
- [ ] Iterative alignment refinement
- [ ] Support for depth priors from stereo/MVS

### API Design

```python
# multi_view.py
def estimate_poses(
    images: list[Path],
    method: Literal["colmap", "hloc", "pnp"] = "hloc",
) -> list[np.ndarray]:  # list of 4x4 world-to-camera transforms
    ...


# gaussian_merge.py
def merge_scenes(
    predictions: list[PredictionOutputs],
    poses: list[np.ndarray],
    deduplicate: bool = True,
    dedup_radius: float = 0.01,  # meters
) -> Gaussians3D:
    ...


# app.py (Stack tab)
def run_stack(
    images: list[str],  # Gradio multi-file upload
    pose_method: str,
    deduplicate: bool,
) -> tuple[str | None, str | None, str]:  # video, ply, status
    ...
```

### MCP Extension

```python
# mcp_server.py additions
@mcp.tool()
def sharp_stack(
    image_paths: list[str],
    pose_method: str = "hloc",
    deduplicate: bool = True,
    render_video: bool = True,
) -> dict:
    """Stack multiple images into a unified 3D Gaussian scene."""
    ...
```

### Technical Considerations

**Coordinate Systems** (see the sketch after this list):
- SHARP outputs Gaussians in camera-centric coordinates
- These must be transformed to the world frame using the estimated poses
- Convention: Y-up, -Z forward (OpenGL style)

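A minimal sketch of that world-frame transform for the means only (assumes a 4x4 camera-to-world matrix; rotating the covariances is omitted here):

```python
import torch


def means_to_world(means_cam: torch.Tensor, T_wc: torch.Tensor) -> torch.Tensor:
    """Map (N, 3) camera-frame Gaussian means into the world frame.

    T_wc is a 4x4 camera-to-world SE(3) transform.
    """
    R, t = T_wc[:3, :3], T_wc[:3, 3]
    return means_cam @ R.T + t  # rotate, then translate
```
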
**Memory Management**:
- Each SHARP prediction uses roughly 50-200MB of GPU memory
- Batch processing with model unload between predictions (see the loop below)
- Consider a streaming merge for >10 images

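A hedged sketch of such a batch loop; `predict_fn` is a stand-in for whatever wraps `ModelWrapper.predict_to_ply()`, whose exact signature may differ:

```python
import gc
from typing import Callable

import torch


def predict_sequential(predict_fn: Callable[[str], str], image_paths: list[str]) -> list[str]:
    ply_paths: list[str] = []
    for path in image_paths:
        ply_paths.append(predict_fn(path))
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # release cached VRAM between predictions
    return ply_paths
```
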
**Quality Metrics**:
- Reprojection error for pose validation (see the sketch below)
- Gaussian density histogram for coverage analysis
- Visual comparison with ground truth (if available)
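
For the first metric, a standard pinhole reprojection-error check (illustrative; assumes known intrinsics `K` and a world-to-camera transform `T_cw`):

```python
import numpy as np


def mean_reprojection_error(
    pts_world: np.ndarray,    # (N, 3) world points
    T_cw: np.ndarray,         # 4x4 world-to-camera transform
    K: np.ndarray,            # 3x3 camera intrinsics
    uv_observed: np.ndarray,  # (N, 2) matched pixel coordinates
) -> float:
    pts_cam = pts_world @ T_cw[:3, :3].T + T_cw[:3, 3]
    proj = pts_cam @ K.T
    uv = proj[:, :2] / proj[:, 2:3]  # perspective divide
    return float(np.linalg.norm(uv - uv_observed, axis=1).mean())
```
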
app.py CHANGED
@@ -29,7 +29,22 @@ from typing import Final
 
 import gradio as gr
 
-from model_utils import TrajectoryType, predict_and_maybe_render_gpu
+import os
+
+from model_utils import (
+    TrajectoryType,
+    predict_and_maybe_render_gpu,
+    configure_gpu_mode,
+    get_gpu_status,
+)
+from hardware_config import (
+    get_hardware_choices,
+    parse_hardware_choice,
+    get_config,
+    update_config,
+    SPACES_HARDWARE_SPECS,
+    is_running_on_spaces,
+)
 
 # -----------------------------------------------------------------------------
 # Paths & constants
@@ -42,6 +57,7 @@ EXAMPLES_DIR: Final[Path] = ASSETS_DIR / "examples"
 
 IMAGE_EXTS: Final[tuple[str, ...]] = (".png", ".jpg", ".jpeg", ".webp")
 DEFAULT_QUEUE_MAX_SIZE: Final[int] = 32
+DEFAULT_PORT: Final[int] = int(os.getenv("SHARP_PORT", "49200"))
 
 THEME: Final = gr.themes.Soft(
     primary_hue="indigo",
@@ -239,6 +255,68 @@ def _validate_image(image_path: str | None) -> None:
         raise gr.Error("Upload an image first.")
 
 
+# -----------------------------------------------------------------------------
+# Hardware Configuration
+# -----------------------------------------------------------------------------
+
+
+def _get_current_hardware_value() -> str:
+    """Get current hardware choice value for dropdown."""
+    config = get_config()
+    if config.mode == "local":
+        return "local"
+    return f"spaces:{config.spaces_hardware}"
+
+
+def _format_gpu_status() -> str:
+    """Format GPU status as markdown."""
+    status = get_gpu_status()
+    config = get_config()
+
+    lines = ["### Current Status"]
+    lines.append(f"- **Mode:** {'Local CUDA' if config.mode == 'local' else 'HuggingFace Spaces'}")
+
+    if config.mode == "spaces":
+        hw_spec = SPACES_HARDWARE_SPECS.get(config.spaces_hardware, {})
+        lines.append(f"- **Spaces Hardware:** {hw_spec.get('name', config.spaces_hardware)}")
+        lines.append(f"- **VRAM:** {hw_spec.get('vram', 'N/A')}")
+        lines.append(f"- **Price:** {hw_spec.get('price', 'N/A')}")
+        lines.append(f"- **Duration:** {config.spaces_duration}s")
+    else:
+        lines.append(f"- **CUDA Available:** {'✅ Yes' if status['cuda_available'] else '❌ No'}")
+        lines.append(f"- **Spaces Module:** {'✅ Installed' if status['spaces_available'] else '❌ Not installed'}")
+
+    if status['devices']:
+        lines.append("\n### Local GPUs")
+        for dev in status['devices']:
+            lines.append(f"- **GPU {dev['index']}:** {dev['name']} ({dev['total_memory_gb']}GB)")
+
+    if is_running_on_spaces():
+        lines.append("\n⚠️ *Running on HuggingFace Spaces*")
+
+    return "\n".join(lines)
+
+
+def _apply_hardware_config(choice: str, duration: int) -> str:
+    """Apply hardware configuration and return status."""
+    mode, spaces_hw = parse_hardware_choice(choice)
+
+    # Update config
+    update_config(
+        mode=mode,
+        spaces_hardware=spaces_hw if spaces_hw else "zero-gpu",
+        spaces_duration=duration,
+    )
+
+    # Configure GPU mode in model_utils
+    configure_gpu_mode(
+        use_spaces=(mode == "spaces"),
+        duration=duration,
+    )
+
+    return _format_gpu_status()
+
+
 def run_sharp(
     image_path: str | None,
     trajectory_type: TrajectoryType,
@@ -354,7 +432,7 @@ def build_demo() -> gr.Blocks:
             )
 
             render_toggle = gr.Checkbox(
-                label="Render MP4 (CUDA / ZeroGPU only)",
+                label="Render MP4 (requires CUDA)",
                 value=True,
             )
 
@@ -490,6 +568,65 @@ def build_demo() -> gr.Blocks:
                """.strip()
            )
 
+        with gr.Tab("⚙️ Settings", id="settings"):
+            with gr.Column(elem_id="settings-panel"):
+                gr.Markdown("### GPU Hardware Selection")
+                gr.Markdown(
+                    "Select local CUDA or HuggingFace Spaces GPU for inference. "
+                    "Spaces GPUs require deploying to HuggingFace Spaces."
+                )
+
+                with gr.Row():
+                    with gr.Column(scale=3):
+                        hw_dropdown = gr.Dropdown(
+                            label="Hardware",
+                            choices=get_hardware_choices(),
+                            value=_get_current_hardware_value(),
+                            interactive=True,
+                        )
+
+                        duration_slider = gr.Slider(
+                            label="Spaces GPU Duration (seconds)",
+                            info="Max time for @spaces.GPU decorator (ZeroGPU only)",
+                            minimum=60,
+                            maximum=300,
+                            step=30,
+                            value=get_config().spaces_duration,
+                            interactive=True,
+                        )
+
+                        apply_btn = gr.Button("Apply & Save", variant="primary")
+
+                    with gr.Column(scale=2):
+                        hw_status = gr.Markdown(
+                            value=_format_gpu_status(),
+                            elem_id="hw-status",
+                        )
+
+                apply_btn.click(
+                    fn=_apply_hardware_config,
+                    inputs=[hw_dropdown, duration_slider],
+                    outputs=[hw_status],
+                )
+
+                gr.Markdown(
+                    """
+                    ---
+                    ### Spaces Hardware Reference
+
+                    | Hardware | VRAM | Price | Best For |
+                    |----------|------|-------|----------|
+                    | ZeroGPU (H200) | 70GB | Free (PRO) | Demos, dynamic allocation |
+                    | T4 small/medium | 16GB | $0.40-0.60/hr | Light workloads |
+                    | L4x1 | 24GB | $0.80/hr | Standard inference |
+                    | L40Sx1 | 48GB | $1.80/hr | Large models |
+                    | A10G large | 24GB | $1.50/hr | Balanced cost/performance |
+                    | A100 large | 80GB | $2.50/hr | Maximum VRAM |
+
+                    *Prices as of Dec 2024. See [HuggingFace Spaces GPU docs](https://huggingface.co/docs/hub/spaces-gpus).*
+                    """
+                )
+
     demo.queue(max_size=DEFAULT_QUEUE_MAX_SIZE, default_concurrency_limit=1)
     return demo
 
@@ -497,4 +634,9 @@ def build_demo() -> gr.Blocks:
 demo = build_demo()
 
 if __name__ == "__main__":
-    demo.launch(theme=THEME, css=CSS)
+    demo.launch(
+        theme=THEME,
+        css=CSS,
+        server_port=DEFAULT_PORT,
+        show_api=True,
+    )
hardware_config.py ADDED
@@ -0,0 +1,252 @@
"""Hardware configuration for local CUDA and HuggingFace Spaces GPU selection.

This module provides:
- Hardware mode selection (local CUDA vs Spaces GPU)
- Persistent configuration via JSON file
- HuggingFace Spaces GPU hardware options

Spaces GPU pricing (as of Dec 2024):
- ZeroGPU (H200): Free (PRO subscribers), dynamic allocation
- T4-small: $0.40/hr, 16GB VRAM
- T4-medium: $0.60/hr, 16GB VRAM
- L4x1: $0.80/hr, 24GB VRAM
- L4x4: $3.80/hr, 96GB VRAM
- L40Sx1: $1.80/hr, 48GB VRAM
- L40Sx4: $8.30/hr, 192GB VRAM
- A10G-small: $1.00/hr, 24GB VRAM
- A10G-large: $1.50/hr, 24GB VRAM
- A100-large: $2.50/hr, 80GB VRAM
"""

from __future__ import annotations

import json
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Final, Literal

# Hardware mode: local CUDA or HuggingFace Spaces
HardwareMode = Literal["local", "spaces"]

# Spaces hardware flavors (from HF docs)
SpacesHardware = Literal[
    "zero-gpu",      # ZeroGPU (H200, dynamic, free for PRO)
    "t4-small",      # Nvidia T4 small
    "t4-medium",     # Nvidia T4 medium
    "l4x1",          # 1x Nvidia L4
    "l4x4",          # 4x Nvidia L4
    "l40s-x1",       # 1x Nvidia L40S
    "l40s-x4",       # 4x Nvidia L40S
    "a10g-small",    # Nvidia A10G small
    "a10g-large",    # Nvidia A10G large
    "a10g-largex2",  # 2x Nvidia A10G large
    "a10g-largex4",  # 4x Nvidia A10G large
    "a100-large",    # Nvidia A100 large (80GB)
]

# Hardware specs for display
SPACES_HARDWARE_SPECS: Final[dict[str, dict]] = {
    "zero-gpu": {
        "name": "ZeroGPU (H200)",
        "vram": "70GB",
        "price": "Free (PRO)",
        "description": "Dynamic allocation, best for demos",
    },
    "t4-small": {
        "name": "Nvidia T4 small",
        "vram": "16GB",
        "price": "$0.40/hr",
        "description": "4 vCPU, 15GB RAM",
    },
    "t4-medium": {
        "name": "Nvidia T4 medium",
        "vram": "16GB",
        "price": "$0.60/hr",
        "description": "8 vCPU, 30GB RAM",
    },
    "l4x1": {
        "name": "1x Nvidia L4",
        "vram": "24GB",
        "price": "$0.80/hr",
        "description": "8 vCPU, 30GB RAM",
    },
    "l4x4": {
        "name": "4x Nvidia L4",
        "vram": "96GB",
        "price": "$3.80/hr",
        "description": "48 vCPU, 186GB RAM",
    },
    "l40s-x1": {
        "name": "1x Nvidia L40S",
        "vram": "48GB",
        "price": "$1.80/hr",
        "description": "8 vCPU, 62GB RAM",
    },
    "l40s-x4": {
        "name": "4x Nvidia L40S",
        "vram": "192GB",
        "price": "$8.30/hr",
        "description": "48 vCPU, 382GB RAM",
    },
    "a10g-small": {
        "name": "Nvidia A10G small",
        "vram": "24GB",
        "price": "$1.00/hr",
        "description": "4 vCPU, 14GB RAM",
    },
    "a10g-large": {
        "name": "Nvidia A10G large",
        "vram": "24GB",
        "price": "$1.50/hr",
        "description": "12 vCPU, 46GB RAM",
    },
    "a10g-largex2": {
        "name": "2x Nvidia A10G large",
        "vram": "48GB",
        "price": "$3.00/hr",
        "description": "24 vCPU, 92GB RAM",
    },
    "a10g-largex4": {
        "name": "4x Nvidia A10G large",
        "vram": "96GB",
        "price": "$5.00/hr",
        "description": "48 vCPU, 184GB RAM",
    },
    "a100-large": {
        "name": "Nvidia A100 large",
        "vram": "80GB",
        "price": "$2.50/hr",
        "description": "12 vCPU, 142GB RAM, best for large models",
    },
}

CONFIG_FILE: Final[Path] = Path(__file__).resolve().parent / ".hardware_config.json"


@dataclass
class HardwareConfig:
    """Persistent hardware configuration."""

    mode: HardwareMode = "local"
    spaces_hardware: SpacesHardware = "zero-gpu"
    spaces_duration: int = 180  # seconds for @spaces.GPU decorator
    local_device: str = "auto"  # auto, cuda, cpu, mps
    keep_model_on_device: bool = True

    def to_dict(self) -> dict:
        return {
            "mode": self.mode,
            "spaces_hardware": self.spaces_hardware,
            "spaces_duration": self.spaces_duration,
            "local_device": self.local_device,
            "keep_model_on_device": self.keep_model_on_device,
        }

    @classmethod
    def from_dict(cls, data: dict) -> "HardwareConfig":
        return cls(
            mode=data.get("mode", "local"),
            spaces_hardware=data.get("spaces_hardware", "zero-gpu"),
            spaces_duration=data.get("spaces_duration", 180),
            local_device=data.get("local_device", "auto"),
            keep_model_on_device=data.get("keep_model_on_device", True),
        )

    def save(self, path: Path = CONFIG_FILE) -> None:
        """Save configuration to JSON file."""
        path.write_text(json.dumps(self.to_dict(), indent=2))

    @classmethod
    def load(cls, path: Path = CONFIG_FILE) -> "HardwareConfig":
        """Load configuration from JSON file, or return defaults."""
        if path.exists():
            try:
                data = json.loads(path.read_text())
                return cls.from_dict(data)
            except Exception:
                pass
        return cls()


def get_hardware_choices() -> list[tuple[str, str]]:
    """Get hardware choices for Gradio dropdown.

    Returns list of (display_name, value) tuples.
    """
    choices = [
        ("🖥️ Local CUDA (auto-detect)", "local"),
    ]

    for hw_id, spec in SPACES_HARDWARE_SPECS.items():
        label = f"☁️ {spec['name']} - {spec['vram']} VRAM ({spec['price']})"
        choices.append((label, f"spaces:{hw_id}"))

    return choices


def parse_hardware_choice(choice: str) -> tuple[HardwareMode, SpacesHardware | None]:
    """Parse hardware choice string into mode and hardware type."""
    if choice == "local":
        return "local", None
    elif choice.startswith("spaces:"):
        hw = choice.replace("spaces:", "")
        return "spaces", hw  # type: ignore
    else:
        return "local", None


def is_running_on_spaces() -> bool:
    """Check if we're running on HuggingFace Spaces."""
    return os.getenv("SPACE_ID") is not None


def get_spaces_module():
    """Import and return the spaces module if available."""
    try:
        import spaces
        return spaces
    except ImportError:
        return None


# Global config instance
_config: HardwareConfig | None = None


def get_config() -> HardwareConfig:
    """Get the global hardware configuration."""
    global _config
    if _config is None:
        _config = HardwareConfig.load()
    return _config


def update_config(
    mode: HardwareMode | None = None,
    spaces_hardware: SpacesHardware | None = None,
    spaces_duration: int | None = None,
    local_device: str | None = None,
    keep_model_on_device: bool | None = None,
    save: bool = True,
) -> HardwareConfig:
    """Update and optionally save the hardware configuration."""
    global _config
    config = get_config()

    if mode is not None:
        config.mode = mode
    if spaces_hardware is not None:
        config.spaces_hardware = spaces_hardware
    if spaces_duration is not None:
        config.spaces_duration = spaces_duration
    if local_device is not None:
        config.local_device = local_device
    if keep_model_on_device is not None:
        config.keep_model_on_device = keep_model_on_device

    if save:
        config.save()

    _config = config
    return config
mcp_server.py ADDED
@@ -0,0 +1,224 @@
"""SHARP MCP Server for programmatic access to 3D Gaussian prediction.

Run standalone:
    uv run python mcp_server.py

Or integrate with MCP clients via stdio transport.
"""

from __future__ import annotations

import json
import os
from pathlib import Path
from typing import Literal

import torch
from mcp.server.fastmcp import FastMCP

from model_utils import (
    DEFAULT_OUTPUTS_DIR,
    ModelWrapper,
    TrajectoryType,
    get_global_model,
)

MCP_PORT: int = int(os.getenv("SHARP_MCP_PORT", "49201"))

mcp = FastMCP(
    "sharp",
    description="SHARP: Single-image 3D Gaussian scene prediction",
)

# -----------------------------------------------------------------------------
# Tools
# -----------------------------------------------------------------------------


@mcp.tool()
def sharp_predict(
    image_path: str,
    render_video: bool = True,
    trajectory_type: TrajectoryType = "rotate_forward",
    num_frames: int = 60,
    fps: int = 30,
    output_long_side: int | None = None,
) -> dict:
    """Predict 3D Gaussians from a single image.

    Args:
        image_path: Absolute path to input image (jpg/png/webp).
        render_video: Whether to render a camera trajectory video (requires CUDA).
        trajectory_type: Camera trajectory type (swipe/shake/rotate/rotate_forward).
        num_frames: Number of frames for video rendering.
        fps: Frames per second for video.
        output_long_side: Output resolution (longest side). None = match input.

    Returns:
        dict with keys:
        - ply_path: Path to exported PLY file
        - video_path: Path to rendered MP4 (or null if not rendered)
        - cuda_available: Whether CUDA was available
    """
    image_path_obj = Path(image_path)
    if not image_path_obj.exists():
        raise FileNotFoundError(f"Image not found: {image_path}")

    model = get_global_model()
    video_path, ply_path = model.predict_and_maybe_render(
        image_path_obj,
        trajectory_type=trajectory_type,
        num_frames=num_frames,
        fps=fps,
        output_long_side=output_long_side,
        render_video=render_video,
    )

    return {
        "ply_path": str(ply_path),
        "video_path": str(video_path) if video_path else None,
        "cuda_available": torch.cuda.is_available(),
    }


@mcp.tool()
def sharp_render(
    ply_path: str,
    trajectory_type: TrajectoryType = "rotate_forward",
    num_frames: int = 60,
    fps: int = 30,
    output_long_side: int | None = None,
) -> dict:
    """Render a video from an existing PLY file.

    Note: This requires re-predicting from the original image since Gaussians
    are not stored in standard PLY format. For now, returns an error.
    Future versions may support loading Gaussians from PLY.

    Args:
        ply_path: Path to PLY file (from previous prediction).
        trajectory_type: Camera trajectory type.
        num_frames: Number of frames.
        fps: Frames per second.
        output_long_side: Output resolution.

    Returns:
        dict with error message (feature not yet implemented).
    """
    return {
        "error": "Rendering from PLY not yet implemented. Use sharp_predict with render_video=True.",
        "hint": "PLY files store only point data, not the full Gaussian parameters needed for rendering.",
    }


@mcp.tool()
def list_outputs() -> dict:
    """List all generated output files (PLY and MP4).

    Returns:
        dict with keys:
        - outputs_dir: Path to outputs directory
        - ply_files: List of PLY file paths
        - video_files: List of MP4 file paths
    """
    outputs_dir = DEFAULT_OUTPUTS_DIR
    ply_files = sorted(outputs_dir.glob("*.ply"))
    video_files = sorted(outputs_dir.glob("*.mp4"))

    return {
        "outputs_dir": str(outputs_dir),
        "ply_files": [str(f) for f in ply_files],
        "video_files": [str(f) for f in video_files],
    }


# -----------------------------------------------------------------------------
# Resources
# -----------------------------------------------------------------------------


@mcp.resource("sharp://info")
def get_info() -> str:
    """Get SHARP server info including GPU status and configuration."""
    cuda_available = torch.cuda.is_available()
    gpu_info = []

    if cuda_available:
        for i in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(i)
            gpu_info.append({
                "index": i,
                "name": props.name,
                "total_memory_gb": round(props.total_memory / (1024**3), 2),
                "compute_capability": f"{props.major}.{props.minor}",
            })

    info = {
        "model": "SHARP (Apple ml-sharp)",
        "description": "Single-image 3D Gaussian scene prediction",
        "cuda_available": cuda_available,
        "cuda_device_count": torch.cuda.device_count() if cuda_available else 0,
        "gpus": gpu_info,
        "outputs_dir": str(DEFAULT_OUTPUTS_DIR),
        "checkpoint_sources": [
            "SHARP_CHECKPOINT_PATH env var",
            "HuggingFace Hub (apple/Sharp)",
            "Upstream CDN (torch.hub)",
        ],
        "env_vars": {
            "SHARP_CHECKPOINT_PATH": os.getenv("SHARP_CHECKPOINT_PATH", "(not set)"),
            "SHARP_KEEP_MODEL_ON_DEVICE": os.getenv("SHARP_KEEP_MODEL_ON_DEVICE", "1"),
            "CUDA_VISIBLE_DEVICES": os.getenv("CUDA_VISIBLE_DEVICES", "(not set)"),
        },
    }

    return json.dumps(info, indent=2)


@mcp.resource("sharp://help")
def get_help() -> str:
    """Get usage help for the SHARP MCP server."""
    help_text = """
# SHARP MCP Server

## Tools

### sharp_predict
Predict 3D Gaussians from a single image.

Parameters:
- image_path (required): Absolute path to input image
- render_video: Whether to render MP4 (default: true, requires CUDA)
- trajectory_type: swipe | shake | rotate | rotate_forward (default: rotate_forward)
- num_frames: Number of video frames (default: 60)
- fps: Video frame rate (default: 30)
- output_long_side: Output resolution, null = match input

### list_outputs
List all generated PLY and MP4 files.

## Resources

### sharp://info
Server info, GPU status, configuration.

### sharp://help
This help text.

## Environment Variables

- SHARP_MCP_PORT: MCP server port (default: 49201)
- SHARP_CHECKPOINT_PATH: Local checkpoint path override
- SHARP_KEEP_MODEL_ON_DEVICE: Keep model on GPU (default: 1)
- CUDA_VISIBLE_DEVICES: GPU selection (e.g., "0" or "0,1")
"""
    return help_text.strip()


# -----------------------------------------------------------------------------
# Main
# -----------------------------------------------------------------------------

if __name__ == "__main__":
    # Run as stdio transport for MCP clients
    mcp.run()
model_utils.py CHANGED
@@ -23,10 +23,13 @@ from typing import Final, Literal
 
 import torch
 
+# Optional Spaces GPU support (for HuggingFace Spaces deployment)
 try:
     import spaces
-except Exception:  # pragma: no cover
+    _SPACES_AVAILABLE = True
+except ImportError:
     spaces = None  # type: ignore[assignment]
+    _SPACES_AVAILABLE = False
 
 try:
     # Prefer HF cache / Hub downloads (works with Spaces `preload_from_hub`).
@@ -175,15 +178,19 @@ class ModelWrapper:
 
         self.device_preference = device_preference
 
-        # For ZeroGPU, it's safer to not keep large tensors on CUDA across calls.
+        # Local CUDA: keep model on device by default for better performance
        if keep_model_on_device is None:
-            keep_env = (
-                os.getenv("SHARP_KEEP_MODEL_ON_DEVICE")
-            )
-            self.keep_model_on_device = keep_env == "1"
+            keep_env = os.getenv("SHARP_KEEP_MODEL_ON_DEVICE", "1")
+            self.keep_model_on_device = keep_env != "0"
        else:
            self.keep_model_on_device = keep_model_on_device
 
+        # Support CUDA device selection via env var
+        cuda_device = os.getenv("CUDA_VISIBLE_DEVICES")
+        if cuda_device and device_preference == "auto":
+            # Let PyTorch handle device mapping via CUDA_VISIBLE_DEVICES
+            pass
+
        self._lock = threading.RLock()
        self._predictor: torch.nn.Module | None = None
        self._predictor_device: torch.device | None = None
@@ -560,16 +567,8 @@
 
 
 # -----------------------------------------------------------------------------
-# ZeroGPU entrypoints
+# Module-level entrypoints
 # -----------------------------------------------------------------------------
-#
-# IMPORTANT: Do NOT decorate bound instance methods with `@spaces.GPU` on ZeroGPU.
-# The wrapper uses multiprocessing queues and pickles args/kwargs. If `self` is
-# included, Python will try to pickle the whole instance. ModelWrapper contains
-# a threading.RLock (not pickleable) and the model itself should not be pickled.
-#
-# Expose module-level functions that accept only pickleable arguments and
-# create/cache the ModelWrapper inside the GPU worker process.
 
 DEFAULT_OUTPUTS_DIR: Final[Path] = _ensure_dir(Path(__file__).resolve().parent / "outputs")
 
@@ -605,8 +604,60 @@ def predict_and_maybe_render(
     )
 
 
-# Export the GPU-wrapped callable (or a no-op wrapper locally).
-if spaces is not None:
-    predict_and_maybe_render_gpu = spaces.GPU(duration=180)(predict_and_maybe_render)
-else:  # pragma: no cover
-    predict_and_maybe_render_gpu = predict_and_maybe_render
+# -----------------------------------------------------------------------------
+# GPU-wrapped entrypoint (Spaces or local)
+# -----------------------------------------------------------------------------
+
+
+def _create_spaces_gpu_wrapper(duration: int = 180):
+    """Create a Spaces GPU-wrapped version of predict_and_maybe_render.
+
+    This is called dynamically based on hardware configuration.
+    """
+    if spaces is not None and _SPACES_AVAILABLE:
+        return spaces.GPU(duration=duration)(predict_and_maybe_render)
+    return predict_and_maybe_render
+
+
+# Default export: use local CUDA unless explicitly configured for Spaces.
+# The actual wrapper is created dynamically based on hardware_config.
+predict_and_maybe_render_gpu = predict_and_maybe_render
+
+
+def configure_gpu_mode(use_spaces: bool = False, duration: int = 180) -> None:
+    """Configure the GPU mode at runtime.
+
+    Args:
+        use_spaces: If True and spaces module available, use @spaces.GPU decorator
+        duration: Duration for @spaces.GPU decorator (seconds)
+    """
+    global predict_and_maybe_render_gpu
+
+    if use_spaces and _SPACES_AVAILABLE and spaces is not None:
+        predict_and_maybe_render_gpu = spaces.GPU(duration=duration)(predict_and_maybe_render)
+    else:
+        predict_and_maybe_render_gpu = predict_and_maybe_render
+
+
+def get_gpu_status() -> dict:
+    """Get current GPU status information."""
+    import torch
+
+    status = {
+        "cuda_available": torch.cuda.is_available(),
+        "spaces_available": _SPACES_AVAILABLE,
+        "device_count": torch.cuda.device_count() if torch.cuda.is_available() else 0,
+        "devices": [],
+    }
+
+    if torch.cuda.is_available():
+        for i in range(torch.cuda.device_count()):
+            props = torch.cuda.get_device_properties(i)
+            status["devices"].append({
+                "index": i,
+                "name": props.name,
+                "total_memory_gb": round(props.total_memory / (1024**3), 2),
+                "compute_capability": f"{props.major}.{props.minor}",
+            })
+
+    return status
pyproject.toml CHANGED
@@ -7,8 +7,9 @@ requires-python = ">=3.13"
 dependencies = [
     "gradio==6.1.0",
     "huggingface-hub>=1.2.3",
+    "mcp>=1.0.0",
     "sharp",
-    "spaces==0.44.0",
+    "spaces>=0.30.0",
     "torch>=2.9.1",
     "torchvision>=0.24.1",
 ]
requirements.txt CHANGED
@@ -1,6 +1,7 @@
 gradio==6.2.0
-spaces==0.44.0
 huggingface_hub>=1.2.3
+spaces>=0.30.0
 torch
 torchvision
 sharp @ git+https://github.com/apple/ml-sharp.git@cdb4ddc6796402bee5487c7312260f2edd8bd5f0
+mcp>=1.0.0