Spaces:
Running
on
Zero
Running
on
Zero
Restore spaces.GPU usage
Browse files
Files changed:
- README.md +1 -1
- __pycache__/app.cpython-313.pyc +0 -0
- app.py +14 -0
- requirements.txt +1 -0
README.md
CHANGED
|
@@ -21,7 +21,7 @@ endpoint via the `HF_ROUTER_API` environment variable.
|
|
| 21 |
| File | Purpose |
|
| 22 |
| ---- | ------- |
|
| 23 |
| `app.py` | Loads the merged checkpoint on demand (tries `MODEL_REPO` first, then `router-qwen3-32b-merged`, `router-gemma3-merged`), exposes a `/v1/generate` API, and serves a small HTML console at `/gradio`. |
|
| 24 |
-
| `requirements.txt` | Minimal dependency set (transformers, bitsandbytes, torch, fastapi). |
|
| 25 |
| `.huggingface/spaces.yml` | Configures the Space for ZeroGPU hardware and disables automatic sleep. |
|
| 26 |
|
| 27 |
## Deployment Steps
|
|
|
|
| 21 |
| File | Purpose |
|
| 22 |
| ---- | ------- |
|
| 23 |
| `app.py` | Loads the merged checkpoint on demand (tries `MODEL_REPO` first, then `router-qwen3-32b-merged`, `router-gemma3-merged`), exposes a `/v1/generate` API, and serves a small HTML console at `/gradio`. |
|
| 24 |
+
| `requirements.txt` | Minimal dependency set (transformers, bitsandbytes, torch, fastapi, spaces). |
|
| 25 |
| `.huggingface/spaces.yml` | Configures the Space for ZeroGPU hardware and disables automatic sleep. |
|
| 26 |
|
| 27 |
## Deployment Steps
|
__pycache__/app.cpython-313.pyc
CHANGED
|
Binary files a/__pycache__/app.cpython-313.pyc and b/__pycache__/app.cpython-313.pyc differ
|
|
|
app.py
CHANGED
|
@@ -9,6 +9,11 @@ from fastapi import FastAPI, HTTPException
|
|
| 9 |
from fastapi.responses import HTMLResponse
|
| 10 |
from pydantic import BaseModel
|
| 11 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 12 |
from transformers import (
|
| 13 |
AutoModelForCausalLM,
|
| 14 |
AutoTokenizer,
|
|
@@ -66,6 +71,15 @@ class GenerateResponse(BaseModel):
|
|
| 66 |
_MODEL = None
|
| 67 |
|
| 68 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 69 |
def get_model() -> AutoModelForCausalLM:
|
| 70 |
global _MODEL
|
| 71 |
if _MODEL is None:
|
|
|
|
| 9 |
from fastapi.responses import HTMLResponse
|
| 10 |
from pydantic import BaseModel
|
| 11 |
|
| 12 |
+
try:
|
| 13 |
+
import spaces # type: ignore
|
| 14 |
+
except Exception: # pragma: no cover
|
| 15 |
+
spaces = None
|
| 16 |
+
|
| 17 |
from transformers import (
|
| 18 |
AutoModelForCausalLM,
|
| 19 |
AutoTokenizer,
|
|
|
|
| 71 |
_MODEL = None
|
| 72 |
|
| 73 |
|
| 74 |
+
def _spaces_gpu(*args, **kwargs):
|
| 75 |
+
if spaces is None:
|
| 76 |
+
def identity(fn):
|
| 77 |
+
return fn
|
| 78 |
+
return identity
|
| 79 |
+
return spaces.GPU(*args, **kwargs)
|
| 80 |
+
|
| 81 |
+
|
| 82 |
+
@_spaces_gpu(duration=120)
|
| 83 |
def get_model() -> AutoModelForCausalLM:
|
| 84 |
global _MODEL
|
| 85 |
if _MODEL is None:
|
requirements.txt
CHANGED
|
@@ -1,5 +1,6 @@
|
|
| 1 |
bitsandbytes>=0.41.0
|
| 2 |
fastapi>=0.110.0
|
|
|
|
| 3 |
torch>=2.1.0
|
| 4 |
transformers>=4.40.0
|
| 5 |
uvicorn>=0.22.0
|
|
|
|
| 1 |
bitsandbytes>=0.41.0
|
| 2 |
fastapi>=0.110.0
|
| 3 |
+
spaces>=0.40.0
|
| 4 |
torch>=2.1.0
|
| 5 |
transformers>=4.40.0
|
| 6 |
uvicorn>=0.22.0
|