title: Remove Photo Object
emoji: 
colorFrom: pink
colorTo: purple
sdk: docker
pinned: false
license: mit
hardware: cpu-basic

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Inpainting API (FastAPI)

  • Auth: optional Bearer token, enabled by setting the API_TOKEN env variable. If it is set, send the header Authorization: Bearer <token>.

Endpoints:

  • GET /health → {"status":"healthy"}
  • POST /upload-image (form-data: image=file) → {"id":"","filename":"name.png"}
  • POST /upload-mask (form-data: mask=file) → {"id":"","filename":"mask.png"}
  • POST /inpaint (JSON: {image_id, mask_id}) → {"result":"output_xxx.png"}
  • POST /inpaint-url (JSON: {image_id, mask_id}) → {"result":"output_xxx.png","url":"https://.../download/output_xxx.png"}
  • POST /inpaint-multipart (form-data: image=file, mask=file) → {"result":"output_xxx.png"}
  • GET /download/{filename} → image file (public; optional for ID-based inpaint)
  • GET /result/{filename} → view result image in browser (public)
  • GET /logs → recent uploads/results
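
The ID-based flow above (upload image, upload mask, then inpaint) can be sketched with only the Python standard library. This is a minimal sketch rather than the project's own client: BASE_URL, the helper names, and the image/png content type are assumptions, and the form field names (image, mask) and JSON keys (image_id, mask_id) are taken from the endpoint list.

```python
import json
import uuid
import urllib.request

# Assumption: the API runs locally on the port used in "Local run".
BASE_URL = "http://localhost:7860"


def auth_headers(token=None):
    """Return the Authorization header only when a token is configured."""
    return {"Authorization": f"Bearer {token}"} if token else {}


def multipart_request(path, field, filename, data, token=None):
    """Build a POST request carrying one file as multipart/form-data."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: image/png\r\n\r\n",  # assumption: PNG input
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    headers.update(auth_headers(token))
    return urllib.request.Request(BASE_URL + path, data=body, headers=headers, method="POST")


def json_request(path, payload, token=None):
    """Build a POST request with a JSON body."""
    headers = {"Content-Type": "application/json"}
    headers.update(auth_headers(token))
    body = json.dumps(payload).encode()
    return urllib.request.Request(BASE_URL + path, data=body, headers=headers, method="POST")


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def remove_object(image_path, mask_path, token=None):
    """Upload both files, then call /inpaint with the returned IDs."""
    with open(image_path, "rb") as f:
        image = send(multipart_request("/upload-image", "image", "image.png", f.read(), token))
    with open(mask_path, "rb") as f:
        mask = send(multipart_request("/upload-mask", "mask", "mask.png", f.read(), token))
    return send(json_request("/inpaint", {"image_id": image["id"], "mask_id": mask["id"]}, token))
```

Calling remove_object("photo.png", "mask.png") against a running server should return the JSON documented for POST /inpaint; the file names here are placeholders.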

Note:

  • POST /inpaint returns simple JSON with just the filename.
  • POST /inpaint-url returns JSON with filename and shareable URL.
  • Use /download/{filename} or /result/{filename} to access the result image.
  • You can optionally pass a prompt string with /inpaint, /inpaint-url, or /inpaint-multipart to describe what should be removed; the mask still controls the exact region.
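
As a rough illustration of these notes, the helpers below build an /inpaint request body with an optional prompt and derive the two access URLs for a result file. The JSON key name "prompt" is an assumption (the notes only say a prompt string can be passed), and both function names are hypothetical.

```python
from urllib.parse import quote


def inpaint_payload(image_id, mask_id, prompt=None):
    """Build the JSON body for /inpaint or /inpaint-url."""
    payload = {"image_id": image_id, "mask_id": mask_id}
    if prompt:
        # Assumption: the optional prompt is sent under the key "prompt".
        payload["prompt"] = prompt
    return payload


def result_urls(base_url, filename):
    """Return the download and browser-view URLs for a result filename."""
    base = base_url.rstrip("/")
    name = quote(filename, safe="")
    return {
        "download": f"{base}/download/{name}",
        "view": f"{base}/result/{name}",
    }
```

The mask still decides which pixels are edited; the prompt only describes what should disappear from that region.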

Remote inference (Gemini / Imagen edit)

  • Local processing is limited to mask prep and file IO; the inpainting itself runs remotely through the Google Gemini/Imagen edit API, so a CPU-only container is sufficient.
  • Required env: GEMINI_API_KEY (or GOOGLE_API_KEY / GOOGLE_GENAI_API_KEY).
  • Optional env:
    • GEMINI_IMAGE_EDIT_MODEL to override the model id (default: imagen-3.0-edit-001).
    • GEMINI_IMAGE_EDIT_PROMPT to override the default prompt: "Remove the objects marked by the provided mask and fill the background naturally."
  • Uploaded images/masks are stored in MongoDB GridFS in the object_remover database (connection string from the MONGO_URI or MONGODB_URI env). The IDs returned by /upload-image and /upload-mask are used to fetch the files back from GridFS before calling Gemini.
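
One way the environment variables above could be resolved is sketched below. It is not taken from the service's code: the function names are hypothetical, and the precedence among the alternative variable names (first one listed wins) is an assumption. The defaults are the ones stated in this section.

```python
import os

API_KEY_VARS = ("GEMINI_API_KEY", "GOOGLE_API_KEY", "GOOGLE_GENAI_API_KEY")
MONGO_URI_VARS = ("MONGO_URI", "MONGODB_URI")
DEFAULT_MODEL = "imagen-3.0-edit-001"
DEFAULT_PROMPT = (
    "Remove the objects marked by the provided mask and fill the background naturally."
)


def first_set(names, env):
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def load_settings(env=None):
    """Collect API key, model id, prompt, and Mongo URI from the environment."""
    env = os.environ if env is None else env
    api_key = first_set(API_KEY_VARS, env)
    if api_key is None:
        # The API key is the only variable this section marks as required.
        raise RuntimeError("Set one of: " + ", ".join(API_KEY_VARS))
    return {
        "api_key": api_key,
        "model": env.get("GEMINI_IMAGE_EDIT_MODEL") or DEFAULT_MODEL,
        "prompt": env.get("GEMINI_IMAGE_EDIT_PROMPT") or DEFAULT_PROMPT,
        "mongo_uri": first_set(MONGO_URI_VARS, env),
    }
```

Passing a plain dict instead of os.environ makes the resolution easy to check without touching the real environment.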

Local run

  • Install deps: python3 -m pip install -r requirements.txt
  • Run API: python3 -m uvicorn api.main:app --host 0.0.0.0 --port 7860
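
After starting the server, a small readiness check against GET /health can confirm it is up. The sketch below assumes the default local address from the command above; the function names, retry count, and delay are illustrative choices.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """Call GET /health and return the decoded JSON body."""
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def wait_until_healthy(base_url="http://localhost:7860", attempts=10, delay=1.0,
                       fetch=fetch_health):
    """Poll /health until it reports {"status": "healthy"} or attempts run out."""
    for _ in range(attempts):
        try:
            if fetch(base_url).get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body: retry.
            pass
        time.sleep(delay)
    return False
```

The fetch parameter exists only so the polling logic can be exercised without a running server.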