---
title: Demo 2025
emoji: πŸ“Š
colorFrom: green
colorTo: red
sdk: docker
pinned: false
license: mit
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

## Mission-guided detections

  1. Call POST /process_video with fields video (file), prompt (mission text), and optional detector (owlv2 or hf_yolov8). The response is an MP4 stream containing the annotated frames; a request sketch covering both endpoints follows this list.
  2. Call POST /mission_summary with the same fields to receive JSON containing the structured mission plan plus the natural-language summary. This second endpoint isolates the OpenAI call, keeping the video response clean.
  3. Under the hood, the mission text still feeds into the OpenAI (gpt-4o-mini) reasoning step that ranks the YOLO/COCO classes. Place your API key inside .env as either OPENAI_API_KEY=... or OpenAI-API: ...; the server loads it automatically on startup (see the key-loading sketch below).
  4. The top-scored classes drive OWLv2 or YOLOv8 so that detections align with the mission, and the detection log is summarized via another OpenAI call when requested (a minimal detection sketch closes this section).
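
A minimal client sketch for the two endpoints above, assuming the Space is reachable at a local base URL; the URL, file name, and prompt text are placeholders:

```python
import requests

BASE_URL = "http://localhost:7860"  # placeholder; substitute the Space's URL

fields = {
    "prompt": "find delivery trucks near the loading dock",  # example mission text
    "detector": "owlv2",  # or "hf_yolov8"
}

# 1) Annotated video: the response body is an MP4 stream.
with open("input.mp4", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/process_video",
        files={"video": ("input.mp4", f, "video/mp4")},
        data=fields,
        timeout=600,
    )
resp.raise_for_status()
with open("annotated.mp4", "wb") as out:
    out.write(resp.content)

# 2) Mission summary: JSON with the structured plan and natural-language summary.
with open("input.mp4", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/mission_summary",
        files={"video": ("input.mp4", f, "video/mp4")},
        data=fields,
        timeout=600,
    )
resp.raise_for_status()
print(resp.json())
```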
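
The key loading mentioned in step 3 happens inside the server at startup; the snippet below is only a sketch of how both accepted .env formats could be parsed, and the helper name is illustrative:

```python
import os
from pathlib import Path

def load_openai_key(env_path: str = ".env") -> str | None:
    """Read the OpenAI key from .env, accepting either an
    'OPENAI_API_KEY=...' or an 'OpenAI-API: ...' line."""
    path = Path(env_path)
    if not path.exists():
        return os.getenv("OPENAI_API_KEY")
    for line in path.read_text().splitlines():
        line = line.strip()
        if line.startswith("OPENAI_API_KEY="):
            return line.split("=", 1)[1].strip()
        if line.startswith("OpenAI-API:"):
            return line.split(":", 1)[1].strip()
    return os.getenv("OPENAI_API_KEY")

key = load_openai_key()
if key:
    os.environ["OPENAI_API_KEY"] = key  # make the key visible to the OpenAI client
```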
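
For step 4, one way the mission-ranked classes could drive OWLv2 is through the transformers zero-shot object detection pipeline; the model checkpoint, frame path, and class list here are assumptions, not necessarily what the Space uses:

```python
from PIL import Image
from transformers import pipeline

# Example output of the gpt-4o-mini ranking step (COCO class names).
ranked_classes = ["truck", "person", "car"]

detector = pipeline(
    "zero-shot-object-detection",
    model="google/owlv2-base-patch16-ensemble",
)

# One decoded video frame (placeholder path).
frame = Image.open("frame_0001.jpg")
detections = detector(frame, candidate_labels=ranked_classes, threshold=0.3)

# Each detection carries a label, a confidence score, and a bounding box;
# these records form the detection log that the summary endpoint condenses.
for det in detections:
    print(det["label"], round(det["score"], 2), det["box"])
```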