

yuhangzang

Add Gradio Space for Spatial-SSRL spatial reasoning demo

1e5cd04 about 1 month ago


title: Spatial-SSRL Spatial Reasoning
emoji: 🌍
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Spatial reasoning with vision-language models

🌍 Spatial-SSRL: Spatial Reasoning with Vision-Language Models

This demo showcases the spatial reasoning capabilities of vision-language models trained to understand 3D spatial relationships from 2D images.

Features

3D Location Understanding: Determine which objects are closer to or farther from the camera
Orientation Analysis: Understand which direction objects are facing
Relative Positioning: Answer questions about object positions relative to each other
Step-by-step Reasoning: The model provides detailed reasoning before answering

How to Use

1. Upload an image
2. Ask a question about spatial relationships in the image
3. The model will provide a detailed answer with reasoning

Example Questions

"Which object is further away from the camera? A. boat B. fire hydrant"
"Are the kid and the teddy bear facing same or similar directions?"
"If I stand at the recreational vehicle's position facing where it is facing, is the dog in front of me or behind me?"

The model is trained to provide answers in a structured format with reasoning enclosed in <think> tags and final answers in \boxed{}.
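
The structured format described above can be split into its reasoning and final answer with a small parser. The following is a minimal sketch, not code from the demo itself: the function name `parse_response`, the `ParsedResponse` type, and the sample response string are all hypothetical, and the sketch assumes the answer inside \boxed{} contains no nested braces.

```python
import re
from typing import NamedTuple, Optional


class ParsedResponse(NamedTuple):
    """Hypothetical container for the two parts of a model response."""
    reasoning: Optional[str]
    answer: Optional[str]


# Reasoning is expected between <think> and </think>; DOTALL lets it span lines.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
# Final answer is expected inside \boxed{...}; nested braces are not handled.
_BOXED_RE = re.compile(r"\\boxed\{([^{}]*)\}")


def parse_response(text: str) -> ParsedResponse:
    """Extract the reasoning and the last boxed answer, or None if absent."""
    think = _THINK_RE.search(text)
    boxed = _BOXED_RE.findall(text)
    reasoning = think.group(1).strip() if think else None
    answer = boxed[-1].strip() if boxed else None
    return ParsedResponse(reasoning, answer)


# Made-up response text, used only to illustrate the format.
sample = (
    "<think>The boat appears smaller and higher in the frame.</think>\n"
    "The answer is \\boxed{A}."
)
parsed = parse_response(sample)
print(parsed.reasoning)
print(parsed.answer)
```

Taking the last \boxed{} match is a design choice for responses that mention intermediate boxed values before the final one; a response with no tags or no boxed answer yields `None` for the missing part rather than raising an error.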