Model Details

This model is part of the GELab-Zero project, which aims to accelerate the innovation and application deployment of GUI Agents by providing:

A 4B GUI Agent model capable of running on local computers.
Plug-and-play inference infrastructure that handles ADB connections, dependency installation, and task recording/replay (available in the GELab-Zero repository).

Key Capabilities

Local Deployment: Optimized for consumer-grade hardware, balancing low latency with privacy.
GUI Navigation: Proficient in detecting and interacting with UI elements (click, type, slide, wait, etc.) based on visual cues.
Complex Task Execution: Handles multi-step long-horizon tasks across various apps (Food, Transportation, Shopping, Social, etc.).
Open-World Generalization: Capable of zero-shot operation across diverse unseen applications and complex dynamic interfaces without requiring app-specific adaptation.

Usage

Quick Start with Ollama

The easiest way to run inference is using Ollama.

1. Install Ollama: Download it from ollama.com.

2. Download the Model:

# Install huggingface-cli
pip install huggingface_hub
# Download model
huggingface-cli download --resume-download stepfun-ai/GELab-Zero-4B-preview --local-dir gelab-zero-4b-preview

3. Create and Run in Ollama:

cd gelab-zero-4b-preview
ollama create gelab-zero-4b-preview -f Modelfile
    
# Test the model
curl -X POST http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gelab-zero-4b-preview",
        "messages": [{"role": "user", "content": "Hello, GELab-Zero!"}]
      }'
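
The curl test above sends only text. Because the model works from screenshots, a request with an image attached may be a more representative smoke test. The following is a minimal sketch, assuming that Ollama's OpenAI-compatible endpoint accepts base64 image content for this model as created from the Modelfile. The helper names `build_payload` and `send` are illustrative and are not part of GELab-Zero. The prompt format the model expects is not documented in this card, so the reply to a bare instruction may not be a usable action.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is serving on its default port, as in the curl test.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
MODEL_NAME = "gelab-zero-4b-preview"


def build_payload(image_bytes: bytes, instruction: str, model: str = MODEL_NAME) -> dict:
    """Build an OpenAI-style chat request with one PNG screenshot and a text instruction."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": instruction},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{encoded}"},
                    },
                ],
            }
        ],
    }


def send(payload: dict, url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """POST the payload to the endpoint and return the assistant's message text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

To try it, read a PNG screenshot from disk in binary mode, pass the bytes and an instruction to `build_payload`, and print the result of `send`. Only the Python standard library is used, so no extra packages are needed.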

To use this model for actual Android device control (ADB connection, task execution), use the GELab-Zero repository (https://github.com/stepfun-ai/gelab-zero).
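
As a rough illustration of what an ADB execution layer involves, the sketch below maps the action kinds named under Key Capabilities (click, type, slide, wait) to standard `adb shell input` commands. The action dictionary schema (`type`, `x`, `y`, `text`, and so on) is hypothetical and is not the model's actual output format, and this is not how GELab-Zero itself is implemented. It only shows the general shape of the translation.

```python
import subprocess
import time
from typing import List, Optional


def action_to_adb_args(action: dict) -> Optional[List[str]]:
    """Translate one hypothetical action dict into an adb argv list.

    Returns None for 'wait', which needs no adb call.
    """
    kind = action["type"]
    if kind == "click":
        return ["adb", "shell", "input", "tap", str(action["x"]), str(action["y"])]
    if kind == "type":
        # `input text` reads %s as a space; other special characters
        # are not escaped in this sketch.
        return ["adb", "shell", "input", "text", action["text"].replace(" ", "%s")]
    if kind == "slide":
        return [
            "adb", "shell", "input", "swipe",
            str(action["x1"]), str(action["y1"]),
            str(action["x2"]), str(action["y2"]),
        ]
    if kind == "wait":
        return None
    raise ValueError(f"unsupported action type: {kind!r}")


def execute(action: dict) -> None:
    """Run one action against the connected device, or sleep for 'wait'."""
    args = action_to_adb_args(action)
    if args is None:
        time.sleep(action.get("seconds", 1.0))
    else:
        subprocess.run(args, check=True)
```

Separating the translation (`action_to_adb_args`) from the side effect (`execute`) keeps the mapping testable without a connected device.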

Citation

If you find GELab-Zero-4B-preview useful for your research, please consider citing our work :)

@software{gelab_zero_2025,
  title={GELab-Zero: An Advanced Mobile Agent Inference System},
  author={GELab Team},
  year={2025},
  url={https://github.com/stepfun-ai/gelab-zero}
}

@inproceedings{gelab_mt_rl,
  title={GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning},
  author={Yan, Haolong and Shen, Yeqing and Huang, Xin and Wang, Jia and Tan, Kaijun and Liang, Zhixuan and Li, Hongxin and Ge, Zheng and Yoshie, Osamu and Li, Si and others},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems}
}

Model size: 4B params
Tensor type: BF16 (Safetensors)
Base model: Qwen/Qwen3-VL-4B-Instruct