stepfun-ai/GELab-Zero-4B-preview

	---
	language:
	- en
	- zh
	tags:
	- gui-agent
	- phone-use agent
	- computer-use agent
	- pua
	- android
	- multimodal
	- gelab-zero
	license: apache-2.0
	base_model: Qwen/Qwen3-VL-4B-Instruct
	library_name: transformers
	---

	## Model Details

	This model is part of the [GELab-Zero](https://github.com/stepfun-ai/gelab-zero) project, which aims to accelerate the innovation and application deployment of GUI Agents by providing:
	1. A 4B GUI Agent model capable of running on local computers.
	2. Plug-and-play inference infrastructure that handles ADB connections, dependency installation, and task recording/replay (available in the [GELab-Zero](https://github.com/stepfun-ai/gelab-zero) repository).

	### Key Capabilities
	* Local Deployment: Optimized for consumer-grade hardware, balancing low latency with privacy.
	* GUI Navigation: Proficient in detecting and interacting with UI elements (click, type, slide, wait, etc.) based on visual cues.
	* Complex Task Execution: Handles multi-step long-horizon tasks across various apps (Food, Transportation, Shopping, Social, etc.).
	* Open-World Generalization: Capable of zero-shot operation across diverse unseen applications and complex dynamic interfaces without requiring app-specific adaptation.

	## Usage

	### Quick Start with Ollama

	The easiest way to run inference is using Ollama.

	1. Install Ollama: Download it from [ollama.com](https://ollama.com/).

	2. Download the Model:

	```bash
	# Install the huggingface_hub package, which provides huggingface-cli
	pip install huggingface_hub
	# Download the model files into ./gelab-zero-4b-preview
	huggingface-cli download stepfun-ai/GELab-Zero-4B-preview --local-dir gelab-zero-4b-preview
	```

	3. Create and Run in Ollama:

	```bash
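	cd gelab-zero-4b-preview
	# Register the model with Ollama using the Modelfile shipped in the download
	ollama create gelab-zero-4b-preview -f Modelfile

	# Test the model (requires the Ollama server to be running on its default port, 11434)
	curl -X POST http://localhost:11434/v1/chat/completions \
	  -H "Content-Type: application/json" \
	  -d '{
	    "model": "gelab-zero-4b-preview",
	    "messages": [{"role": "user", "content": "Hello, GELab-Zero!"}]
	  }'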
	```
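
Because the model is multimodal, a real request usually carries a screenshot as well as text. The following is a minimal sketch, not the project's own client: it builds the same OpenAI-style request as the curl test above and optionally attaches a PNG as a base64 data URL. The helper names `build_payload` and `send` are hypothetical, and it is an assumption that the Modelfile in this repository exposes image input through Ollama; the prompt format the agent actually expects is defined in the GELab-Zero repository.

```python
import base64
import json
import urllib.request

# Same endpoint and model name as the curl test in the Quick Start.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
MODEL_NAME = "gelab-zero-4b-preview"


def build_payload(instruction, image_bytes=None, model=MODEL_NAME):
    """Build an OpenAI-style chat payload (hypothetical helper).

    With no image, the content is a plain string, as in the curl test.
    With an image, the content is a list of parts: text plus a data URL.
    """
    if image_bytes is None:
        content = instruction
    else:
        b64 = base64.b64encode(image_bytes).decode("ascii")
        content = [
            {"type": "text", "text": instruction},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ]
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def send(payload, url=OLLAMA_URL, timeout=120):
    """POST the payload to a running Ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the Ollama server running, `send(build_payload("Hello, GELab-Zero!"))` mirrors the curl test; passing the bytes of a PNG file as `image_bytes` adds the screenshot to the same message.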

	To use this model for actual Android device control (ADB connection, task execution), use the [GELab-Zero](https://github.com/stepfun-ai/gelab-zero) repository.
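
For a rough sense of the lowest-level step such infrastructure performs, the sketch below grabs one screenshot from a connected device with the standard `adb exec-out screencap -p` command. It is only an illustration under stated assumptions, not the GELab-Zero implementation: the function names are hypothetical, `adb` is assumed to be on the `PATH`, and a device is assumed to be already authorized.

```python
import subprocess

# Every PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def screencap_command(serial=None):
    """Return the adb argument list for a PNG screenshot (hypothetical helper).

    `serial` selects a specific device via `adb -s`; omit it when only one
    device is connected.
    """
    command = ["adb"]
    if serial:
        command += ["-s", serial]
    return command + ["exec-out", "screencap", "-p"]


def capture_screenshot(serial=None, timeout=30):
    """Run adb and return the screenshot as PNG bytes."""
    result = subprocess.run(
        screencap_command(serial),
        capture_output=True,
        check=True,  # raise CalledProcessError if adb exits with an error
        timeout=timeout,
    )
    if not result.stdout.startswith(PNG_SIGNATURE):
        raise RuntimeError("adb did not return PNG data")
    return result.stdout
```

The returned bytes can be sent to the model as an image; connection management, action execution, and task recording are what the repository adds on top.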


	## Citation

	If you find GELab-Zero-4B-preview useful for your research, please consider citing our work :)

	```bibtex
	@software{gelab_zero_2025,
	  title={GELab-Zero: An Advanced Mobile Agent Inference System},
	  author={GELab Team},
	  year={2025},
	  url={https://github.com/stepfun-ai/gelab-zero}
	}

	@inproceedings{gelab_mt_rl,
	  title={GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning},
	  author={Yan, Haolong and Shen, Yeqing and Huang, Xin and Wang, Jia and Tan, Kaijun and Liang, Zhixuan and Li, Hongxin and Ge, Zheng and Yoshie, Osamu and Li, Si and others},
	  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
	  year={2025}
	}
	```