---
language:
- en
- zh
tags:
- gui-agent
- phone-use agent
- computer-use agent
- pua
- android
- multimodal
- gelab-zero
license: apache-2.0
base_model: Qwen/Qwen3-VL-4B-Instruct
library_name: transformers
---

## Model Details

This model is part of the [**GELab-Zero**](https://github.com/stepfun-ai/gelab-zero) project, which aims to accelerate the innovation and application deployment of GUI Agents by providing:

1. **A 4B GUI Agent model** capable of running on local computers.
2. **Plug-and-play inference infrastructure** that handles ADB connections, dependency installation, and task recording/replay (**available in the** [**GELab-Zero**](https://github.com/stepfun-ai/gelab-zero) **repository**).

### Key Capabilities

* **Local Deployment**: Optimized for consumer-grade hardware, balancing low latency with privacy.
* **GUI Navigation**: Proficient in detecting and interacting with UI elements (click, type, slide, wait, etc.) based on visual cues.
* **Complex Task Execution**: Handles multi-step, long-horizon tasks across various apps (Food, Transportation, Shopping, Social, etc.).
* **Open-World Generalization**: Capable of zero-shot operation across diverse unseen applications and complex dynamic interfaces without requiring app-specific adaptation.

## Usage

### Quick Start with Ollama

The easiest way to run inference is with Ollama.

1. **Install Ollama**: Download from [ollama.com](https://ollama.com/).
2. **Download the Model**:

   ```bash
   # Install huggingface-cli
   pip install huggingface_hub

   # Download model
   huggingface-cli download --resume-download stepfun-ai/GELab-Zero-4B-preview --local-dir gelab-zero-4b-preview
   ```

3. **Create and Run in Ollama**:

   ```bash
   cd gelab-zero-4b-preview
   ollama create gelab-zero-4b-preview -f Modelfile

   # Test the model
   curl -X POST http://localhost:11434/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{
       "model": "gelab-zero-4b-preview",
       "messages": [{"role": "user", "content": "Hello, GELab-Zero!"}]
     }'
   ```

To use this model for actual Android device control (ADB connection, task execution), please use the [GELab-Zero](https://github.com/stepfun-ai/gelab-zero) repository.

## Citation

If you find GELab-Zero-4B-preview useful for your research, please consider citing our work :)

```bibtex
@software{gelab_zero_2025,
  title={GELab-Zero: An Advanced Mobile Agent Inference System},
  author={GELab Team},
  year={2025},
  url={https://github.com/stepfun-ai/gelab-zero}
}

@inproceedings{gelab_mt_rl,
  title={GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning},
  author={Yan, Haolong and Shen, Yeqing and Huang, Xin and Wang, Jia and Tan, Kaijun and Liang, Zhixuan and Li, Hongxin and Ge, Zheng and Yoshie, Osamu and Li, Si and others},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems}
}
```
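
As a supplement to the Ollama quick start, the test request can also be sent from Python. The following is a minimal sketch using only the standard library, not part of the GELab-Zero tooling. It assumes the model was created under the name `gelab-zero-4b-preview` and that Ollama is listening on its default port, as in the `curl` example. The helper names `build_payload` and `chat` are hypothetical, and the optional screenshot is attached using the OpenAI-style `image_url` message part, which is an assumption about how an image would be passed. The prompt format the agent actually expects for device control is defined in the GELab-Zero repository and is not reproduced here.

```python
import base64
import json
import urllib.request

# Endpoint and model name taken from the Quick Start curl example.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
MODEL_NAME = "gelab-zero-4b-preview"


def build_payload(prompt, image_bytes=None):
    """Build an OpenAI-style chat request body (hypothetical helper).

    If image_bytes is given, it is assumed to be a PNG screenshot and is
    embedded as a base64 data URI next to the text prompt.
    """
    if image_bytes is None:
        content = prompt
    else:
        encoded = base64.b64encode(image_bytes).decode("ascii")
        content = [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": "data:image/png;base64," + encoded},
            },
        ]
    return {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": content}],
    }


def chat(prompt, image_bytes=None, url=OLLAMA_URL, timeout=120):
    """POST the request to the local Ollama server and return the reply text."""
    body = json.dumps(build_payload(prompt, image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    # OpenAI-compatible responses carry the text in choices[0].message.content.
    return reply["choices"][0]["message"]["content"]
```

With the Ollama server running, `print(chat("Hello, GELab-Zero!"))` sends the same request as the `curl` command above. Keeping `build_payload` separate from `chat` lets the request body be inspected without a running server.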