---
title: L Operator Demo
emoji: 🤖
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.44.0
app_file: app.py
pinned: true
license: gpl
short_description: demo of l-operator with no commands
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 🤖 L-Operator: Android Device Control Demo

A complete multimodal Gradio demo for the [L-Operator model](https://huggingface.co/Tonic/l-android-control), a fine-tuned multimodal AI agent based on LiquidAI's LFM2-VL-1.6B model, optimized for Android device control through visual understanding and action generation.
## 🌟 Features

- **Multimodal Interface**: Upload Android screenshots and provide text instructions
- **Chat Interface**: Interactive chat with the model using Gradio's ChatInterface component
- **Action Generation**: Generate JSON actions for Android device control
- **Example Episodes**: Pre-loaded examples from extracted training episodes
- **Real-time Processing**: Optimized for real-time inference
- **Beautiful UI**: Modern, responsive interface with comprehensive documentation
- **⚡ ZeroGPU Compatible**: Dynamic GPU allocation for cost-effective deployment
## 📋 Model Details

| Property | Value |
|----------|-------|
| **Base Model** | [LiquidAI/LFM2-VL-1.6B](https://huggingface.co/LiquidAI/LFM2-VL-1.6B) |
| **Architecture** | LFM2-VL (1.6B parameters) |
| **Fine-tuning** | LoRA (Low-Rank Adaptation) |
| **Training Data** | Android control episodes with screenshots and actions |
| **License** | Proprietary (Investment Access Required) |
## 🚀 Quick Start

### Prerequisites

1. **Python 3.8+**: Ensure you have Python 3.8 or higher installed
2. **Hugging Face Access**: Request access to the [L-Operator model](https://huggingface.co/Tonic/l-android-control)
3. **Authentication**: Log in to Hugging Face using `huggingface-cli login`

### Installation

1. **Clone the repository**:

   ```bash
   git clone <repository-url>
   cd l-operator-demo
   ```

2. **Install dependencies**:

   ```bash
   pip install -r requirements.txt
   ```

3. **Authenticate with Hugging Face**:

   ```bash
   huggingface-cli login
   ```
### Running the Demo

1. **Start the demo**:

   ```bash
   python app.py
   ```

2. **Open your browser** and navigate to `http://localhost:7860`
3. **Load the model** by clicking the "🚀 Load L-Operator Model" button
4. **Upload an Android screenshot** and provide instructions
5. **Generate actions** or use the chat interface
## ⚡ ZeroGPU Deployment

This demo is optimized for [Hugging Face Spaces ZeroGPU](https://huggingface.co/docs/hub/spaces-zerogpu), providing dynamic GPU allocation for cost-effective deployment.

### ZeroGPU Features

- **🆓 Free GPU Access**: Dynamic NVIDIA H200 GPU allocation
- **⚡ On-Demand Resources**: GPUs allocated only when needed
- **💰 Cost Efficient**: Optimized resource utilization
- **🔄 Multi-GPU Support**: Leverage multiple GPUs concurrently
- **🛡️ Automatic Management**: Resources released after function completion

### ZeroGPU Specifications

| Specification | Value |
|---------------|-------|
| **GPU Type** | NVIDIA H200 slice |
| **Available VRAM** | 70GB per workload |
| **Supported Gradio** | 4+ |
| **Supported PyTorch** | 2.1.2, 2.2.2, 2.4.0, 2.5.1 |
| **Supported Python** | 3.10.13 |
| **Function Duration** | Up to 120 seconds per request |
### Deploying to Hugging Face Spaces

1. **Create a new Space** on Hugging Face:
   - Choose **Gradio SDK**
   - Select **ZeroGPU** in hardware options
   - Upload your code
2. **Space Configuration**:

   ```yaml
   # app.py is automatically detected
   # requirements.txt is automatically installed
   # ZeroGPU is automatically configured
   ```

3. **Access Requirements**:
   - **Personal accounts**: PRO subscription required
   - **Organizations**: Enterprise Hub subscription required
   - **Usage limits**: 10 Spaces (personal) / 50 Spaces (organization)
### ZeroGPU Integration Details

The demo automatically detects ZeroGPU availability and optimizes accordingly:

```python
# Automatic ZeroGPU detection
try:
    import spaces
    ZEROGPU_AVAILABLE = True
except ImportError:
    ZEROGPU_AVAILABLE = False

# GPU-optimized methods (decorated inside the demo class)
@spaces.GPU(duration=120)  # 2 minutes for action generation
def generate_action(self, image, goal, instruction):
    # GPU-accelerated inference
    pass

@spaces.GPU(duration=90)  # 1.5 minutes for chat responses
def chat_with_model(self, message, history, image):
    # Interactive chat with GPU acceleration
    pass
```
## 🎯 How to Use

### Basic Usage

1. **Load Model**: Click "🚀 Load L-Operator Model" to initialize the model
2. **Upload Screenshot**: Upload an Android device screenshot
3. **Provide Instructions**:
   - **Goal**: Describe what you want to achieve
   - **Step**: Provide specific step instructions
4. **Generate Action**: Click "🎯 Generate Action" to get JSON output

### Chat Interface

1. **Upload Screenshot**: Upload an Android screenshot
2. **Send Message**: Use the structured format:

   ```
   Goal: Open the Settings app and navigate to Display settings
   Step: Tap on the Settings app icon on the home screen
   ```

3. **Get Response**: The model will generate JSON actions
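When driving the chat interface from code, the structured Goal/Step message can be assembled programmatically. A minimal sketch (`format_chat_message` is a hypothetical helper, not part of the demo):

```python
def format_chat_message(goal: str, step: str) -> str:
    """Compose the Goal/Step message format the chat interface expects."""
    return f"Goal: {goal}\nStep: {step}"

message = format_chat_message(
    "Open the Settings app and navigate to Display settings",
    "Tap on the Settings app icon on the home screen",
)
print(message)
```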
### Example Episodes

The demo includes pre-loaded examples from the training episodes:

- **Episode 13**: Cruise deals app navigation
- **Episode 53**: Pinterest search for sustainability art
- **Episode 73**: Moon phases app usage
## 📊 Expected Output Format

The model generates JSON actions in the following format:

```json
{
  "action_type": "tap",
  "x": 540,
  "y": 1200,
  "text": "Settings",
  "app_name": "com.android.settings",
  "confidence": 0.92
}
```
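Generated text can wrap the JSON in extra tokens, so it helps to extract and parse the object defensively before acting on it. A minimal sketch using only the standard library (`extract_action` is a hypothetical helper, not part of the demo):

```python
import json
import re

def extract_action(generated_text: str) -> dict:
    """Pull the first JSON object out of the model's raw output."""
    match = re.search(r"\{.*\}", generated_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

raw = 'Action: {"action_type": "tap", "x": 540, "y": 1200, "confidence": 0.92}'
action = extract_action(raw)
print(action["action_type"], action["x"], action["y"])  # tap 540 1200
```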
### Action Types

- `tap`: Tap at specific coordinates
- `click`: Click at specific coordinates
- `scroll`: Scroll in a direction (up/down/left/right)
- `input_text`: Input text
- `open_app`: Open a specific app
- `wait`: Wait for a moment
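A consumer of these actions typically maps `action_type` to a device-automation handler. The sketch below uses placeholder handlers that return strings; a real integration would forward them to a backend such as adb (all handler names are hypothetical):

```python
def handle_tap(action):
    return f"tap at ({action['x']}, {action['y']})"

def handle_scroll(action):
    return f"scroll {action.get('direction', 'down')}"

# One handler per action type the model can emit.
HANDLERS = {
    "tap": handle_tap,
    "click": handle_tap,  # treated like tap in this sketch
    "scroll": handle_scroll,
    "input_text": lambda a: f"type {a['text']!r}",
    "open_app": lambda a: f"open {a['app_name']}",
    "wait": lambda a: "wait",
}

def dispatch(action: dict) -> str:
    handler = HANDLERS.get(action["action_type"])
    if handler is None:
        raise ValueError(f"unknown action_type: {action['action_type']}")
    return handler(action)

print(dispatch({"action_type": "tap", "x": 540, "y": 1200}))  # tap at (540, 1200)
```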
## 🛠️ Technical Details

### Model Configuration

- **Device**: Automatically detects CUDA/CPU
- **Precision**: bfloat16 for CUDA, float32 for CPU
- **Generation**: Temperature 0.7, top-p 0.9
- **Max Tokens**: 128 for action generation
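These settings correspond to standard keyword arguments of a transformers `model.generate(...)` call. A sketch of how they might be collected (`do_sample=True` is an assumption implied by the sampling settings, not stated above):

```python
# Generation settings from above, as kwargs for `model.generate(...)`.
# Temperature and top_p only take effect when sampling is enabled.
GENERATION_KWARGS = {
    "max_new_tokens": 128,  # cap for action generation
    "temperature": 0.7,
    "top_p": 0.9,
    "do_sample": True,
}
```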
### Architecture

- **Base Model**: LFM2-VL-1.6B from LiquidAI
- **Fine-tuning**: LoRA with rank 16, alpha 32
- **Target Modules**: q_proj, v_proj, fc1, fc2, linear, gate_proj, up_proj, down_proj
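Expressed with the peft library, the fine-tuning setup above might look like the following sketch (assumes peft is installed; parameters not listed above, such as dropout, are left at their defaults):

```python
from peft import LoraConfig

# LoRA rank/alpha and target modules as listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "v_proj", "fc1", "fc2",
        "linear", "gate_proj", "up_proj", "down_proj",
    ],
)
```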
### Performance

- **Model Size**: ~1.6B parameters
- **Memory Usage**: ~4GB VRAM (CUDA) / ~8GB RAM (CPU)
- **Inference Speed**: Optimized for real-time use
- **Accuracy**: 98% action accuracy on test episodes
## 🎯 Use Cases

### 1. Mobile App Testing

- Automated UI testing for Android applications
- Cross-device compatibility validation
- Regression testing with visual verification

### 2. Accessibility Applications

- Voice-controlled device navigation
- Assistive technology integration
- Screen reader enhancement tools

### 3. Remote Support

- Remote device troubleshooting
- Automated device configuration
- Support ticket automation

### 4. Development Workflows

- UI/UX testing automation
- User flow validation
- Performance testing integration
## ⚠️ Important Notes

### Access Requirements

- **Investment Access**: This model is proprietary technology available exclusively to qualified investors under NDA
- **Authentication Required**: Must be authenticated with Hugging Face
- **Evaluation Only**: Access granted solely for investment evaluation purposes
- **Confidentiality**: All technical details are confidential

### ZeroGPU Limitations

- **Compatibility**: Currently exclusive to the Gradio SDK
- **PyTorch Versions**: Limited to supported versions (2.1.2, 2.2.2, 2.4.0, 2.5.1)
- **Function Duration**: 60 seconds by default, customizable up to 120 seconds
- **Queue Priority**: PRO users get 5× more daily usage and the highest queue priority

### General Limitations

- **Market Hours**: Some features may be limited during market hours
- **Device Requirements**: Requires sufficient RAM/VRAM for model loading
- **Network**: Requires an internet connection for model download
- **Authentication**: Must have approved access to the model
## 🔧 Troubleshooting

### Common Issues

1. **Model Loading Error**:
   - Ensure you're authenticated: `huggingface-cli login`
   - Check your internet connection
   - Verify model access approval
2. **Memory Issues**:
   - Use the CPU if GPU memory is insufficient
   - Close other applications
   - Consider using smaller batch sizes
3. **Authentication Errors**:
   - Re-login to Hugging Face
   - Check access approval status
   - Contact support if issues persist
4. **ZeroGPU Issues**:
   - Verify ZeroGPU is selected in Space settings
   - Check PyTorch version compatibility
   - Ensure function duration is within limits

### Performance Optimization

- **GPU Usage**: Use CUDA for faster inference
- **Memory Management**: Monitor VRAM usage
- **Batch Processing**: Process multiple images efficiently
- **ZeroGPU Optimization**: Specify appropriate function durations
## 📞 Support

- **Investment Inquiries**: For investment-related questions and due diligence
- **Technical Support**: For technical issues with the demo
- **Model Access**: For access requests to the L-Operator model
- **ZeroGPU Support**: [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)

## 📄 License

This demo is provided under the same terms as the L-Operator model:

- **Proprietary Technology**: Owned by Tonic
- **Investment Evaluation**: Access granted solely for investment evaluation
- **NDA Required**: All access is subject to a Non-Disclosure Agreement
- **No Commercial Use**: No commercial use without written consent
## 🙏 Acknowledgments

- **LiquidAI**: For the base LFM2-VL model
- **Hugging Face**: For the transformers library, hosting, and ZeroGPU infrastructure
- **Gradio**: For the excellent UI framework

## 🔗 Links

- [L-Operator Model](https://huggingface.co/Tonic/l-android-control)
- [Base Model (LFM2-VL-1.6B)](https://huggingface.co/LiquidAI/LFM2-VL-1.6B)
- [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)
- [LiquidAI](https://liquid.ai/)
- [Tonic](https://tonic.ai/)
---

**Made with ❤️ by Tonic**