HusseinLezzaik committed dfb50d7 (verified), 1 parent: e8ea8b5

Upload README.md with huggingface_hub

Files changed (1): README.md (+7 −6)
README.md CHANGED
@@ -7,22 +7,23 @@ tags:
 - gui-agent
 - vision-language-model
 - screen-understanding
+- vla
 datasets:
-- TESS-Computer/agentnet
+- TESS-Computer/tess-agentnet
 base_model: HuggingFaceTB/SmolVLM2-500M-Instruct
 pipeline_tag: image-text-to-text
 ---
 
 # TESS-500M
 
-**TESS (Text-Enabled Screen Sense)** is a Vision-Language-Action model for computer use. Given a screenshot and natural language instruction, it predicts either a mouse action (click coordinates) or keyboard action (typing/shortcuts).
+**TESS** is a Vision-Language-Action (VLA) model for computer use, inspired by robotic VLAs. Given a screenshot and natural language instruction, it predicts either a mouse action (click coordinates) or keyboard action (typing/shortcuts).
 
 ## Model Description
 
 - **Base Model**: SmolVLM2-500M-Instruct
 - **Architecture**: SmolVLM + Router + Mouse/Keyboard heads
 - **Parameters**: 508M total, 48M trainable
-- **Training Data**: [AgentNet](https://huggingface.co/datasets/TESS-Computer/agentnet) (~312K samples)
+- **Training Data**: [tess-agentnet](https://huggingface.co/datasets/TESS-Computer/tess-agentnet) (~312K samples)
 
 ## Usage
 
@@ -31,7 +32,7 @@ import torch
 from PIL import Image
 
 # Clone the TESS repo
-# git clone https://github.com/yourusername/TESS.git
+# git clone https://github.com/husseinlezzaik/TESS.git
 # cd TESS/model
 
 from test_checkpoint import load_model, predict
@@ -101,9 +102,9 @@ Apache 2.0
 
 ```bibtex
 @misc{tess2024,
-  title={TESS: Text-Enabled Screen Sense},
+  title={TESS: A Vision-Language-Action Model for Computer Use},
   author={Hussein Lezzaik},
   year={2024},
-  url={https://github.com/yourusername/TESS}
+  url={https://github.com/husseinlezzaik/TESS}
 }
 ```
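
The model card's "SmolVLM + Router + Mouse/Keyboard heads" architecture implies a dispatch step: the router decides whether the mouse head's click coordinates or the keyboard head's text output becomes the predicted action. A minimal pure-Python sketch of that routing follows; the function name, the logit-sign convention, and the output dict are invented for illustration and are not the actual TESS implementation.

```python
# Illustrative sketch only: a router score selects between a "mouse" head
# (click coordinates) and a "keyboard" head (typed text / shortcut), as the
# model card describes. Names, threshold, and shapes are assumptions.

def route_action(router_logit, mouse_xy=None, keyboard_text=None):
    """Return a mouse action if the router logit is positive,
    otherwise a keyboard action."""
    if router_logit > 0:
        x, y = mouse_xy
        return {"type": "mouse", "x": x, "y": y}
    return {"type": "keyboard", "text": keyboard_text}

# A positive router logit selects the mouse head's coordinates:
print(route_action(1.3, mouse_xy=(0.42, 0.77)))
# A negative one selects the keyboard head's text or shortcut:
print(route_action(-0.8, keyboard_text="ctrl+s"))
```

In the real model both heads would run on the VLM's hidden state and the router would be a learned classifier; the sketch only shows the selection logic between the two action types.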