VladBoyko committed on
Commit 582b310 · verified · 1 Parent(s): efd8236

Update README.md

Files changed (1): README.md (+85 -97)
README.md CHANGED
@@ -1,7 +1,7 @@
  ---
- title: VibeThinker 1.5B Advanced
- emoji: 🧠⚡
- colorFrom: blue
  colorTo: purple
  sdk: gradio
  sdk_version: 5.49.1
@@ -10,121 +10,109 @@ pinned: false
  license: mit
  ---

- # VibeThinker-1.5B: Advanced Interface with vLLM

- A high-performance reasoning model interface featuring:

- ## ✨ Key Features

- - ⚡ **10-20x Faster Inference** with vLLM optimization
- - 🤔 **Collapsible Thinking Sections** - Explore the model's reasoning process
- - 💻 **Interactive Code Blocks** - Copy or download code with one click
- - 📝 **Structured Output Parsing** - Clean separation of thoughts, text, and code
- - 🎨 **Beautiful UI** - Modern, responsive design with syntax highlighting

- ## 🚀 Performance Highlights

- | Model | Parameters | AIME24 | AIME25 | Training Cost |
- |-------|------------|--------|--------|---------------|
- | VibeThinker-1.5B | **1.5B** | **80.3** | **74.4** | **$7,800** |
- | DeepSeek R1 | 671B | 79.8 | 70.0 | $294,000+ |

- **400× smaller, yet outperforms on math benchmarks!**

- ## 📖 Usage

- 1. Enter your math problem or coding challenge (English works best)
- 2. Adjust temperature and max tokens if needed
- 3. Click "Generate Solution"
- 4. Explore the thinking process, read the response, and interact with code blocks

- ## 🔧 Technical Details

- - **Backend**: vLLM for optimized inference
- - **GPU**: Runs efficiently on Nvidia T4 (16 GB VRAM)
- - **Context Length**: Supports up to 40,960 tokens
- - **Output Parsing**: Automatic detection of thinking, text, and code sections

- ## 📚 Resources

- - [GitHub Repository](https://github.com/WeiboAI/VibeThinker)
- - [Technical Report](https://huggingface.co/papers/2511.06221)
- - [Model Card](https://huggingface.co/WeiboAI/VibeThinker-1.5B)

- ## 🙏 Citation

- ```bibtex
- @misc{xu2025tinymodelbiglogic,
-   title={Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B},
-   author={Sen Xu and Yi Zhou and Wei Wang and Jixin Min and Zhibin Yin and Yingwei Dai and Shixi Liu and Lianyu Pang and Yirong Chen and Junlin Zhang},
-   year={2025},
-   eprint={2511.06221},
-   archivePrefix={arXiv},
-   primaryClass={cs.AI},
- }
- ```

- ---

- ## 🎯 **Key Improvements**
-
- ### **1. Performance (vLLM)**
- - **10-20x faster** inference compared to standard transformers
- - Better memory management with `gpu_memory_utilization=0.9`
- - Optimized for batch processing and long contexts
-
- ### **2. Output Parsing**
- The `parse_model_output()` function:
- - ✅ Extracts `<think>` tags for reasoning sections
- - ✅ Identifies code blocks with ` ``` ` markers
- - ✅ Separates regular text content
- - ✅ Handles nested and multiple sections
-
- ### **3. UI Enhancements**
-
- #### **Thinking Sections** 🤔
- - Collapsed by default (orange/yellow theme)
- - Click to expand and see reasoning
- - Monospace font for better readability
-
- #### **Code Blocks** 💻
- - Open by default (blue theme)
- - **Copy button** - One-click clipboard copy
- - **Download button** - Save as `.py`, `.js`, `.html`, etc.
- - Language-aware file extensions
- - Syntax highlighting ready (add Prism.js if needed)
-
- #### **Text Sections** 📝
- - Clean, readable font
- - Proper line spacing
- - Subtle borders and shadows
-
- ### **4. Production-Ready Features**
- - Error handling with user-friendly messages
- - Queue management for multiple users
- - Responsive design
- - Accessible controls
- - Example problems pre-loaded

- ---

- ## 📊 **Expected Performance**

- | Metric | Before (Transformers) | After (vLLM) | Improvement |
- |--------|----------------------|--------------|-------------|
- | **First Token Latency** | ~5-8s | ~0.5-1s | **8-10x faster** |
- | **Generation Speed** | ~10-15 tokens/s | ~100-150 tokens/s | **10x faster** |
- | **Total Time (8K tokens)** | ~400-600s | ~40-80s | **10x faster** |
- | **Memory Usage** | ~8-10GB | ~6-8GB | **More efficient** |

- A generation that takes ~400s with plain transformers on a T4 should drop to **40-80 seconds** with vLLM! 🎉

- ---

- ## 🚀 **Quick Deployment**

- 1. Upload the three files to your HuggingFace Space
- 2. Select **Nvidia T4 - small** hardware ($0.40/hour)
- 3. Wait for the build (~5-10 minutes for vLLM compilation)
- 4. Enjoy blazing-fast inference! ⚡

- The first build takes a bit longer because vLLM must compile, but runtime performance is dramatically better!
  ---
+ title: VibeThinker-1.5B Competitive Coding Assistant
+ emoji: 🧠
+ colorFrom: indigo
  colorTo: purple
  sdk: gradio
  sdk_version: 5.49.1
  license: mit
  ---

+ # 🧠 VibeThinker-1.5B Competitive Coding Assistant

+ An interactive demo of **VibeThinker-1.5B** optimized for competitive programming challenges.

+ ## ⚡ Performance Highlights

+ - **AIME24**: 80.3 (surpasses DeepSeek R1's 79.8)
+ - **AIME25**: 74.4 (vs DeepSeek R1's 70.0)
+ - **LiveCodeBench V6**: 51.1 (competitive coding)
+ - **Training Cost**: Only $7,800 USD
+ - **Parameters**: 1.5B (400× smaller than DeepSeek R1)

+ ## 🎯 What It's Best At

+ ✅ **Competitive Programming**: LeetCode, Codeforces, AtCoder-style algorithm problems
+ ✅ **Python Coding Challenges**: Problems with clear input/output specifications
+ ✅ **Mathematical Reasoning**: Complex proofs and formal reasoning tasks
+ ✅ **Algorithm Design**: Dynamic programming, graph algorithms, optimization problems

+ ## ⚠️ Important Limitations

+ This model is **specialized for competitive programming**, not general software development:

+ ❌ **Not suitable for**: Building applications, debugging real codebases, using specific libraries
+ ❌ **Limited knowledge**: Low encyclopedic knowledge, Python-focused training
+ ❌ **Overthinking tendency**: May generate verbose reasoning for simple tasks
+ ❌ **Narrow scope**: Optimized for benchmark-style problems, not production code

+ *See [community feedback analysis](https://www.reddit.com/r/LocalLLaMA/comments/1ou1emx/) for detailed real-world testing insights.*

+ ## 🚀 Features

+ - **🧠 Intelligent Parsing**: Automatic separation of reasoning and solution (see the sketch after this list)
+ - **📊 Token Tracking**: Real-time stats on generation time and token usage
+ - **💻 Clean Code Display**: Syntax-highlighted, copyable/downloadable code blocks
+ - **📱 Responsive Design**: Modern UI with collapsible reasoning sections
+ - **🎨 High Contrast**: Readable output with dark code blocks on a white background
+ - **🔄 Loop Detection**: Automatically detects and truncates repetitive output
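
+ As a rough illustration of the parsing step: the model wraps its reasoning in `<think>` tags, so reasoning and solution can be separated with a regex. This is a minimal sketch with an assumed helper name (`split_reasoning`), not the Space's exact code:

+ ```python
+ import re
+ 
+ def split_reasoning(output: str) -> tuple[str, str]:
+     """Split raw model output into (reasoning, solution).
+ 
+     Assumes reasoning is wrapped in <think>...</think>; everything
+     outside the tags is treated as the final solution text.
+     """
+     reasoning = "\n".join(re.findall(r"<think>(.*?)</think>", output, re.DOTALL))
+     solution = re.sub(r"<think>.*?</think>", "", output, flags=re.DOTALL).strip()
+     return reasoning.strip(), solution
+ ```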
+ ## 🛠️ Technical Details

+ ### Model Information
+ - **Base Model**: Qwen2.5-Math-1.5B
+ - **Training Method**: Spectrum-to-Signal Principle (SSP)
+   - Supervised Fine-Tuning (SFT) for solution diversity
+   - Reinforcement Learning (RL) for correct reasoning paths
+ - **Inference Engine**: Standard `transformers` library (PyTorch)
+ - **Token Efficiency**: Configurable thinking depth via prompt hints

+ ### Hardware Requirements
+ - **Recommended**: Nvidia T4 - small (16 GB VRAM)
+ - **Memory Usage**: ~3-4 GB VRAM (1.5B params × 2 bytes in float16 ≈ 3 GB, plus activations)
+ - **Cost**: $0.40/hour on HuggingFace Spaces

+ ### Implementation
+ ```python
+ # Plain transformers loading (sketch of this Space's setup)
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ 
+ tokenizer = AutoTokenizer.from_pretrained("WeiboAI/VibeThinker-1.5B")
+ model = AutoModelForCausalLM.from_pretrained(
+     "WeiboAI/VibeThinker-1.5B",
+     torch_dtype=torch.float16,  # float16 for efficiency
+     device_map="auto",          # automatic GPU placement
+ )  # generate() uses repetition_penalty=1.1; loops are detected and truncated
+ ```
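
+ The loop detection mentioned above isn't specified further; one plausible heuristic is a repeated-trailing-chunk check. A minimal sketch, with an assumed helper name `truncate_loops` (not the Space's actual code):

+ ```python
+ def truncate_loops(text: str, chunk: int = 80, repeats: int = 3) -> str:
+     """If the last `chunk` characters repeat `repeats` times back-to-back
+     at the end of the text, keep only one copy (assumed heuristic)."""
+     tail = text[-chunk:]
+     if len(text) >= chunk * repeats and text.endswith(tail * repeats):
+         while text.endswith(tail + tail):
+             text = text[:-len(tail)]  # drop one redundant copy per pass
+     return text
+ ```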

+ ## 📖 Usage Tips

+ ### For Best Results:
+ 1. **Frame problems competitively**: Clear input/output, edge cases, constraints
+ 2. **Adjust thinking tokens** (see the generation sketch after this list):
+    - 1024-2048 for quick, simple problems
+    - 3072-4096 for standard algorithm challenges
+    - 6144-8192 for complex multi-step reasoning
+ 3. **Use Python**: The model was trained primarily on Python code
+ 4. **Specify format**: Request a specific output format (function, class, test cases)
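
+ Those token budgets map onto `max_new_tokens` in a standard `transformers` generation call. A sketch reusing the `tokenizer` and `model` from the Implementation snippet (chat template and sampling settings are assumptions; tune per problem):

+ ```python
+ messages = [{"role": "user", "content": "Longest increasing subsequence in O(n log n)?"}]
+ inputs = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ ).to(model.device)
+ 
+ output = model.generate(
+     inputs,
+     max_new_tokens=4096,     # 3072-4096 suits standard algorithm challenges
+     do_sample=True,
+     temperature=0.6,         # assumed default
+     repetition_penalty=1.1,  # reduces repetitive loops
+ )
+ print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
+ ```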
+ ### Example Prompts:
+ ```
+ ✅ Good: "Write a function to find the longest increasing subsequence.
+ Include time/space complexity analysis and test with [10,9,2,5,3,7,101,18]"

+ ✅ Good: "Implement Dijkstra's algorithm with a min-heap. Handle disconnected graphs."

+ ❌ Poor: "Debug my React app" (not its purpose)
+ ❌ Poor: "How do I use pandas?" (limited library knowledge)
+ ```

+ ## 🔗 Resources

+ - **Model**: [WeiboAI/VibeThinker-1.5B](https://huggingface.co/WeiboAI/VibeThinker-1.5B)
+ - **Paper**: [arXiv:2511.06221](https://arxiv.org/abs/2511.06221)
+ - **GitHub**: [WeiboAI/VibeThinker](https://github.com/WeiboAI/VibeThinker)
+ - **License**: MIT

+ ## 🙏 Credits

+ Developed by **WeiboAI**. This Space demonstrates the model with a clean interface and enhanced user experience.
+
+ ## 📝 Citation
+
+ ```bibtex
+ @misc{xu2025tinymodelbiglogic,
+   title={Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B},
+   author={Sen Xu and Yi Zhou and Wei Wang and Jixin Min and Zhibin Yin and Yingwei Dai and Shixi Liu and Lianyu Pang and Yirong Chen and Junlin Zhang},
+   year={2025},
+   eprint={2511.06221},
+   archivePrefix={arXiv},
+   primaryClass={cs.AI},
+ }
+ ```