Zichen Wen

zichenwen

Recent Activity

upvoted a paper 5 days ago

Accelerating Streaming Video Large Language Models via Hierarchical Token Compression

published a dataset 12 days ago

EPIC-Streaming-Video-Model/HJJ_code

upvoted a paper about 1 month ago

Scaling Agent Learning via Experience Synthesis

upvoted a paper 5 days ago

Accelerating Streaming Video Large Language Models via Hierarchical Token Compression

Paper • 2512.00891 • Published 7 days ago • 14

upvoted 4 papers about 1 month ago

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published Nov 5 • 80

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30 • 114

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

Paper • 2510.25760 • Published Oct 29 • 16

Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1

Paper • 2510.19600 • Published Oct 22 • 68

upvoted 3 papers about 2 months ago

AI for Service: Proactive Assistance with AI Glasses

Paper • 2510.14359 • Published Oct 16 • 73

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods

Paper • 2510.07143 • Published Oct 8 • 12

upvoted 5 papers 2 months ago

Compose Your Policies! Improving Diffusion-based or Flow-based Robot Policies via Test-time Distribution-level Composition

Paper • 2510.01068 • Published Oct 1 • 19

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Paper • 2510.00515 • Published Oct 1 • 39

Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning

Paper • 2509.23873 • Published Sep 28 • 67

Socratic-Zero : Bootstrapping Reasoning via Data-Free Agent Co-evolution

Paper • 2509.24726 • Published Sep 29 • 19

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26 • 136

upvoted a paper 3 months ago

PANORAMA: The Rise of Omnidirectional Vision in the Embodied AI Era

Paper • 2509.12989 • Published Sep 16 • 28

upvoted 3 papers 4 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 256

Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts

Paper • 2508.07785 • Published Aug 11 • 28

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

Paper • 2508.09987 • Published Aug 13 • 25

upvoted 3 papers 5 months ago

Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning

Paper • 2507.17512 • Published Jul 23 • 36

GUI-G^2: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21 • 133

WebShaper: Agentically Data Synthesizing via Information-Seeking Formalization

Paper • 2507.15061 • Published Jul 20 • 60