Read Later Stack - a Sheiphan Collection

Sheiphan's Collections

Read Later Stack

updated Oct 22, 2025

Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published Oct 13, 2025 • 32

Note: Added
Self-Improving LLM Agents at Test-Time

Paper • 2510.07841 • Published Oct 9, 2025 • 10
Making Mathematical Reasoning Adaptive

Paper • 2510.04617 • Published Oct 6, 2025 • 23
DocReward: A Document Reward Model for Structuring and Stylizing

Paper • 2510.11391 • Published Oct 13, 2025 • 27

Note: Added Obsidian
Skill-Targeted Adaptive Training

Paper • 2510.10023 • Published Oct 11, 2025 • 10
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published Oct 13, 2025 • 179
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation

Paper • 2510.09116 • Published Oct 10, 2025 • 96
FlashVSR: Towards Real-Time Diffusion-Based Streaming Video Super-Resolution

Paper • 2510.12747 • Published Oct 14, 2025 • 39
Dr.LLM: Dynamic Layer Routing in LLMs

Paper • 2510.12773 • Published Oct 14, 2025 • 32
A Survey of Vibe Coding with Large Language Models

Paper • 2510.12399 • Published Oct 14, 2025 • 50
Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks

Paper • 2510.12635 • Published Oct 14, 2025 • 17
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens

Paper • 2510.11919 • Published Oct 13, 2025 • 5
HoneyBee: Data Recipes for Vision-Language Reasoners

Paper • 2510.12225 • Published Oct 14, 2025 • 11
Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13, 2025 • 166
Don't Just Fine-tune the Agent, Tune the Environment

Paper • 2510.10197 • Published Oct 11, 2025 • 29
InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue

Paper • 2510.13747 • Published Oct 15, 2025 • 30
Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs

Paper • 2510.13795 • Published Oct 15, 2025 • 58
The Art of Scaling Reinforcement Learning Compute for LLMs

Paper • 2510.13786 • Published Oct 15, 2025 • 32
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA

Paper • 2510.04849 • Published Oct 6, 2025 • 115
Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published Oct 16, 2025 • 106
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale

Paper • 2510.14979 • Published Oct 16, 2025 • 67
TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar

Paper • 2510.14972 • Published Oct 16, 2025 • 35
BitNet Distillation

Paper • 2510.13998 • Published Oct 15, 2025 • 58
Attention Is All You Need for KV Cache in Diffusion LLMs

Paper • 2510.14973 • Published Oct 16, 2025 • 41
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16, 2025 • 111
Large Language Models Do NOT Really Know What They Don't Know

Paper • 2510.09033 • Published Oct 10, 2025 • 17
Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures

Paper • 2510.14616 • Published Oct 16, 2025 • 13
LiveResearchBench: A Live Benchmark for User-Centric Deep Research in the Wild

Paper • 2510.14240 • Published Oct 16, 2025 • 12
Qwen3Guard Technical Report

Paper • 2510.14276 • Published Oct 16, 2025 • 15
MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems

Paper • 2510.14252 • Published Oct 16, 2025 • 3
RAGCap-Bench: Benchmarking Capabilities of LLMs in Agentic Retrieval Augmented Generation Systems

Paper • 2510.13910 • Published Oct 15, 2025 • 2
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

Paper • 2510.16872 • Published Oct 19, 2025 • 109
Glyph: Scaling Context Windows via Visual-Text Compression

Paper • 2510.17800 • Published Oct 20, 2025 • 68
When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling

Paper • 2510.15346 • Published Oct 17, 2025 • 34
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation

Paper • 2510.17354 • Published Oct 20, 2025 • 35
Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling

Paper • 2510.16751 • Published Oct 19, 2025 • 21
FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published Oct 20, 2025 • 75
Beyond Pipelines: A Survey of the Paradigm Shift toward Model-Native Agentic AI

Paper • 2510.16720 • Published Oct 19, 2025 • 8
UltraCUA: A Foundation Model for Computer Use Agents with Hybrid Action

Paper • 2510.17790 • Published Oct 20, 2025 • 6
AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning

Paper • 2510.16156 • Published Oct 17, 2025 • 2
Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering

Paper • 2510.14605 • Published Oct 16, 2025 • 5