Read Later Stack
Demystifying Reinforcement Learning in Agentic Reasoning
Paper
•
2510.11701
•
Published
•
32
Note
Added
Self-Improving LLM Agents at Test-Time
Paper
•
2510.07841
•
Published
•
10
Making Mathematical Reasoning Adaptive
Paper
•
2510.04617
•
Published
•
23
DocReward: A Document Reward Model for Structuring and Stylizing
Paper
•
2510.11391
•
Published
•
27
Note
Added Obsidian
Skill-Targeted Adaptive Training
Paper
•
2510.10023
•
Published
•
10
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
Paper
•
2510.11696
•
Published
•
179
DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation
Paper
•
2510.09116
•
Published
•
96
FlashVSR: Towards Real-Time Diffusion-Based Streaming Video Super-Resolution
Paper
•
2510.12747
•
Published
•
39
Dr.LLM: Dynamic Layer Routing in LLMs
Paper
•
2510.12773
•
Published
•
32
A Survey of Vibe Coding with Large Language Models
Paper
•
2510.12399
•
Published
•
50
Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks
Paper
•
2510.12635
•
Published
•
17
LLM Reasoning for Machine Translation: Synthetic Data Generation over Thinking Tokens
Paper
•
2510.11919
•
Published
•
5
HoneyBee: Data Recipes for Vision-Language Reasoners
Paper
•
2510.12225
•
Published
•
11
Diffusion Transformers with Representation Autoencoders
Paper
•
2510.11690
•
Published
•
166
Don't Just Fine-tune the Agent, Tune the Environment
Paper
•
2510.10197
•
Published
•
29
InteractiveOmni: A Unified Omni-modal Model for Audio-Visual Multi-turn Dialogue
Paper
•
2510.13747
•
Published
•
30
Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs
Paper
•
2510.13795
•
Published
•
58
The Art of Scaling Reinforcement Learning Compute for LLMs
Paper
•
2510.13786
•
Published
•
32
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA
Paper
•
2510.04849
•
Published
•
115
Agentic Entropy-Balanced Policy Optimization
Paper
•
2510.14545
•
Published
•
106
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale
Paper
•
2510.14979
•
Published
•
67
TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar
Paper
•
2510.14972
•
Published
•
35
Paper
•
2510.13998
•
Published
•
58
Attention Is All You Need for KV Cache in Diffusion LLMs
Paper
•
2510.14973
•
Published
•
41
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model
Paper
•
2510.14528
•
Published
•
111
Large Language Models Do NOT Really Know What They Don't Know
Paper
•
2510.09033
•
Published
•
17
Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures
Paper
•
2510.14616
•
Published
•
13
LiveResearchBench: A Live Benchmark for User-Centric Deep Research in the Wild
Paper
•
2510.14240
•
Published
•
12
Qwen3Guard Technical Report
Paper
•
2510.14276
•
Published
•
15
MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems
Paper
•
2510.14252
•
Published
•
3
RAGCap-Bench: Benchmarking Capabilities of LLMs in Agentic Retrieval Augmented Generation Systems
Paper
•
2510.13910
•
Published
•
2
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
Paper
•
2510.16872
•
Published
•
109
Glyph: Scaling Context Windows via Visual-Text Compression
Paper
•
2510.17800
•
Published
•
68
When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling
Paper
•
2510.15346
•
Published
•
34
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation
Paper
•
2510.17354
•
Published
•
35
Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling
Paper
•
2510.16751
•
Published
•
21
FineVision: Open Data Is All You Need
Paper
•
2510.17269
•
Published
•
75
Beyond Pipelines: A Survey of the Paradigm Shift toward Model-Native Agentic AI
Paper
•
2510.16720
•
Published
•
8
UltraCUA: A Foundation Model for Computer Use Agents with Hybrid Action
Paper
•
2510.17790
•
Published
•
6
AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning
Paper
•
2510.16156
•
Published
•
2
Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering
Paper
•
2510.14605
•
Published
•
5