arxiv:2601.08225

User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale

Published on Jan 13

Submitted by Minbyul Jeong on Jan 14

upstage

Authors:

Sungrae Park

Abstract

Large reasoning models enable scalable multi-turn dialogue generation through automated task-oriented simulation and user-oriented behavioral modeling for enhanced human-agent interaction datasets.

AI-generated summary

The recent paradigm shift toward large reasoning models (LRMs) as autonomous agents has intensified the demand for sophisticated, multi-turn tool-use capabilities. Yet, existing datasets and data-generation approaches are limited by static, predefined toolsets that cannot scale to the complexity of open-ended human-agent collaboration. To address this, we initially developed a framework for automated task-oriented multi-turn dialogue generation at scale, utilizing an LRM-based simulator to dynamically generate high-value, domain-specific tools to solve specified tasks. However, we observe that a purely task-oriented design often results in "solely task-solving" trajectories, where the agent completes the objective with minimal interaction, failing to generate the high turn-count conversations seen in realistic scenarios. To bridge this gap, we shift toward a user-oriented simulation paradigm. By decoupling task generation from a dedicated user simulator that mimics human behavioral rules - such as incremental request-making and turn-by-turn feedback - we facilitate more authentic, extended multi-turn dialogues that reflect the iterative nature of real-world problem solving. Our generation pipeline operates as a versatile, plug-and-play module capable of initiating generation from any state, ensuring high scalability in producing extended tool-use data. Furthermore, by facilitating multiple task completions within a single trajectory, it yields a high-density dataset that reflects the multifaceted demands of real-world human-agent interaction.
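To make the user-oriented loop described in the abstract concrete, here is a minimal sketch of how a user simulator might be decoupled from task generation. It is not the authors' implementation: the names `Turn`, `UserSimulator`, `echo_agent`, and `generate_dialogue` are hypothetical, the "user" is a fixed rule (reveal one sub-request per turn, acknowledge the previous reply) rather than an LRM, and the agent is a stub that fakes a single tool call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Turn:
    role: str      # "user", "tool", or "assistant"
    content: str


@dataclass
class UserSimulator:
    """Hypothetical rule-based user: reveals one sub-request per turn."""
    pending: List[str]

    def next_message(self, history: List[Turn]) -> Optional[str]:
        if not self.pending:
            return None  # this task has been fully requested
        request = self.pending.pop(0)
        # Turn-by-turn feedback: acknowledge the agent's last reply, if any.
        if any(t.role == "assistant" for t in history):
            return f"Thanks. Next: {request}"
        return request


def echo_agent(history: List[Turn]) -> List[Turn]:
    """Stub agent (assumption): one fake tool call, then a reply."""
    request = history[-1].content
    return [
        Turn("tool", f"lookup({request!r})"),
        Turn("assistant", f"Done: {request}"),
    ]


def generate_dialogue(
    tasks: List[List[str]],
    agent: Callable[[List[Turn]], List[Turn]],
    history: Optional[List[Turn]] = None,
) -> List[Turn]:
    """Run several tasks in one trajectory, resuming from `history` if given."""
    trajectory = list(history or [])
    for sub_requests in tasks:
        user = UserSimulator(list(sub_requests))
        while (message := user.next_message(trajectory)) is not None:
            trajectory.append(Turn("user", message))
            trajectory.extend(agent(trajectory))
    return trajectory
```

Passing an existing trajectory as `history` mirrors the "initiating generation from any state" property, and supplying several task lists mirrors multiple task completions within a single trajectory. In a real pipeline, both the simulator and the agent would presumably be model-backed.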

Community

Minbyul

Paper submitter 1 day ago

While large language models have shown remarkable progress in tool use, maintaining high-quality, user-centric multi-turn conversations at scale remains a significant challenge.

Our work focuses on:
(1) Generating high-fidelity multi-turn dialogue datasets designed for practical tool-use scenarios.
(2) Enhancing model performance in complex, user-oriented interactions.
(3) Providing insights into scaling dialogue generation without compromising on user experience.

Check out the full paper here: https://arxiv.org/abs/2601.08225

mindplay

about 18 hours ago

This sounds silly.

If you fine-tune the model for "high turn-count conversations", it will (literally) learn to converse about things it could have just answered instead.

LLMs don't have any thought process - they don't know what they know before predicting the next token.