Papers
arxiv:2512.23611

Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing

Published on Dec 29, 2025
Authors:
,
,
,
,
,
,
,
,
,

Abstract

Enabling Large Language Models (LLMs) to reliably invoke external tools remains a critical bottleneck for autonomous agents. Existing approaches suffer from three fundamental challenges: expensive human annotation for high-quality trajectories, poor generalization to unseen tools, and quality ceilings inherent in single-model synthesis that perpetuate biases and coverage gaps. We introduce InfTool, a fully autonomous framework that breaks these barriers through self-evolving multi-agent synthesis. Given only raw API specifications, InfTool orchestrates three collaborative agents (User Simulator, Tool-Calling Assistant, and MCP Server) to generate diverse, verified trajectories spanning single-turn calls to complex multi-step workflows. The framework establishes a closed loop: synthesized data trains the model via Group Relative Policy Optimization (GRPO) with gated rewards, the improved model generates higher-quality data targeting capability gaps, and this cycle iterates without human intervention. Experiments on the Berkeley Function-Calling Leaderboard (BFCL) demonstrate that InfTool transforms a base 32B model from 19.8% to 70.9% accuracy (+258%), surpassing models 10x larger and rivaling Claude-Opus, and entirely from synthetic data without human annotation.

Community

Sign up or log in to comment

Models citing this paper 19

Browse 19 models citing this paper

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2512.23611 in a dataset README.md to link it from this page.

Spaces citing this paper 1

Collections including this paper 2