Reinforcement Learning-Guided Coordination Mechanism for LLM-Based Agents in Sequential Decision Tasks

Authors

  • Ka Wai Wong, Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
  • Chun Ho Chan, Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China
  • Tsz Lok Lee, Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR, China

DOI:

https://doi.org/10.71465/fias768

Keywords:

Reinforcement learning, Sequential decision-making, Multi-agent coordination, Temporal credit assignment, Language agents

Abstract

Coordinating multiple language-driven agents in sequential decision-making scenarios remains challenging due to inconsistent policy updates and delayed feedback signals. This study presents a reinforcement learning-guided coordination mechanism that integrates temporal credit assignment with interaction-aware policy updates. The model is trained on a benchmark dataset of 11,300 sequential decision tasks, including multi-step planning and resource allocation scenarios. A temporal-difference learning scheme is combined with communication-aware reward signals to improve coordination efficiency. Experimental results indicate that the proposed approach increases cumulative task reward by 27.1% and reduces policy oscillation by 35.6% compared to baseline decentralized agents. Furthermore, convergence speed improves by 18%, demonstrating enhanced stability in long-horizon decision processes.
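The abstract's core mechanism, a temporal-difference update driven by a task reward plus a communication-aware shaping term, can be sketched in a minimal tabular form. Everything below is an illustrative assumption, not the authors' implementation: the `td_update` function, the toy chain environment, and the randomized `comm_bonus` term are all hypothetical stand-ins for the paper's interaction-aware reward signals.

```python
# Hypothetical sketch: TD(0) value updates where the effective reward
# combines the task reward with a communication-aware bonus. The
# environment, parameters, and bonus schedule are illustrative only.
import random

def td_update(V, s, s_next, task_reward, comm_bonus,
              alpha=0.1, gamma=0.95):
    """One temporal-difference update on a tabular value function V.

    The effective reward adds a shaping bonus (e.g. for a message
    that improved a teammate's decision) to the task reward.
    Returns the TD error for monitoring policy oscillation.
    """
    r = task_reward + comm_bonus
    td_error = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + alpha * td_error
    return td_error

# Toy 5-state chain: the agent earns reward 1.0 on reaching state 4.
random.seed(0)
V = {}
for episode in range(200):
    s = 0
    while s < 4:
        s_next = s + 1
        task_reward = 1.0 if s_next == 4 else 0.0
        # Assumed shaping term: small bonus on steps where a
        # (simulated) useful inter-agent message was exchanged.
        comm_bonus = 0.05 if random.random() < 0.5 else 0.0
        td_update(V, s, s_next, task_reward, comm_bonus)
        s = s_next
```

After enough episodes the learned values decay smoothly with distance from the goal, and the magnitude of the returned TD errors gives a simple proxy for the policy-oscillation metric the abstract reports.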

Published

2026-04-05