Reinforcement Learning-Guided Coordination Mechanism for LLM-Based Agents in Sequential Decision Tasks
DOI:
https://doi.org/10.71465/fias768
Keywords:
Reinforcement learning, Sequential decision-making, Multi-agent coordination, Temporal credit assignment, Language agents
Abstract
Coordinating multiple language-driven agents in sequential decision-making scenarios remains challenging due to inconsistent policy updates and delayed feedback signals. This study presents a reinforcement learning-guided coordination mechanism that integrates temporal credit assignment with interaction-aware policy updates. The model is trained on a benchmark dataset of 11,300 sequential decision tasks, including multi-step planning and resource allocation scenarios. A temporal-difference learning scheme is combined with communication-aware reward signals to improve coordination efficiency. Experimental results indicate that the proposed approach increases cumulative task reward by 27.1% and reduces policy oscillation by 35.6% compared to baseline decentralized agents. Furthermore, convergence speed improves by 18%, demonstrating enhanced stability in long-horizon decision processes.
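The abstract describes combining a temporal-difference learning scheme with communication-aware reward signals. A minimal sketch of such an update is shown below; all names and coefficients (`comm_quality`, `beta`, `alpha`, `gamma`) are illustrative assumptions, not details taken from the paper.

```python
def td_update(V, s, r_task, comm_quality, s_next,
              alpha=0.1, gamma=0.99, beta=0.5):
    """One tabular TD(0) step on value function V (a dict).

    The shaped reward adds a bonus proportional to `comm_quality`,
    a hypothetical stand-in for a communication-aware signal.
    """
    r = r_task + beta * comm_quality              # communication-aware shaping
    td_error = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + alpha * td_error       # standard TD(0) update
    return td_error

# Usage: one update from state "s0" to "s1" with task reward 1.0
# and a communication-quality score of 0.2.
V = {}
err = td_update(V, "s0", 1.0, 0.2, "s1")
```

In this sketch the shaping term rewards agents whose messages are judged useful, folding coordination quality directly into the TD target.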
License
Copyright (c) 2026 Ka Wai Wong, Chun Ho Chan, Tsz Lok Lee (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.