Temporal Difference Learning with Adaptive Horizon Control for Stable Multi-Step Collaborative Decisions

Authors

  • Wei Chen, School of Computing, National University of Singapore, Singapore 117417, Singapore
  • Jun Jie Tan, School of Computing, National University of Singapore, Singapore 117417, Singapore
  • Li Wang, School of Computing, National University of Singapore, Singapore 117417, Singapore

DOI:

https://doi.org/10.71465/fair778

Keywords:

Temporal difference learning, Adaptive horizon, Reinforcement learning, Multi-step decision-making, Stability

Abstract

Instability in long-horizon decision-making often arises from improper credit assignment across extended time steps. This study explores an adaptive horizon control mechanism integrated with temporal difference (TD) learning to improve stability in multi-step collaborative tasks. Instead of using a fixed discount factor, the method dynamically adjusts the effective planning horizon based on reward sparsity and task progression signals. The approach is validated on 10,300 multi-step decision sequences with horizon lengths ranging from 10 to 50 steps. Compared with standard TD learning, the proposed method reduces cumulative reward variance by 25.9% and improves final task success rate by 14.6%. Furthermore, convergence is achieved with fewer training iterations, indicating improved learning efficiency. The results suggest that adaptive horizon control is a practical solution for stabilizing long-range coordination.
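The abstract does not specify the exact update rule, but the core idea of replacing a fixed discount factor with a horizon adapted to reward sparsity can be sketched as follows. This is a minimal illustration, not the authors' implementation: the mapping gamma = 1 - 1/H, the sparsity-based `adapt_horizon` rule, and the toy chain environment are all assumptions chosen to match the 10-to-50-step horizon range mentioned in the abstract.

```python
def effective_gamma(horizon):
    # gamma = 1 - 1/H yields an effective planning horizon of roughly H steps.
    return 1.0 - 1.0 / horizon

def adapt_horizon(recent_rewards, h_min=10, h_max=50):
    # Hypothetical adaptation rule: sparse rewards -> longer horizon,
    # dense rewards -> shorter horizon (tighter credit assignment).
    if not recent_rewards:
        return h_max
    density = sum(1 for r in recent_rewards if r != 0) / len(recent_rewards)
    return h_max - (h_max - h_min) * density

def td0_adaptive(env_step, n_states, episodes=200, alpha=0.1, window=20):
    """Tabular TD(0) value estimation with an adaptively discounted target."""
    V = [0.0] * n_states
    recent = []  # sliding window of observed rewards
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            s2, r, done = env_step(s)
            recent = (recent + [r])[-window:]
            gamma = effective_gamma(adapt_horizon(recent))
            target = r + (0.0 if done else gamma * V[s2])
            V[s] += alpha * (target - V[s])
            s = s2
    return V

# Toy 5-state chain: deterministic right moves, reward 1 on reaching the end.
def chain_step(s):
    s2 = s + 1
    return s2, (1.0 if s2 == 4 else 0.0), s2 == 4

values = td0_adaptive(chain_step, n_states=5)
```

With a fixed gamma, the same loop would use a constant effective horizon regardless of how often rewards arrive; here the horizon contracts as reward signals become dense, which is one plausible reading of "adjusts the effective planning horizon based on reward sparsity."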

Downloads

Download data is not yet available.

Published

2026-04-01