Ledger-Guided Memory for Agentic Web RL: Structured Cost–Risk Accounting in Long-Horizon Decision Models

Andrew J. Patel; Emily R. Thompson; Benjamin K. Lee

doi:10.71465/fapm750

Authors

Andrew J. Patel, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave., Room 38-401, Cambridge, MA 02139, USA
Emily R. Thompson, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave., Room 38-401, Cambridge, MA 02139, USA
Benjamin K. Lee, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 77 Massachusetts Ave., Room 38-401, Cambridge, MA 02139, USA

Keywords:

Structured memory, web agents, long-horizon RL, budget accounting, risk tracking, decision models, state extraction

Abstract

Long-horizon web tasks require consistent tracking of intermediate commitments (selected filters, filled fields, account states) and ongoing budgets (requests, latency, monetary fees) while avoiding repeated risky actions. We propose the Ledger Memory Agent (LeMA), an agentic RL model equipped with a structured memory ledger that records (a) action commitments, (b) remaining multi-cost budgets, and (c) accumulated risk signals. The ledger is updated by a learned parser that extracts key-value state facts from the DOM/text and attaches provenance to each fact. Policy learning is augmented with (i) a ledger-consistency loss that penalizes actions contradicting recorded commitments, and (ii) a budget–risk controller that modulates exploration based on remaining budgets and predicted risk spikes. The recommended evaluation uses 1,000–2,000 tasks with longer horizons (12–30 steps) and compares LeMA against standard recurrent and Transformer memory agents. Metrics include success rate, redundant-action rate, cost overshoot, failure frequency, and ledger factual accuracy (F1 on extracted state facts). LeMA is designed to improve reliability by making cost–risk accounting explicit rather than implicit in hidden states.
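To make the ledger idea concrete, the following is a minimal sketch of how the three ledger components and the two training-time signals might be organized. It is not the authors' implementation: the names `Fact`, `Ledger`, `new_ledger`, `consistency_penalty`, and `exploration_rate` are hypothetical, the indicator-style penalty stands in for the learned ledger-consistency loss, and the exploration formula is an assumed heuristic rather than the controller described in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Fact:
    """A key-value state fact plus provenance (where and when it was extracted)."""
    value: str
    source: str  # hypothetical provenance tag, e.g. a DOM selector
    step: int


@dataclass
class Ledger:
    """Holds (a) commitments, (b) remaining multi-cost budgets, (c) accumulated risk."""
    initial_budgets: Dict[str, float]
    budgets: Dict[str, float]
    commitments: Dict[str, Fact] = field(default_factory=dict)
    risk: float = 0.0

    def record(self, key: str, value: str, source: str, step: int) -> None:
        # Store the latest extracted fact for this key together with its provenance.
        self.commitments[key] = Fact(value, source, step)

    def charge(self, costs: Dict[str, float]) -> None:
        # Deduct each incurred cost from the matching remaining budget.
        for name, amount in costs.items():
            self.budgets[name] = self.budgets.get(name, 0.0) - amount

    def add_risk(self, signal: float) -> None:
        self.risk += signal

    def contradicts(self, key: str, value: str) -> bool:
        # An action contradicts the ledger if it sets a committed key to a different value.
        fact = self.commitments.get(key)
        return fact is not None and fact.value != value

    def min_budget_fraction(self) -> float:
        # Fraction remaining of the tightest budget, clipped at zero.
        fractions = [
            max(self.budgets.get(name, 0.0), 0.0) / total
            for name, total in self.initial_budgets.items()
            if total > 0
        ]
        return min(fractions) if fractions else 1.0


def new_ledger(budgets: Dict[str, float]) -> Ledger:
    """Create a ledger whose remaining budgets start at their initial values."""
    return Ledger(initial_budgets=dict(budgets), budgets=dict(budgets))


def consistency_penalty(ledger: Ledger, action: Tuple[str, str], weight: float = 1.0) -> float:
    """Assumed stand-in for the consistency loss: a fixed penalty on contradiction."""
    key, value = action
    return weight if ledger.contradicts(key, value) else 0.0


def exploration_rate(ledger: Ledger, predicted_risk: float, base_eps: float = 0.2) -> float:
    """Assumed heuristic: shrink exploration as budgets run low or risk rises."""
    risk_factor = 1.0 / (1.0 + ledger.risk + max(predicted_risk, 0.0))
    return base_eps * ledger.min_budget_fraction() * risk_factor


ledger = new_ledger({"requests": 10.0, "fee_usd": 5.0})
ledger.record("filter.color", "red", source="#color-select", step=1)
ledger.charge({"requests": 5.0})
ledger.add_risk(1.0)
print(consistency_penalty(ledger, ("filter.color", "blue")))
print(exploration_rate(ledger, predicted_risk=0.0))
```

In this sketch the penalty would be added to the policy loss for any action that overwrites a committed value, and the exploration rate is scaled by the tightest remaining budget so that a single exhausted cost dimension is enough to suppress exploration.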
