Ledger-Guided Memory for Agentic Web RL: Structured Cost–Risk Accounting in Long-Horizon Decision Models
DOI: https://doi.org/10.71465/fapm750
Keywords: Structured memory, web agents, long-horizon RL, budget accounting, risk tracking, decision models, state extraction
Abstract
Long-horizon web tasks require consistent tracking of intermediate commitments (selected filters, filled fields, account states) and ongoing budgets (requests, latency, monetary fees) while avoiding repeated risky actions. We propose Ledger Memory Agent (LeMA), an agentic RL model equipped with a structured memory ledger that records (a) action commitments, (b) remaining multi-cost budgets, and (c) accumulated risk signals. The ledger is updated by a learned parser that extracts key-value state facts from the DOM/text and attaches provenance to each fact. Policy learning is augmented with (i) a ledger-consistency loss penalizing actions that contradict recorded commitments, and (ii) a budget–risk controller that modulates exploration based on remaining budgets and predicted risk spikes. Recommended evaluation uses 1,000–2,000 tasks with longer horizons (12–30 steps), comparing against standard recurrent/Transformer memory agents. Metrics include success rate, redundant action rate, cost overshoot, failure frequency, and ledger factual accuracy (F1 on extracted state facts). LeMA improves reliability by making cost–risk accounting explicit rather than implicit in hidden states.
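
To make the three ledger components and the two training-time signals concrete, the following is a minimal illustrative sketch, not the authors' implementation. All names (`Fact`, `Ledger`, `consistency_penalty`, `exploration_rate`), the count-based penalty, and the multiplicative exploration rule are assumptions introduced here for illustration; the abstract does not specify the actual loss form, the controller, or the learned parser, which is omitted entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Fact:
    """A key-value state fact with provenance (hypothetical schema)."""
    key: str
    value: str
    source: str  # assumed provenance tag, e.g. a DOM node identifier
    step: int    # step at which the fact was extracted


@dataclass
class Ledger:
    """Structured memory: commitments, remaining budgets, accumulated risk."""
    budgets: dict[str, float]  # remaining budget per cost type
    commitments: dict[str, Fact] = field(default_factory=dict)
    risk: float = 0.0

    def commit(self, fact: Fact) -> None:
        # Record (or overwrite) the commitment stored under this key.
        self.commitments[fact.key] = fact

    def charge(self, costs: dict[str, float]) -> None:
        # Deduct each incurred cost; a budget may go negative (overshoot).
        for name, amount in costs.items():
            self.budgets[name] = self.budgets.get(name, 0.0) - amount

    def add_risk(self, signal: float) -> None:
        self.risk += signal

    def contradicts(self, key: str, value: str) -> bool:
        # True only if a commitment exists for key and holds another value.
        prior = self.commitments.get(key)
        return prior is not None and prior.value != value

    def overshoot(self) -> dict[str, float]:
        # Amount by which each exhausted budget has been exceeded.
        return {k: -v for k, v in self.budgets.items() if v < 0}


def consistency_penalty(ledger: Ledger, proposed: dict[str, str],
                        weight: float = 1.0) -> float:
    """Assumed penalty: weight times the number of contradicted commitments."""
    return weight * sum(ledger.contradicts(k, v) for k, v in proposed.items())


def exploration_rate(ledger: Ledger, initial_budgets: dict[str, float],
                     predicted_risk: float, base_eps: float = 0.2) -> float:
    """Assumed controller: shrink exploration as the tightest budget
    depletes and as predicted risk (clipped to [0, 1]) rises."""
    fractions = [max(ledger.budgets[k], 0.0) / initial_budgets[k]
                 for k in initial_budgets]
    budget_factor = min(fractions) if fractions else 1.0
    risk_factor = 1.0 - min(max(predicted_risk, 0.0), 1.0)
    return base_eps * budget_factor * risk_factor
```

In this sketch, an agent would call `commit` and `charge` after each step, add `consistency_penalty` to its policy loss for the state changes implied by a candidate action, and use `exploration_rate` in place of a fixed exploration constant. The cost-overshoot metric named above corresponds to `overshoot()` evaluated at the end of an episode.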
License
Copyright (c) 2026 Andrew J. Patel, Emily R. Thompson, Benjamin K. Lee (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.