Generalized Hyperbolic Discounting for Delay-Sensitive Reinforcement Learning

Published: 04 Jun 2024, Last Modified: 22 Jul 2024
Venue: Finding the Frame: RLC 2024 Poster
License: CC BY 4.0
Keywords: reward discounting, hyperbolic discounting, Rachlin
TL;DR: We propose a delay-sensitive version of hyperbolic discounting.
Abstract: Value estimates at multiple timescales can help create advanced discounting functions and allow agents to form more effective predictive models of their environment. While exponential discounting has been widely used because of its time-consistent preferences and ease of use, hyperbolic discounting has been shown to capture human and animal preferences more accurately. Both the exponential and hyperbolic reward discounting functions are single-parameter models. However, more sophisticated two-parameter hyperbolic discounting functions have been proposed that provide the best fit to observed human behavior. In this work, we propose a generalized hyperbolic discounting framework that incorporates both a discount factor and a sensitivity-to-delay parameter, so that agents can value the same delay before a reward differently. We conduct extensive evaluations across a variety of learning tasks (high-dimensional input, generalization), analyze the suitability of different discounting functions for these tasks, and present new insights into how the functional form of discounting affects an agent's performance.
Submission Number: 37
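
Illustrative sketch: the abstract contrasts single-parameter exponential and hyperbolic discounting with two-parameter hyperbolic forms. The Python snippet below is a minimal sketch of those three shapes, not the paper's implementation. It assumes the two-parameter form is Rachlin's 1 / (1 + k * t**sigma), as suggested by the keywords; the framework proposed in the paper may be parameterized differently. The function names, the parameter names (gamma, k, sigma), and the default values are hypothetical and chosen only for illustration.

```python
def exponential_discount(t, gamma=0.99):
    """Exponential discounting: weight gamma**t for a reward delayed by t steps."""
    return gamma ** t


def hyperbolic_discount(t, k=0.01):
    """Single-parameter hyperbolic discounting: weight 1 / (1 + k*t)."""
    return 1.0 / (1.0 + k * t)


def rachlin_discount(t, k=0.01, sigma=1.0):
    """Two-parameter (Rachlin-style) hyperbolic discounting: 1 / (1 + k * t**sigma).

    sigma is the assumed sensitivity-to-delay exponent; sigma = 1 recovers
    the single-parameter hyperbolic form above.
    """
    return 1.0 / (1.0 + k * t ** sigma)


if __name__ == "__main__":
    # Print the three discount weights at a few delays for comparison.
    for t in (0, 1, 10, 100):
        print(
            t,
            exponential_discount(t),
            hyperbolic_discount(t),
            rachlin_discount(t, sigma=0.5),
            rachlin_discount(t, sigma=2.0),
        )
```

Under these assumptions, the one-step ratio of exponential weights is constant in t, which is the time-consistency property mentioned in the abstract, while the same ratio for the hyperbolic form grows with t. For delays longer than one step, a sigma above 1 discounts more steeply than the plain hyperbolic form and a sigma below 1 discounts less steeply.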