Abstract: Real-world sequential decision-making often involves parameterized action spaces that require both discrete action choices and decisions about the continuous parameters governing how each action is executed. Existing approaches exhibit severe limitations in this setting: planning methods demand hand-crafted action models; standard reinforcement learning (RL) algorithms are designed for either discrete or continuous actions but not both; and the few RL methods that handle parameterized actions typically rely on domain-specific engineering and fail to exploit the latent structure of these spaces. This paper extends the scope of RL algorithms to long-horizon, sparse-reward settings with parameterized actions by enabling agents to autonomously learn both state and action abstractions online. We introduce algorithms that progressively refine these abstractions during learning, adding fine-grained detail in the critical regions of the state–action space where greater resolution improves performance. Across several continuous-state, parameterized-action domains, our abstraction-driven approach enables TD(λ) to achieve markedly higher sample efficiency than state-of-the-art baselines.