Automated Reward Design for Gran Turismo

Michel Ma; Takuma Seno; Kaushik Subramanian; Peter R. Wurman; Peter Stone; Craig Sherstan

Published: 23 Sept 2025, Last Modified: 22 Nov 2025, LAW, CC BY 4.0

Keywords: reinforcement learning, LLM, VLM, Eureka, Reward design, reward shaping, autonomous driving, autonomous racing, Gran Turismo

TL;DR: This paper explores a novel LLM-based automated reward design system for achieving superhuman performance in Gran Turismo 7.

Abstract: When designing reinforcement learning (RL) agents, a designer communicates the desired agent behavior through the definition of reward functions: numerical feedback given to the agent as reward or punishment for its actions. However, mapping desired behaviors to reward functions can be difficult, especially in complex environments such as autonomous racing. In this paper, we demonstrate how current foundation models can effectively search over a space of reward functions to produce desirable RL agents for the Gran Turismo 7 racing game, given only text-based instructions. We further show how an LLM-based approach can be used to build an interactive system that iteratively adapts the agent’s behavior to match the designer’s wishes. Through a combination of LLM-based reward generation, VLM preference-based evaluation, and human feedback, our system produces racing agents competitive with GT Sophy, a champion-level RL racing agent, and generates novel behaviors, paving the way for practical automated reward design in real-world applications.
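
To make the search described in the abstract concrete, the following is a minimal toy sketch of an iterative reward-search loop. It is not the authors' implementation. Every name in it is hypothetical: `propose` stands in for the LLM reward generator, `train_and_rollout` stands in for RL training in Gran Turismo 7 (here replaced by a closed-form toy), `prefer` stands in for the VLM preference judge, and the reward terms `progress` and `off_track` are invented for illustration. The human-feedback step is omitted.

```python
from typing import Dict, List

Weights = Dict[str, float]   # reward function = weighted sum of named terms (assumed form)
Rollout = Dict[str, float]   # summary of the trained agent's behaviour


def propose(parent: Weights, round_idx: int) -> List[Weights]:
    """Hypothetical stand-in for LLM reward generation: perturb one weight at a time."""
    step = 0.5 / (round_idx + 1)  # smaller edits in later rounds
    children = []
    for key in parent:
        for sign in (1.0, -1.0):
            child = dict(parent)
            child[key] = parent[key] + sign * step
            children.append(child)
    return children


def train_and_rollout(weights: Weights) -> Rollout:
    """Hypothetical stand-in for RL training: a toy formula, not a simulator."""
    lap_time = (
        90.0
        + (weights["progress"] - 2.0) ** 2
        + (weights["off_track"] + 1.0) ** 2
    )
    return {"lap_time": lap_time}


def prefer(a: Rollout, b: Rollout) -> bool:
    """Hypothetical stand-in for the VLM preference judge: True if a beats b."""
    return a["lap_time"] < b["lap_time"]


def search(initial: Weights, rounds: int = 5) -> Weights:
    """Keep the candidate whose rollout the judge prefers, round after round."""
    best = initial
    best_rollout = train_and_rollout(best)
    for round_idx in range(rounds):
        for candidate in propose(best, round_idx):
            rollout = train_and_rollout(candidate)
            if prefer(rollout, best_rollout):
                best, best_rollout = candidate, rollout
    return best


if __name__ == "__main__":
    start = {"progress": 1.0, "off_track": 0.0}
    print(search(start))
```

In this sketch the judge only ever compares two rollouts, so it never needs an absolute score; that mirrors the preference-based evaluation named in the abstract, though the real system's candidate representation and selection rule may differ.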

Submission Type: Research Paper (4-9 Pages)

Submission Number: 65
