Reinforcement Learning from Human Feedback for Lane Changing of Autonomous Vehicles in Mixed Traffic
Abstract: The burgeoning field of autonomous driving necessitates the seamless integration of autonomous vehicles (AVs) with human-driven vehicles, calling for more predictable AV behavior and enhanced interaction with human drivers. Human-like driving, particularly during lane-changing maneuvers on highways, is a critical area of research due to its significant impact on safety and traffic flow. Traditional rule-based decision-making approaches often fail to encapsulate the nuanced boundaries of human behavior in diverse driving scenarios, while crafting reward functions for learning-based methods introduces its own set of complexities. This study investigates the application of Reinforcement Learning from Human Feedback (RLHF) to emulate human-like lane-changing decisions in AVs. An initial RL policy is pre-trained to ensure safe lane changes. Subsequently, this policy is employed to gather data, which is then annotated by humans to train a reward model that discerns lane changes aligning with human preferences. This human-informed reward model supersedes the original, guiding the refinement of the policy to reflect human-like preferences. The effectiveness of RLHF in producing human-like lane changes is demonstrated through the development and evaluation of conservative and aggressive lane-changing models within obstacle-rich environments and mixed autonomy traffic scenarios. The experimental outcomes underscore the potential of RLHF to diversify lane-changing behaviors in AVs, suggesting its viability for enhancing the integration of AVs into the fabric of human-driven traffic.
Loading