Abstract: Customizing autonomous vehicles to align with user preferences while ensuring safety may significantly impact their adoption. Collecting user preference data by asking a large number of comparison questions can be demanding. In this work, we use active learning along with temporal logic descriptions of constraints to enable safe learning of preferences with a reduced number of questions. We take a Bayesian inference approach combined with Weighted Signal Temporal Logic (WSTL), resulting in a WSTL formula that can rank signals according to user preferences and be used for correct-and-custom-by-construction control synthesis. Our method is practical for formulas and signals of varying complexity, since we compute STL-related values offline. We provide an upper bound on the number of queries whose answers disagree with the user's. We demonstrate the performance of our method both on synthetic data and through human subject experiments in an immersive driving simulator. We consider two driving scenarios, one involving a vehicle approaching a pedestrian crossing and the other an overtake maneuver. Results on synthetic experiments with ground-truth weight valuations show that our query selection algorithm converges faster than random query selection. Human subject study results show an average agreement with user answers of 94% during training and 79% during validation (increasing to 86% when restricted to high-confidence results).