Keywords: Eye-tracking, LLM, DPO
Abstract: Large language models often require alignment with explicit human preference data, which is sparse and costly to collect. We propose a framework that leverages eye-tracking data as an implicit feedback signal for tuning LLMs toward controlled sentiment generation via Direct Preference Optimization. Our study demonstrates that eye-tracking feedback can serve as a valuable tuning signal, motivating future work on its impact across tasks and highlighting the potential of integrating gaze data with LLMs to improve performance and alignment with human preferences.
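To make the optimization objective concrete: DPO fine-tunes the policy directly on preference pairs, without a separate reward model. Below is a minimal sketch of the per-pair DPO loss; the function name and the assumption that eye-tracking signals have already been converted into chosen/rejected response pairs are illustrative, not taken from the paper.

```python
import math

def dpo_loss(logp_policy_chosen, logp_ref_chosen,
             logp_policy_rejected, logp_ref_rejected, beta=0.1):
    """DPO loss for one preference pair (e.g. derived from gaze signals).

    Each argument is the summed token log-probability of a response under
    the trainable policy or the frozen reference model.
    """
    # Implicit reward of each response, measured relative to the reference.
    chosen_margin = logp_policy_chosen - logp_ref_chosen
    rejected_margin = logp_policy_rejected - logp_ref_rejected
    # Negative log-sigmoid of the scaled margin difference: the loss shrinks
    # as the policy favors the chosen response more than the reference does.
    diff = beta * (chosen_margin - rejected_margin)
    return -math.log(1.0 / (1.0 + math.exp(-diff)))
```

Raising the chosen response's policy log-probability (all else fixed) lowers the loss, which is the gradient signal that pulls generations toward the preferred sentiment.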
Submission Number: 61