Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering

ACL ARR 2024 June Submission 4706 Authors

16 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Large language models (LLMs) have demonstrated remarkable performance across various real-world tasks. However, recent studies reveal that LLMs often struggle to fully comprehend and effectively utilize their input contexts, resulting in responses that lack faithfulness or suffer from hallucination. This difficulty becomes particularly evident when the contexts are lengthy or contain distracting information, which can divert LLMs from fully capturing essential evidence. Most prior work focuses on designing effective prompts to guide LLMs in utilizing contextual information more faithfully. For instance, iterative prompting highlights key information through two high-level prompting steps that first ask the LLM to identify important pieces of context and then derive answers accordingly. However, prompting methods are constrained to highlighting key information implicitly in token space, which is often insufficient to fully steer the model's attention. To improve model faithfulness more reliably, we propose AutoPASTA, a method that automatically identifies contextual key information and explicitly highlights it by steering the model's attention scores. Similar to prompting, AutoPASTA is applied at inference time and does not require changing any model parameters. Our experiments on open-book QA demonstrate that AutoPASTA can effectively guide models to grasp essential contextual information, leading to substantially improved model faithfulness and performance, e.g., an average improvement of 11.26% for LLAMA3-8B-Instruct. Code will be publicly available.
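The abstract describes steering attention scores at inference time toward automatically identified key tokens, without updating model parameters. A minimal illustrative sketch of that idea is below: attention to non-highlighted positions is scaled down after the softmax and the weights are renormalized, so highlighted evidence receives a larger share of attention. The function names and the scaling coefficient `alpha` are assumptions for illustration, not the paper's actual implementation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def steer_attention(scores, highlight, alpha=0.1):
    """Sketch of post-softmax attention steering (illustrative only).

    scores: raw attention scores for one query over key positions.
    highlight: set of key indices flagged as essential evidence.
    alpha: down-scaling factor for non-highlighted positions (assumed).
    Returns renormalized attention weights favoring highlighted tokens.
    """
    weights = softmax(scores)
    # Scale down attention to non-highlighted tokens...
    steered = [w if i in highlight else alpha * w
               for i, w in enumerate(weights)]
    # ...then renormalize so the weights again sum to 1.
    total = sum(steered)
    return [w / total for w in steered]

scores = [1.0, 0.5, 2.0, 0.2]
plain = softmax(scores)
steered = steer_attention(scores, highlight={1})
```

With this toy input, the weight on the highlighted position (index 1) rises relative to the plain softmax, while the weights still sum to one; in the actual method this reweighting would be applied inside selected attention heads of the model.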
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: inference methods, reading comprehension
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 4706