Abstract: In Large Language Models (LLMs), text generation that involves knowledge representation is often fraught with the risk of "hallucinations", where models confidently produce erroneous or fabricated content. These inaccuracies often stem from intrinsic biases in the pre-training stage or from the incorporation of human preference biases during the fine-tuning process. To mitigate these issues, we take inspiration from Goldman's causal theory of knowledge, which asserts that knowledge is not merely a true belief but also requires a causal connection between the belief and the truth of the proposition. We instantiate this theory in the context of Knowledge Question Answering (KQA) by constructing a causal graph that delineates the pathways between candidate knowledge and belief. By applying the do-calculus rules of structural causal models to this graph, we devise an unbiased estimation framework, thereby establishing a methodology for knowledge modeling grounded in causal inference. The resulting CORE framework (short for "Causal knOwledge REasoning") comprises four essential components: question answering, causal reasoning, belief scoring, and refinement. Together, they improve the KQA system by fostering faithful reasoning and introspection. Extensive experiments on the ScienceQA and HotpotQA datasets demonstrate the effectiveness and rationality of the CORE framework.
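To make the four-stage pipeline named in the abstract concrete, here is a minimal, hypothetical sketch of how such a loop could be wired together. All function names, the `Candidate` structure, the threshold, and the stubbed bodies are assumptions for illustration only; the paper's actual prompts, causal graph, and do-calculus-based estimator are not reproduced here.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a CORE-style KQA loop with the four components named
# in the abstract: question answering, causal reasoning, belief scoring, and
# refinement. Every body below is a stand-in, not the authors' implementation.

@dataclass
class Candidate:
    answer: str
    knowledge: list = field(default_factory=list)  # evidence the answer relies on
    rationale: str = ""                            # causal chain from knowledge to answer
    belief: float = 0.0                            # estimated support of the answer by the knowledge


def answer_question(question: str) -> Candidate:
    """Stage 1: draft an answer and the knowledge it relies on (stubbed)."""
    return Candidate(answer="<draft answer>", knowledge=["<supporting fact>"])


def causal_reasoning(question: str, cand: Candidate) -> Candidate:
    """Stage 2: make the knowledge-to-answer pathway explicit (stubbed)."""
    cand.rationale = f"{cand.knowledge} -> {cand.answer}"
    return cand


def belief_score(question: str, cand: Candidate) -> float:
    """Stage 3: score how strongly the stated knowledge supports the answer (stubbed)."""
    return 0.5  # placeholder; the paper derives this from its causal estimator


def refine(question: str, cand: Candidate) -> Candidate:
    """Stage 4: revise the answer when the belief score is too low (stubbed)."""
    return answer_question(question + " (reconsider weakly supported claims)")


def core_style_loop(question: str, threshold: float = 0.7, max_rounds: int = 3) -> Candidate:
    """Iterate reasoning, scoring, and refinement until belief clears the threshold."""
    cand = answer_question(question)
    for _ in range(max_rounds):
        cand = causal_reasoning(question, cand)
        cand.belief = belief_score(question, cand)
        if cand.belief >= threshold:
            break
        cand = refine(question, cand)
    return cand


if __name__ == "__main__":
    print(core_style_loop("Why does the sky appear blue?"))
```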
Primary Subject Area: [Content] Media Interpretation
Secondary Subject Area: [Content] Vision and Language
Relevance To Conference: Our work focuses on reasoning and introspection in LLMs to enhance Knowledge Question Answering (KQA) tasks across various media forms (e.g., text and images). The proposed Causal knOwledge REasoning (CORE) framework thereby strengthens the model's ability to handle multimodal inputs in KQA tasks, improving the faithfulness and reliability of the generated content. This advancement represents a significant step toward more sophisticated, introspective AI systems capable of navigating the complexities of multimedia/multimodal processing.
Supplementary Material: zip
Submission Number: 3150