Context-Aware Chatbot Extension Leveraging HTML Data and Retrieval-Augmented Generation (RAG)

28 Oct 2024 (modified: 05 Nov 2024) | THU 2024 Fall AML Submission | CC BY 4.0
Keywords: Context-Aware, Chatbot Extension, HTML Data, Retrieval-Augmented Generation
TL;DR: Browser-based chatbot extension using HTML data and RAG, powered by a fine-tuned LLaMA 3 model, addresses context-awareness in chatbots for better web information retrieval.
Abstract: In the digital age, efficient information retrieval has become increasingly crucial. With the vast amount of information available on the web, users often want quick answers or additional information related to the page they are currently viewing. However, switching between multiple pages or looking up external resources is disruptive and time-consuming, which degrades the user experience. Traditional chatbots, while useful to some extent, often cannot adapt to context. Specifically, they struggle to base responses on a user’s past and present interactions with a website. This shortcoming has created demand for more intelligent information retrieval systems. To address these issues, this project aims to develop a browser-based chatbot extension. The chatbot is designed to retrieve and synthesize information from two main sources: previously parsed HTML data and traditional Retrieval-Augmented Generation (RAG) sources. By leveraging both sources, the chatbot aims to provide accurate, context-sensitive responses. It will be powered by a fine-tuned LLaMA 3 model capable of handling HTML context, which is expected to enhance the user experience by providing contextual insights and enabling more efficient information retrieval. In essence, this project bridges the gap between the user’s need for quick, relevant information and the limitations of existing chatbot technologies by integrating stored HTML context and external data sources, offering users streamlined access to relevant information through a browser extension.
This development is important for improving the user experience in day-to-day web browsing, and it also has the potential to influence the fields of information retrieval and natural language processing, setting new standards for context-aware chatbot systems.
Submission Number: 31
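
The two-source retrieval described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the retriever, the storage format, or the prompt layout, so the bag-of-words cosine scoring and every name here (`TextExtractor`, `html_to_passages`, `retrieve`, `build_prompt`) are hypothetical. The call to the fine-tuned LLaMA 3 model is omitted; the sketch stops at the prompt that would be passed to it.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping script and style bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_passages(html):
    """Parse a stored page into a list of text passages (one per text node)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return up to k passages that share terms with the query, best first.

    Assumption: a toy lexical retriever stands in for whatever retriever
    the project actually uses.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    scored = [(s, p) for s, p in scored if s > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query, page_html, external_docs, k=2):
    """Combine page-derived and external context into one prompt string."""
    page_hits = retrieve(query, html_to_passages(page_html), k)
    external_hits = retrieve(query, external_docs, k)
    lines = ["Answer using the context below.", "[Page context]"]
    lines.extend(page_hits)
    lines.append("[External context]")
    lines.extend(external_hits)
    lines.append(f"Question: {query}")
    return "\n".join(lines)
```

Keeping the page context and the external context in separately labelled sections is one possible design choice; it lets the model distinguish what the user is currently viewing from what was retrieved elsewhere.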