Personalized Language Generation via Bayesian Metric Augmented Retrieval

24 Sept 2023 (modified: 11 Feb 2024)Submitted to ICLR 2024EveryoneRevisionsBibTeX
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Retrieval Augmented Generation, Bayesian Metric Learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Our paper presents a Bayesian adaptation of Retrieval Augmented Generation (RAG) designed to capture the characteristics of each user, encompassing factors such as their educational background and professions. We model each individual's characteristics using specific perturbations of the local metric of the embedding space. This perturbation introduces a crucial shift in the distance evaluation between the query's and the document's embedding, leading to different pertinent rankings of the retrieved documents. We propose a Bayesian learning procedure that assimilates user feedback and continuously enhances our estimation of the user-specific metric. In the beginning, when there is no information about the user, we use a diverse retrieval method for generation. After this burn-in phase, we learn a Bayesian posterior estimate of the metric, and inject this metric into the nearest neighbor search for document retrieval. This additional layer of metric information acquisition leads to empirical improvement in the retrieval quality and in the performance of the generated text on multiple concept explanation tasks.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9327
Loading