Injecting knowledge into language generation: a case study in auto-charting after-visit care instructions from medical dialogue

22 Sept 2022 (modified: 14 Oct 2024), ICLR 2023 Conference Withdrawn Submission
Keywords: neural language generation, sequence-to-sequence, knowledge injection, medical dialogue data, care plan generation, EHR auto-charting
TL;DR: We propose an approach for injecting domain knowledge into neural autoregressive language models using marginal probability regularization during training and apply it to the care plan generation task.
Abstract: Factual correctness is often the limiting factor in practical applications of natural language generation in high-stakes domains such as healthcare. An essential requirement for maintaining factuality is the ability to handle rare tokens. This paper focuses on rare tokens that appear in both the source and reference sequences, and which, when missed during generation, can hamper the factual correctness of the generated text. Starting from our fundamental premise that high-stakes domains are also knowledge-rich, we show how to use knowledge to (a) identify which rare tokens appearing in both source and reference are important and (b) uplift their conditional probability. We introduce the "utilization rate," which encodes knowledge and serves as a regularizer by maximizing the marginal probability of selected tokens. We present a study in the knowledge-rich domain of healthcare, where we tackle the problem of generating after-visit care instructions based on patient-doctor dialogues. We verify that, in our dataset, specific medical concepts with high utilization rates are underestimated by conventionally trained sequence-to-sequence models. We observe that correcting this with our approach to knowledge injection reduces model uncertainty and improves factuality and coherence without negatively impacting fluency.
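To make the abstract's regularizer concrete, below is a minimal, hypothetical PyTorch sketch of a training loss that augments token-level cross-entropy with a term uplifting the marginal probability of selected high-utilization-rate tokens. The function name `knowledge_regularized_loss`, the clamp-at-one target, and the weight `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def knowledge_regularized_loss(logits, targets, selected_ids, pad_id, lam=0.1):
    """Cross-entropy plus a marginal-probability regularizer (illustrative).

    logits:       (batch, seq_len, vocab) decoder outputs
    targets:      (batch, seq_len) reference token ids
    selected_ids: (batch, k) ids of high-utilization-rate tokens present in
                  both source and reference, padded with pad_id
    """
    vocab = logits.size(-1)
    ce = F.cross_entropy(
        logits.reshape(-1, vocab), targets.reshape(-1), ignore_index=pad_id
    )

    # Marginal probability mass each vocabulary item receives across the
    # output sequence (the expected emission count under the model).
    marginal = logits.softmax(dim=-1).sum(dim=1)   # (batch, vocab)
    sel = marginal.gather(1, selected_ids)         # (batch, k)
    mask = (selected_ids != pad_id).float()

    # Penalize selected tokens whose total mass falls below one expected
    # emission; tokens already at or above that threshold contribute zero.
    reg = -(sel.clamp(max=1.0).clamp(min=1e-8).log() * mask).sum()
    reg = reg / mask.sum().clamp(min=1.0)

    return ce + lam * reg
```

In such a setup, `lam` would trade off fluency against recall of the selected medical concepts, matching the abstract's claim that the correction improves factuality without hurting fluency.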
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (e.g., speech processing, computer vision, NLP)
Supplementary Material: zip
Community Implementations: 1 code implementation (https://www.catalyzex.com/paper/injecting-knowledge-into-language-generation/code)