Improving Precision in Language Models Learning from Invalid Samples

Published: 27 Oct 2023, Last Modified: 22 Nov 2023, GenBio@NeurIPS 2023 Poster
Keywords: domain-specialized-language-models; refinement; invalid2valid
TL;DR: leverage invalid samples to improve language model precision
Abstract: Language models are powerful generative tools capable of learning intricate patterns from vast amounts of unstructured data. In domains that demand precision, however, such as science and engineering, the primary objective is an exact and accurate answer. In specialized tasks like chemical compound generation, output accuracy matters more than response diversity. Traditional self-refinement methods, while effective for general language tasks, fall short on such domain-specific input/output pairs. In this study, we introduce invalid2valid, a powerful and general post-processing mechanism that significantly enhances the precision of language models across input/output tasks in different domains and specialized applications.
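To make the idea of an invalid-to-valid post-processing step concrete, here is a minimal, hypothetical sketch. The abstract does not specify the mechanism's internals, so the functions `is_valid`, `refine`, and `postprocess` below, along with the toy parenthesis-balancing check on SMILES-like strings, are illustrative assumptions, not the paper's actual method.

```python
def is_valid(smiles: str) -> bool:
    """Toy validity check: parentheses in a SMILES-like string must balance.
    (A real validator would use a chemistry toolkit; this is a stand-in.)"""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

def refine(smiles: str) -> str:
    """Toy repair step: drop unmatched ')' and append missing ')' at the end."""
    out, depth = [], 0
    for ch in smiles:
        if ch == "(":
            depth += 1
            out.append(ch)
        elif ch == ")":
            if depth > 0:  # keep only closers that match an opener
                depth -= 1
                out.append(ch)
        else:
            out.append(ch)
    out.extend(")" * depth)  # close any still-open groups
    return "".join(out)

def postprocess(candidates):
    """Keep valid samples; attempt to repair invalid ones instead of discarding them."""
    results = []
    for s in candidates:
        if not is_valid(s):
            s = refine(s)  # invalid -> (hopefully) valid
        if is_valid(s):
            results.append(s)
    return results
```

The design point the sketch illustrates is that invalid generations are treated as recoverable signal rather than waste: `postprocess(["CC(C", "c1ccccc1)", "CCO"])` repairs the two malformed strings instead of filtering them out, which is how a post-processing layer can raise precision without retraining the underlying model.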
Submission Number: 3