Abstract: We propose a novel application of pre-trained language models (PLMs) to analogy generation and study how to design effective prompts for two tasks: generating a source concept analogous to a given target concept, and generating an explanation of the similarity between a given pair of target and source concepts. We found that it is feasible to prompt a GPT-3 PLM to generate meaningful analogies, and that the best prompts tend to be precise imperative statements, especially with a low temperature setting. We systematically analyzed the sensitivity of the GPT-3 model to prompt design and temperature and found that the model is particularly sensitive to certain variations (e.g., questions vs. imperative statements). We also investigated whether existing reference-based metrics designed for evaluating natural language generation (NLG) are suitable for evaluating analogy generation and found that the recent BLEURT score is better than the others. We further propose a promising consensus measure based on diverse prompts and settings, which can potentially be used both to automatically evaluate the generated analogies in the absence of reference text (e.g., in novel domains) and to rank a set of generated analogies in order to select analogies with different characteristics. Overall, our study shows that PLMs offer a promising new way to generate analogies in unrestricted domains, overcoming the limitation of existing analogy generation methods, which require structured representations.
Paper Type: long