Hybrid-SET: Few-Shot Example Selection Combining Sentence Similarity and Set Coverage - A Case Study on Material Science Domain
Abstract: Named entity recognition (NER) in materials science and chemistry presents significant challenges, including the limited availability of annotated training data and the need for domain expertise during annotation. Large language models (LLMs) can perform various tasks from only a few labeled examples, a technique known as in-context learning (ICL). However, ICL performance is highly sensitive to the examples provided, which makes an effective selection strategy important. This paper introduces Hybrid-SET, a novel example selection approach that combines sentence representation similarity with set-level coverage. Experimental results indicate that Hybrid-SET surpasses both conventional supervised methods and existing selection methods. Notably, it yields a significant improvement in recognizing domain-specific entities.