DILUIE: Constructing diverse demonstrations of in-context learning with large language models for unified information extraction
Abstract: Large language models (LLMs) have demonstrated promising in-context learning capabilities, especially with instructive prompts. However, recent studies have shown that existing large models still face challenges on specific information extraction (IE) tasks. Moreover, prior work has not fully exploited prompting resources such as instruction tuning, diverse in-context demonstrations, and long-range token sequences to help language models understand context. In this study, we propose DILUIE, a unified information extraction framework based on in-context learning with diverse demonstration examples. DILUIE is encoded with an EVA attention mechanism and incremental encoding technology. Building on the constructed diverse demonstrations, we efficiently scale the number of instances in both instruction tuning and in-context learning to gain insights into the potential benefits of utilizing diverse information extraction datasets. To deepen the understanding of context, we further design three auxiliary tasks that help align contextual semantics. Experimental results demonstrate that DILUIE achieves average improvements of 2.23% and 2.53% in Micro- and Macro-F1, respectively, over the current state-of-the-art baseline, and it also significantly outperforms GPT-3.5-turbo in zero-shot settings; the average token length at which the best performance is achieved across tasks is around 15k. Furthermore, we observe that in-context learning performs better when provided with more demonstrations during multiple-shot instruction tuning (8k), and that increasing the instruction length (10k) yields a more substantial improvement in the scaling upper limits of in-context learning. Code is available at https://github.com/Phevos75/DILUIE.
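Since the abstract centers on assembling diverse demonstrations into an in-context learning prompt for unified IE, the following is a minimal Python sketch of one plausible way to do that; it is not the authors' implementation, and all names here (`Demo`, `build_prompt`, the task labels, and the prompt template) are hypothetical illustrations.

```python
# Minimal sketch (not DILUIE's actual code): sample demonstrations that
# span distinct IE tasks so the in-context prompt stays diverse, then
# append the query instance for the model to complete.
import random
from dataclasses import dataclass

@dataclass
class Demo:
    task: str         # e.g. "NER", "RE", "EE" (hypothetical labels)
    instruction: str  # natural-language task instruction
    text: str         # input passage
    answer: str       # linearized extraction result

def build_prompt(pool: list[Demo], query_text: str,
                 instruction: str, k: int = 4, seed: int = 0) -> str:
    """Pick k demonstrations round-robin over tasks, so no single IE
    task dominates the context, then append the query instance."""
    rng = random.Random(seed)
    by_task: dict[str, list[Demo]] = {}
    for d in pool:
        by_task.setdefault(d.task, []).append(d)
    picked: list[Demo] = []
    tasks = list(by_task)
    while len(picked) < k and tasks:
        for t in list(tasks):
            if by_task[t]:
                # Remove a random demo of task t so it is not reused.
                picked.append(by_task[t].pop(rng.randrange(len(by_task[t]))))
            else:
                tasks.remove(t)
            if len(picked) == k:
                break
    blocks = [f"Instruction: {d.instruction}\nInput: {d.text}\nOutput: {d.answer}"
              for d in picked]
    blocks.append(f"Instruction: {instruction}\nInput: {query_text}\nOutput:")
    return "\n\n".join(blocks)
```

A prompt built this way would be fed to the LLM as-is; the round-robin sampling is one simple proxy for the "diversity" the abstract describes, and could be replaced by similarity- or coverage-based selection.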