Style-Unified Meta-In-Context Learning: Improving In-Context Learning Ability by Learning to Unify Output Styles
Abstract: This paper proposes style-unified meta-in-context learning, a method that enhances the In-Context Learning (ICL) ability of language models by learning to unify output styles. Meta-training for ICL (MetaICL) has been proposed to strengthen a model's ability to follow a few in-context examples. However, language models trained with MetaICL may not actually use the information in the in-context examples at inference time, since their performance has been reported to be unaffected when the outputs of the in-context examples are replaced with random or flipped ones. Our key idea is to make the relationship between the in-context outputs and the target output explicit by unifying the output style: an arbitrary symbol (e.g., an integer or a word) is inserted into each output in the context, and the model is expected to attend to the examples by learning to output the same symbol at the same position. To evaluate the proposed method, we create a Japanese dataset containing multiple examples per task. Experiments with a 0.6B-parameter Japanese language model demonstrate that the proposed method outperforms the conventional method.
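A minimal sketch of the prompt construction the abstract describes, assuming the symbol is prepended to each in-context output; the function name `build_style_unified_prompt`, the symbol pool, and the insertion position are illustrative guesses, not the paper's exact recipe.

```python
# Hypothetical sketch of the style-unification step: one arbitrary symbol
# is inserted at the same position in every in-context output, so that the
# model can be trained to emit the same symbol at that position in the
# target output, tying the target to the examples in context.
import random

def build_style_unified_prompt(examples, query, symbols=("7", "alpha", "##")):
    """examples: list of (input, output) pairs; query: the target input.
    Assumption: the symbol is prepended to each output; the paper only
    says symbols are inserted into the in-context outputs."""
    symbol = random.choice(symbols)  # one arbitrary symbol per prompt
    lines = []
    for x, y in examples:
        # Insert the shared symbol into each in-context output.
        lines.append(f"Input: {x}\nOutput: {symbol} {y}")
    # The model is expected to continue with the same symbol, then the answer.
    lines.append(f"Input: {query}\nOutput: {symbol}")
    return "\n\n".join(lines)

prompt = build_style_unified_prompt(
    [("good movie", "positive"), ("boring plot", "negative")],
    "great acting",
)
print(prompt)
```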
Paper Type: short
Research Area: Generation
Languages Studied: Japanese