Generative compositor for few-shot visual information extraction

Published: 01 Jan 2025, Last Modified: 26 Sept 2025 · Pattern Recognit. 2025 · CC BY-SA 4.0
Abstract: Highlights
• A novel VIE method, called Generative Compositor, leverages layout and prompt priors.
• Three pre-training tasks to improve the model’s spatial contextual capabilities.
• A prompt-aware resampler for distilling and merging the multi-modal embeddings.
• Significant improvements in few-shot settings.