CollageNoter: Real-Time and Adaptive Collage Layout Design for Screenshot-Based E-Note-Taking

Published: 2025, Last Modified: 15 Jan 2026AAAI 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: To enhance the processing of complex multi-modal documents (e.g. e-books, long web pages, etc.), it is an efficient way for users to take digital screenshots of key parts and reorganize them into a new collage E-Note. Existing methods for assisting collage layout design primarily employ a semantic relevance-first strategy, with arranging related contents together. Though capable, it can not ensure the visual readability of screenshots and may conflict with human natural reading patterns. In this paper, we introduce CollageNoter for real-time collage layout design that adapts to various devices (e.g. laptop, tablet, phone, etc.), offering users with visually and cognitively well-organized screenshot-based E-Notes. Specifically, we construct a novel two-stage pipeline for collage design, including 1) readability-first layout generation and 2) cognitive-driven layout adjustment. In addition, to achieve real-time response and adaptive model training, we propose a cascade transformer-based layout generator named CollageFormer and a size-aware collage layout builder for automatic dataset construction. Extensive experimental results have confirmed the effectiveness of our CollageNoter.
Loading