Abstract: Generating slides from documents is a complex and challenging task that requires balancing content quality, structural coherence, and visual design.
Existing methods primarily focus on directly converting raw text into slides, prioritizing content quality while often neglecting critical aspects such as structural coherence and visual design.
Inspired by human presentation creation processes, we propose PPTAgent, a novel framework that redefines presentation generation as an edit-based process using reference presentations, enabling LLMs to create well-rounded presentations. PPTAgent comprises two stages:
(1) Presentation Analysis, which enhances LLMs' comprehension of the structure and content schemas by analyzing reference presentations.
(2) Presentation Generation, which generates a detailed outline for the document and assigns specific document sections and reference slides to each slide.
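As a rough illustration of the outline step in stage (2), the sketch below pairs each document section with a reference slide. It is a minimal toy under stated assumptions, not PPTAgent's implementation: the names `OutlineEntry` and `build_outline` are hypothetical, and the round-robin assignment is only a placeholder for the LLM-driven selection the abstract describes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OutlineEntry:
    """One planned slide (hypothetical structure, for illustration only)."""
    title: str
    section_ids: List[int]   # indices of the document sections feeding this slide
    reference_slide: int     # index of the reference slide to edit from


def build_outline(sections: List[Dict[str, str]],
                  reference_slides: List[str]) -> List[OutlineEntry]:
    """Assign one document section and one reference slide to each new slide.

    Assumption: a real system would ask an LLM to choose both; here the
    reference slides are simply cycled through in order.
    """
    if not reference_slides:
        raise ValueError("at least one reference slide is required")
    outline = []
    for i, section in enumerate(sections):
        outline.append(
            OutlineEntry(
                title=section["heading"],
                section_ids=[i],
                reference_slide=i % len(reference_slides),
            )
        )
    return outline


if __name__ == "__main__":
    doc = [{"heading": "Intro"}, {"heading": "Method"}, {"heading": "Results"}]
    refs = ["title_layout", "bullet_layout"]  # placeholder layout names
    for entry in build_outline(doc, refs):
        print(entry)
```

Each resulting entry records which sections and which reference slide a new slide draws on, which is the information an edit-based generator would need before modifying the reference slide.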
To evaluate the quality of generated presentations, we introduce PPTEval, a comprehensive evaluation framework that assesses presentations across three dimensions: content, design, and coherence.
Experimental results demonstrate that PPTAgent significantly outperforms conventional presentation generation methods across all three key dimensions.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: multimodal applications, code generation and understanding, evaluation methodologies, NLP datasets
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 573