Keywords: Long-form Generation, Structural Planning, Controllable Generation
Abstract: Generating coherent and controllable long-form content remains a persistent challenge for Large Language Models (LLMs).
While reasoning-enhanced models have demonstrated success in logic-intensive domains, our evaluation reveals that they suffer from a severe length collapse in open-ended writing, where performance degrades sharply as target lengths exceed 2,000 words.
We attribute this failure to the limitation of static hierarchical planning, which struggles to provide dynamic guidance over extended contexts.
To bridge this gap, we introduce the **Interleaved Structural Chain-of-Thought ($\texttt{IS-CoT}$)** framework.
Unlike external agentic workflows, $\texttt{IS-CoT}$ embeds a dynamic $\texttt{Plan-Write-Reflect}$ cycle directly into the generation process, enabling continuous strategy adaptation and global alignment without external orchestration.
Based on this framework, we construct a high-quality dataset of interleaved reasoning traces via a multi-teacher pipeline and train **IS-Writer-8B**. Experiments demonstrate that IS-Writer-8B achieves state-of-the-art performance on challenging long-form benchmarks (e.g., +3.08 vs. DeepSeek-V3.2 on LongBench-Write), exhibiting robust length compliance and coherence competitive with significantly larger proprietary models.
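The interleaved $\texttt{Plan-Write-Reflect}$ cycle mentioned in the abstract can be sketched as a simple control loop. The sketch below is purely illustrative and not the paper's implementation: the function name `plan_write_reflect`, the prompt strings, the `DONE` stopping convention, and the toy `toy_llm` stand-in are all assumptions introduced here for clarity.

```python
# A minimal, hypothetical sketch of an interleaved Plan-Write-Reflect
# cycle. `llm` is a stand-in for any text-generation callable; the
# prompts and the stopping rule are illustrative, not the paper's.

def plan_write_reflect(llm, prompt, target_words, max_cycles=8):
    plan = llm(f"PLAN for: {prompt}")
    written = []
    for _ in range(max_cycles):
        # Write: generate the next segment under the current plan.
        segment = llm(f"WRITE next segment. Plan: {plan}. "
                      f"Draft so far: {' '.join(written)}")
        written.append(segment)
        # Reflect: check alignment with the plan and the length budget.
        verdict = llm(f"REFLECT on draft vs. plan "
                      f"(target {target_words} words).")
        if verdict.startswith("DONE"):
            break
        # Adapt: revise the plan before the next writing step.
        plan = llm(f"REVISE plan given: {verdict}")
    return " ".join(written)


# Toy stand-in model for demonstration: signals DONE after the
# second reflection, so the loop writes exactly two segments.
def toy_llm(prompt, _state={"reflect_calls": 0}):
    if prompt.startswith("REFLECT"):
        _state["reflect_calls"] += 1
        return "DONE" if _state["reflect_calls"] >= 2 else "CONTINUE"
    if prompt.startswith("WRITE"):
        return "segment-text"
    return "plan-text"

draft = plan_write_reflect(toy_llm, "essay prompt", target_words=2000)
```

The key design point the abstract emphasizes is that planning is not a one-shot preamble: the plan is revisited after every written segment, which is what distinguishes this loop from static hierarchical planning.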
Paper Type: Long
Research Area: Natural Language Generation
Research Area Keywords: Generation, Language Modeling
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources
Languages Studied: English, Chinese
Submission Number: 7264