SkipSR: Faster Super-Resolution with Token Skipping

ICLR 2026 Conference Submission14048 Authors

18 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: video generation, efficient transformers, super-resolution, diffusion
TL;DR: We accelerate video super-resolution by identifying complex regions and applying sparse attention only to them, instead of uniformly processing the entire input.
Abstract: Diffusion-based super-resolution (SR) is a key component of video generation and video restoration, but it is slow and expensive, which limits scalability to higher resolutions and longer videos. Our key insight is that many regions in a video are inherently low-detail and gain little from refinement, yet current methods process all pixels uniformly. To take advantage of this, we propose SkipSR, a simple framework that accelerates video SR by identifying low-detail regions directly from the low-resolution input and skipping computation on them entirely, super-resolving only the areas that require refinement. This strategy preserves perceptual quality in both standard and one-step diffusion SR models while significantly reducing computation. On standard SR benchmarks, our method runs up to 60% faster end-to-end than prior models on 720p videos, with no perceptible loss in quality. Video demos are available at our anonymous project page.
Supplementary Material: pdf
Primary Area: generative models
Submission Number: 14048
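
Illustrative sketch (not from the submission): the abstract describes skipping computation on low-detail regions found in the low-resolution input. The toy Python below shows that control flow on a single grayscale frame. Everything in it is an assumption made for illustration: the per-patch variance test stands in for whatever detector the authors use, the `refine` callback stands in for the diffusion SR model, and the names `detail_mask`, `upsample_nearest`, and `skip_sr` are hypothetical. It operates on pixel patches, not on transformer tokens or sparse attention.

```python
from statistics import pvariance


def detail_mask(frame, patch, thresh):
    """Return a grid of booleans: True where a patch looks detailed.

    Assumption: "detail" is approximated by pixel variance within the patch.
    """
    rows, cols = len(frame) // patch, len(frame[0]) // patch
    mask = []
    for i in range(rows):
        mask_row = []
        for j in range(cols):
            pixels = [frame[y][x]
                      for y in range(i * patch, (i + 1) * patch)
                      for x in range(j * patch, (j + 1) * patch)]
            mask_row.append(pvariance(pixels) > thresh)
        mask.append(mask_row)
    return mask


def upsample_nearest(frame, scale):
    """Cheap path: nearest-neighbor upsampling of the whole frame."""
    return [[v for v in row for _ in range(scale)]
            for row in frame for _ in range(scale)]


def skip_sr(frame, refine, scale=2, patch=4, thresh=1e-3):
    """Upsample cheaply everywhere, then run `refine` only on detailed patches.

    `refine(tile, scale)` is a hypothetical stand-in for an expensive SR model;
    it must return a (patch * scale) x (patch * scale) list of lists.
    """
    mask = detail_mask(frame, patch, thresh)
    out = upsample_nearest(frame, scale)
    size = patch * scale
    for i, mask_row in enumerate(mask):
        for j, needs_refinement in enumerate(mask_row):
            if not needs_refinement:
                continue  # skipped region: keep the cheap upsampled pixels
            tile = [row[j * patch:(j + 1) * patch]
                    for row in frame[i * patch:(i + 1) * patch]]
            refined = refine(tile, scale)
            for dy in range(size):
                out[i * size + dy][j * size:(j + 1) * size] = refined[dy]
    return out, mask
```

In this sketch the expensive call runs once per detailed patch, so its cost scales with the fraction of patches flagged by the mask and not with the full frame size. Any latency benefit in a real model would depend on how the skipped regions are handled inside the network, which the abstract does not specify.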