Squeezing Large-Scale Diffusion Models for Mobile

Jiwoong Choi, Minkyu Kim, Daehyun Ahn, Taesu Kim, Yulhwa Kim, Dongwon Jo, Hyesung Jeon, Jae-Joon Kim, Hyungjun Kim

Published: 23 Jun 2023, Last Modified: 03 Jul 2023

Venue: DeployableGenerativeAI

Keywords: On-device, Diffusion model, optimization, mobile GPU

TL;DR: We discuss challenges and solutions in deploying Stable Diffusion v2.1 with the TensorFlow Lite framework for on-device inference.

Abstract: The emergence of diffusion models has greatly broadened the scope of high-fidelity image synthesis, resulting in notable advancements in both practical implementation and academic research. With the active adoption of these models in various real-world applications, the need for on-device deployment has grown considerably. However, deploying large diffusion models such as Stable Diffusion, which has more than one billion parameters, to mobile devices poses distinctive challenges due to limited computational and memory resources, which may vary from device to device. In this paper, we present the challenges and solutions for deploying Stable Diffusion on mobile devices with the TensorFlow Lite framework, which supports both iOS and Android devices. The resulting Mobile Stable Diffusion achieves an inference latency of less than 7 seconds for generating a 512 $\times$ 512 image on Android devices with mobile GPUs.

Submission Number: 30
