Abstract: Diffusion models have been successfully applied in a wide range of domains, including text-to-image generation, 3D asset generation, controllable image editing, video generation, natural language generation, audio synthesis, and motion generation. The rate of progress is astonishing. In 2022 alone, diffusion models powered many large-scale foundation models: text-to-image models such as DALL-E 2 [Ramesh et al. 2022], Imagen [Saharia et al. 2022], Stable Diffusion [Rombach et al. 2022], and eDiff-I [Balaji et al. 2022]; video generation models such as Imagen Video [Ho et al. 2022] and Make-A-Video [Singer et al. 2022]; and 3D asset generation models such as Magic3D [Lin et al. 2022] and DreamFusion [Poole et al. 2022]. This course covers the advances in diffusion models over the past few years and is tailored to the computer graphics community. We will first cover the fundamental machine learning and deep learning techniques relevant to diffusion models. Next, we will present state-of-the-art techniques for applying diffusion models to high-fidelity image synthesis, controllable image generation, compositional representation learning, and 3D asset generation. Finally, we will conclude with a discussion of future applications of this technology, its societal impact, and open research problems. After the course, attendees will have a basic understanding of diffusion models and of how such models can be applied to tasks such as image generation, image editing, and 3D asset generation.