Task-Guided Biased Diffusion Models for Point Localization

19 Sept 2023 (modified: 11 Feb 2024) Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Diffusion models; crowd localization; cell localization; human pose estimation
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A task-guided biased diffusion model is proposed for point localization tasks including crowd localization, human pose estimation, and cell localization.
Abstract: We hypothesize that diffusion models can be used to enhance the performance of deep learning methods for predictive tasks involving sparse outputs, such as point-localization tasks. However, doing so poses two difficulties: slow inference, and the stochastic nature of sampling, which causes predictions to vary with the initialization seed of the sampling chain. To improve inference efficiency, we propose introducing task bias into the forward diffusion process, replacing the standard convergence to zero-mean Gaussian noise with convergence to a noise distribution closer to that of the target sparse point-localization data. This simplifies the reverse diffusion process and is shown to decrease the number of necessary denoising steps while improving prediction quality. To decrease the prediction variance caused by seed stochasticity, we propose a task-guided loss that is shown to decrease the average distance between predictions from different noise realizations. The two contributions are combined into the Task-Guided Biased Diffusion Model (TGBDM), which maps an initial prediction from a classical localization method into a refined localization map. TGBDM is shown to achieve state-of-the-art performance for crowd localization, pose estimation, and cell localization.
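Illustrative sketch (not taken from the submission): the abstract does not give the equations of the biased forward process, so the following is only one plausible parameterization. It assumes a standard linear noise schedule and a hypothetical bias map `bias` (for instance, an initial prediction from a classical localizer), toward which the mean of the forward process drifts instead of toward zero. The helper `mean_pairwise_distance` is likewise a hypothetical way to measure the seed-to-seed spread of predictions that the task-guided loss is said to reduce. All function and parameter names are assumptions made for illustration.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def biased_forward_sample(x0, bias, alpha_bar_t, rng):
    """Draw x_t from an assumed biased forward process.

    Assumed form (not the paper's confirmed equation):
        x_t = sqrt(a) * x0 + (1 - sqrt(a)) * bias + sqrt(1 - a) * eps
    With bias = 0 this is the standard zero-mean forward process; with a
    non-zero bias the terminal distribution is centred on `bias`.
    """
    eps = rng.standard_normal(x0.shape)
    signal = np.sqrt(alpha_bar_t)
    mean = signal * x0 + (1.0 - signal) * bias
    return mean + np.sqrt(1.0 - alpha_bar_t) * eps


def mean_pairwise_distance(predictions):
    """Average L2 distance between all pairs of predictions (one per seed)."""
    preds = [np.asarray(p, dtype=float).ravel() for p in predictions]
    dists = [
        np.linalg.norm(preds[i] - preds[j])
        for i in range(len(preds))
        for j in range(i + 1, len(preds))
    ]
    return float(np.mean(dists)) if dists else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = np.zeros((64, 64))
    x0[10, 20] = x0[40, 33] = 1.0          # sparse ground-truth point map
    bias = np.full_like(x0, 0.05)          # hypothetical coarse prior map
    alpha_bar = make_alpha_bar()
    x_T = biased_forward_sample(x0, bias, alpha_bar[-1], rng)
    print("terminal mean:", x_T.mean())    # centred near the bias, not zero
```

Under this assumed form, the reverse process would start from noise centred on the coarse map rather than on zero, which is one way to read the abstract's claim that fewer denoising steps are needed.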
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2090