Enhancing Human Body Generation in Diffusion Models with Dual-Level Prior Knowledge

ICLR 2025 Conference Submission2091 Authors

20 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: diffusion models, human body generation
Abstract: The development of diffusion models (DMs) has greatly advanced text-to-image generation, outperforming earlier methods such as generative adversarial networks (GANs) in image quality and text alignment. However, accurately generating human body images remains challenging: outputs often exhibit disproportionate figures and anatomical errors, which limits practical applications such as portrait generation. While previous methods such as HcP have shown promising results, they still suffer from incorrectly retained priors, insufficient human-related knowledge, and limited generalization, owing to their fully supervised design that relies only on pose-related information. In this study, we introduce a novel method that enhances pretrained diffusion models for realistic human body generation by incorporating dual-level human prior knowledge. Our approach learns shape-level details from the human-related tokens in the original prompts and learns a pose-level prior by appending a learnable pose-aware token to each text prompt. A two-stage training strategy rectifies the cross-attention maps through a bind-then-generalize process, leveraging multiple novel objectives together with adversarial training. Extensive experiments show that our method significantly improves the ability of the SD1.5 and SDXL pretrained models to generate human bodies, reducing deformities and enhancing practical utility.
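The "learnable pose-aware token" mentioned in the abstract can be pictured as a small trainable embedding appended to the frozen text encoder's output before it conditions the diffusion U-Net's cross-attention, in the spirit of textual inversion. The sketch below is a minimal PyTorch illustration under that assumption; the class and parameter names are hypothetical and do not come from the authors' code.

```python
import torch
import torch.nn as nn


class PoseAwareToken(nn.Module):
    """Appends one shared learnable pose-aware embedding to each prompt's
    token embeddings. Only this parameter would be trained; the text
    encoder and diffusion backbone stay frozen (illustrative assumption)."""

    def __init__(self, embed_dim: int = 768):
        super().__init__()
        # Small random init so the token starts as a near-no-op.
        self.pose_token = nn.Parameter(torch.randn(1, 1, embed_dim) * 0.02)

    def forward(self, text_embeds: torch.Tensor) -> torch.Tensor:
        # text_embeds: (batch, seq_len, embed_dim) from a frozen text encoder.
        batch = text_embeds.shape[0]
        tok = self.pose_token.expand(batch, -1, -1)
        # Concatenate along the sequence axis: (batch, seq_len + 1, embed_dim).
        return torch.cat([text_embeds, tok], dim=1)
```

In use, the extended embedding sequence would replace the original one as the cross-attention conditioning, so gradients from the training objectives flow only into `pose_token`.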
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2091