Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis.

Bingda Tang, Boyang Zheng, Xichen Pan, Sayak Paul, Saining Xie

12 Nov 2025 | CoRR 2025 | Everyone | CC BY-SA 4.0
