pos_embed/PatchEmbed: 0%
transformer_blocks: 10.15 GMACs = 3.55% MACs, 20.31 GFLOPS = 1.78% FLOPs
    (norm1): LayerNorm(0 MACs = 0% MACs, 2.36 MFLOPS = 0% FLOPs
    (attn1): Attention(2.72 GMACs = 0.95% MACs, 5.44 GFLOPS = 0.48% FLOPs
    (norm2): LayerNorm(0 MACs = 0% MACs, 2.36 MFLOPS = 0% FLOPs
    (attn2): Attention(2 GMACs = 0.7% MACs, 3.99 GFLOPS = 0.35% FLOPs
    (ff): FeedForward(5.44 GMACs = 1.9% MACs, 10.87 GFLOPS = 0.95% FLOPs

(norm_out): LayerNorm(0 MACs = 0% MACs
(proj_out): Linear(MMACs = 0.01% MACs
(adaln_single): AdaLayerNormSingle(19.17 MMACs = 0.01% MACs
(caption_projection): PixArtAlphaTextProjection(1.45 GMACs = 0.51% MACs
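
The per-block figures above can be sanity-checked with a little arithmetic. The sketch below is not part of the profiler output. It copies the numbers from the listing by hand and assumes the usual convention that one MAC counts as two FLOPs (one multiply plus one add). It checks that the sub-module counts add up to the reported transformer_blocks total, within rounding. The names `children`, `block_total`, and `summarize` are made up for illustration.

```python
# Figures copied by hand from the listing above (units: GMACs and GFLOPS).
block_total = {"gmacs": 10.15, "gflops": 20.31}
children = {
    "attn1": {"gmacs": 2.72, "gflops": 5.44},
    "attn2": {"gmacs": 2.00, "gflops": 3.99},
    "ff":    {"gmacs": 5.44, "gflops": 10.87},
}
# norm1 and norm2 each report 0 MACs and 2.36 MFLOPS; convert MFLOPS to GFLOPS.
norm_gflops = 2 * 2.36e-3


def summarize(children, extra_gflops=0.0):
    """Sum the sub-module counts; extra_gflops covers modules with no MACs."""
    gmacs = sum(c["gmacs"] for c in children.values())
    gflops = sum(c["gflops"] for c in children.values()) + extra_gflops
    return gmacs, gflops


gmacs_sum, gflops_sum = summarize(children, norm_gflops)

# The listing rounds to two decimals, so allow a small tolerance.
TOL = 0.02
macs_ok = abs(gmacs_sum - block_total["gmacs"]) < TOL
flops_ok = abs(gflops_sum - block_total["gflops"]) < TOL

# Assumed convention: FLOPs is roughly 2 x MACs for matmul-dominated modules.
flops_per_mac = block_total["gflops"] / block_total["gmacs"]

print(f"sum of children: {gmacs_sum:.2f} GMACs, {gflops_sum:.2f} GFLOPS")
print(f"MACs consistent: {macs_ok}, FLOPs consistent: {flops_ok}")
print(f"FLOPs per MAC: {flops_per_mac:.3f}")
```

The same check can be repeated for any other parent module whose children are all listed with complete values.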
