RD-FGM: A novel model for high-quality and diverse food image generation and ingredient classification
Abstract: Highlights
• We develop the RD-FGM method, which optimizes food image generation and the multi-modal alignment of recipes and images.
• We introduce RecipeCLIP, which aligns features from images and recipes to produce comprehensive ingredient embeddings.
• We devise a guided attention mechanism for multi-modal diffusion that controls generation with U-Net transformers.
• We validate RD-FGM’s efficiency and its scalability to downstream tasks across multiple datasets, achieving optimal performance.