Abstract: Cooking recipe sharing sites on the Web are widely used and play a
major role in everyday home cooking. Since a recipe consists of a dish photo
and a recipe text, cross-modal recipe search is being actively explored. To
enable such search, food image features and recipe text features are generally
embedded into a shared space. However, most existing studies assume a
one-to-one correspondence between a recipe text and a dish image in the
embedding space, even though an unlimited number of photos with different
serving styles and different plates can be associated with the same recipe.
In this paper, we propose RDE-GAN (Recipe Disentangled Embedding GAN), which
disentangles food image information into a recipe image feature and a
non-recipe shape feature. In addition, we generate a food image by integrating
the recipe embedding with a shape feature. Because the proposed embedding is
free of serving and plate styles, which are unrelated to the recipe itself, it
outperformed existing methods on cross-modal recipe search in our experiments.
We also confirmed that either the shape or the recipe elements alone can be
changed at food image generation time.
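The disentangle-and-recombine interface described above can be sketched as follows. This is a minimal illustration only: the dimensions, the linear maps, and all names are placeholder assumptions, whereas the actual RDE-GAN uses trained deep encoders and a GAN generator.

```python
import numpy as np

# Hypothetical feature dimensions (not specified in the abstract).
IMG_DIM, RECIPE_DIM, SHAPE_DIM = 64, 16, 8

rng = np.random.default_rng(0)
# Stand-in linear maps; the real model learns deep networks with GAN training.
W_recipe = rng.standard_normal((IMG_DIM, RECIPE_DIM))
W_shape = rng.standard_normal((IMG_DIM, SHAPE_DIM))
W_gen = rng.standard_normal((RECIPE_DIM + SHAPE_DIM, IMG_DIM))

def encode(image):
    """Disentangle an image vector into a recipe feature and a shape feature."""
    return image @ W_recipe, image @ W_shape

def generate(recipe_feat, shape_feat):
    """Synthesize an image vector from a recipe feature plus a shape feature."""
    return np.concatenate([recipe_feat, shape_feat]) @ W_gen

img_a = rng.standard_normal(IMG_DIM)
img_b = rng.standard_normal(IMG_DIM)
r_a, s_a = encode(img_a)
_, s_b = encode(img_b)

# Keep the recipe content of image A, but borrow the serving/plate style of B.
same_dish_new_plate = generate(r_a, s_b)
```

Swapping only the shape feature (as in the last line) corresponds to the paper's claim that serving style can be changed without altering the recipe content, while cross-modal search would use the shape-free recipe feature alone.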