FoodNet: multi-scale and label dependency learning-based multi-task network for food and ingredient recognition

Published: 01 Jan 2024 · Last Modified: 14 May 2024 · Neural Comput. Appl. 2024 · CC BY-SA 4.0
Abstract: Image-based food pattern classification poses challenges of non-fixed spatial distribution and ingredient occlusion for mainstream computer vision algorithms. Yet most current approaches classify food and ingredients by directly extracting abstract features of the entire image through a convolutional neural network (CNN), ignoring both the relationship between food and ingredients and the ingredient occlusion problem. To address these issues, we propose FoodNet for joint food and ingredient recognition, a multi-task network with a multi-scale relationship learning module (MSRL) and a label dependency learning module (LDL). Since ingredients normally co-occur in an image, LDL exploits ingredient label dependencies to alleviate the ingredient occlusion problem. MSRL aggregates multi-scale information about food and ingredients, then uses two relational matrices to model the food-ingredient matching relationship and obtain richer feature representations. Experimental results show that FoodNet achieves strong performance on the Vireo Food-172 and UEC Food-100 datasets; notably, it reaches state-of-the-art ingredient recognition on both. The source code will be made available at https://github.com/visipaper/FoodNet.
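The abstract's core idea behind LDL — ingredients that co-occur can vouch for an occluded one — can be illustrated with a minimal sketch. This is not the paper's actual module (its details are not given here); the co-occurrence matrix, the blending weight `alpha`, and all data below are illustrative assumptions only.

```python
import numpy as np

# Hypothetical multi-hot ingredient labels for 4 training images,
# 3 ingredients (columns). Invented data for illustration.
Y = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)

# Count how often each pair of ingredients co-occurs across images,
# then zero the diagonal (an ingredient's co-occurrence with itself).
C = Y.T @ Y
np.fill_diagonal(C, 0)

# Row-normalize into a dependency matrix A (guard against empty rows).
A = C / np.maximum(C.sum(axis=1, keepdims=True), 1e-8)

def refine(scores, alpha=0.5):
    """Blend raw per-ingredient scores with scores propagated from
    co-occurring ingredients, so a weakly detected (e.g. occluded)
    ingredient can be boosted by visible co-occurring ones."""
    return (1 - alpha) * scores + alpha * (A @ scores)

# Ingredient 1 is barely detected, but it strongly co-occurs with
# ingredient 0, which is clearly visible.
raw = np.array([0.9, 0.1, 0.0])
refined = refine(raw)
print(refined)  # ingredient 1's score rises above its raw value
```

Here the second ingredient's score rises from 0.1 to 0.35 purely because ingredient 0 (detected at 0.9) co-occurred with it in half the training images; this is the intuition the abstract attributes to label dependency learning, not the paper's implementation.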