PreVolE: A Robust Data-Driven Framework for Text-Guided Food Volume Estimation

Umair Haroon, Ahmad AlMughrabi, Ricardo Marques, Petia Radeva

Published: 2026, Last Modified: 11 May 2026. Venue: VISAPP (3) 2026. License: CC BY-SA 4.0

Abstract: Accurate food volume estimation is crucial for medical nutrition management and health monitoring. However, precise 3D reconstruction is difficult because noisy images lead to inaccurate point clouds and geometries. To address these challenges, we introduce PreVolE, a robust, data-driven pipeline that includes a novel preprocessing stage to eliminate defocus-blurred and near-duplicate images, resulting in a cleaner dataset. To reduce pose ambiguity and improve the quality of pose estimation, we leverage deep feature extraction and matching within a hierarchical localisation framework to generate more reliable and complete point clouds. Our framework utilises refined point clouds and text-guided segmentation for accurate 3D mesh reconstruction. Experiments show our framework outperforms state-of-the-art methods in reconstruction fidelity and volume accuracy, reducing MAPE from 2.82% to 2.52% on MTF (an absolute improvement of 0.30 percentage points) and enhancing computational efficie
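
For readers unfamiliar with the metric quoted in the abstract, MAPE (mean absolute percentage error) averages the absolute volume error relative to the ground-truth volume. The following is a minimal Python sketch of that standard definition, not the authors' evaluation code; the function name `mape` and the sample volumes are hypothetical and chosen only for illustration.

```python
def mape(true_volumes, predicted_volumes):
    """Mean absolute percentage error, returned in percent.

    Assumes every ground-truth volume is non-zero; a zero value would
    raise ZeroDivisionError.
    """
    if len(true_volumes) != len(predicted_volumes):
        raise ValueError("inputs must have the same length")
    if not true_volumes:
        raise ValueError("inputs must be non-empty")
    # Each term is the absolute error relative to the ground-truth volume.
    ratios = [
        abs(predicted - true) / abs(true)
        for true, predicted in zip(true_volumes, predicted_volumes)
    ]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical ground-truth and estimated volumes (cm^3), illustration only;
# these are NOT values from the paper or from the MTF benchmark.
ground_truth = [100.0, 200.0, 400.0]
estimated = [102.0, 196.0, 410.0]
print(f"MAPE = {mape(ground_truth, estimated):.2f}%")
```

Under this definition, the reported drop from 2.82% to 2.52% is a difference of 0.30 percentage points, which corresponds to a relative reduction of roughly 10.6% (0.30 / 2.82).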

External IDs: dblp:conf/visapp/HaroonAMR26