SINGLE VIEW RECONSTRUCTION FOR FOOD PORTION ESTIMATION
3D scene reconstruction from a single-view image is an ill-posed problem, since most 3D information is lost during the projection from 3D world coordinates to 2D pixel coordinates. Estimating the portion size of an object from a single view therefore requires either a priori information, such as the geometric shape of the object, or training-based techniques that learn from distributions of existing portion sizes. In this thesis, we present a single-view technique for food portion size estimation.
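Why a single view is ill-posed can be seen directly from the pinhole projection: every 3D point along the same viewing ray maps to the same pixel, so depth (and hence scale) cannot be recovered from the image alone. The sketch below illustrates this with an assumed intrinsic matrix (the focal length and principal point are illustrative values, not parameters from this thesis):

```python
import numpy as np

# Hypothetical camera intrinsics for illustration:
# focal length 500 px, principal point at the center of a 640x480 image.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(K, point_3d):
    """Project a 3D point (camera coordinates) to 2D pixel coordinates."""
    p = K @ point_3d
    return p[:2] / p[2]  # perspective division discards depth

point = np.array([0.2, -0.1, 1.0])
for scale in (1.0, 2.0, 5.0):
    # Any point along the same viewing ray projects to the same pixel,
    # so object size/distance cannot be disambiguated from one image.
    print(project(K, scale * point))
```

All three scaled points land on the same pixel, which is exactly the ambiguity that priors such as container shape are used to resolve.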
Dietary assessment, the process of determining what someone eats during the course of a day, provides valuable insights for mounting intervention programs for the prevention of many chronic diseases such as cancer, diabetes, and heart disease. Accurately measuring dietary intake is considered an open research problem in the nutrition and health fields. We have developed a mobile dietary assessment system, the Technology Assisted Dietary Assessment™ (TADA™) system, which uses image analysis techniques to automatically determine the food types and energy consumed by a user.
In this thesis we focus on the use of a single image for food portion size estimation, reducing the user’s burden of having to take multiple images of their meal. We define portion size estimation as the process of determining how much food (or food energy/nutrient) is present in a food image. In addition to food energy/nutrient, food portion estimation can also target food volume (in cm³) or weight (in grams), as both are directly related to food energy/nutrient. Food portion estimation is a challenging problem because the food preparation and consumption processes introduce large variations in food shape and appearance.
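The link between these three notions of portion size is a pair of per-food conversion factors: density (g/cm³) relates volume to weight, and energy density (kcal/g) relates weight to energy. As a worked example with assumed, illustrative values (not measured figures from this work):

```python
# Illustrative conversion of an estimated volume into weight and energy.
# The density and energy-density values below are assumptions for the
# sake of the example, not measurements from this thesis.
volume_cm3 = 150.0        # estimated food volume, cm^3
density = 0.8             # assumed density of a cooked food, g/cm^3
energy_density = 1.3      # assumed energy density, kcal/g

weight_g = volume_cm3 * density          # 120.0 g
energy_kcal = weight_g * energy_density  # ~156 kcal
```

This is why volume or weight estimates are useful intermediates: once the food type is identified, tabulated conversion factors carry them to energy.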
Since single-view 3D reconstruction is in general an ill-posed problem, we investigate the use of geometric models, such as the shape of a container, to partially recover the 3D parameters of food items in the scene. We compare the performance of portion estimation techniques based on 3D geometric models to techniques using depth maps, and show that more accurate estimates can be obtained with geometric models for objects whose 3D shapes are well defined. To further improve estimation accuracy, we investigate the use of food portion co-occurrence patterns. These patterns can be estimated from the food image datasets we collected in dietary studies using the mobile Food Record™ (mFR™) system we developed. Co-occurrence patterns are used as prior knowledge to refine portion estimation results, and we show that incorporating them as contextual information improves portion estimation accuracy.
In addition to food portion estimation techniques based on geometric models, we also investigate a deep learning approach. The geometric model based approach focuses on estimating food volume; however, food volume is an intermediate result that does not directly give the food energy/nutrients consumed. Therefore, instead of developing techniques that stop at this intermediate result, we present a method that directly estimates food energy (kilocalories) from food images using Generative Adversarial Networks (GANs). We introduce the concept of an “energy distribution” for each food image. To train the GAN, we construct a food image dataset with ground truth food labels and segmentation masks for each food image, as well as the energy information associated with each image. Our goal is to learn the mapping from the food image to the food energy. We then use a Convolutional Neural Network (CNN) to estimate the numeric value of the food energy present in the eating scene from the estimated energy distribution image.
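The final step, from an energy distribution image to a single kilocalorie number, is performed by a CNN in this work. As a simplified stand-in that conveys the idea, the sketch below builds a synthetic energy distribution map and reduces it to a scalar with a calibrated integral; the map values and the calibration factor are assumptions for illustration only:

```python
import numpy as np

# Hypothetical stand-in for a GAN-predicted energy distribution map:
# each pixel value represents relative food energy at that location.
rng = np.random.default_rng(0)
energy_map = np.zeros((240, 320))
energy_map[80:160, 100:220] = rng.uniform(0.5, 1.0, (80, 120))  # food region

# The thesis regresses the numeric energy value with a CNN; here, as a
# simplified illustration, a calibrated integral of the map yields a
# kilocalorie estimate (kcal_per_unit is an assumed calibration factor).
kcal_per_unit = 0.05
estimated_kcal = float(energy_map.sum() * kcal_per_unit)
```

A learned regressor replaces the fixed calibration factor because the mapping from distribution values to kilocalories is not uniform across foods and scenes.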