MIND dataset for diet planning and dietary healthcare with machine learning: Dataset creation using combinatorial optimization and controllable generation with domain experts

Aug 21, 2021 (edited Sep 30, 2021) · NeurIPS 2021 Datasets and Benchmarks Track (Round 2) · Readers: Everyone
  • Keywords: diet planning, healthcare, machine learning, dataset, MIND, dietkit
  • TL;DR: MIND for diet planning and dietary healthcare with machine learning
  • Abstract: Diet planning, a basic and regular human activity, is important to all individuals, from children to seniors and from healthy people to patients. Many recent attempts have been made to develop machine learning (ML) applications related to diet planning. However, given the complexity and difficulty of the task, no high-quality diet-level dataset currently exists, even among professionals such as dietitians and physicians. In this work, we create and publish the Korean Menus–Ingredients–Nutrients–Diets (MIND) dataset for ML research on diet planning and dietary health. Diet planning entails both explicit (nutrition) and implicit (composition) requirements. The MIND dataset was therefore created by integrating three components: an operations research (OR) model that specifies and applies explicit data requirements for diet solution generation, experts who can consider implicit data requirements to make diets realistic, and a controllable generation machine that automates the high-quality diet generation process. MIND comprises 1,500 daily diets, 3,238 menus, 3,036 ingredients, and information on 14 major nutrients for use in South Korean dietary practice. MIND can be easily downloaded and analyzed using the Python package dietkit, which is accessible via the package installer for Python (pip). MIND is expected to contribute to the use of ML in solving medical, economic, and social problems associated with diet planning. Furthermore, our approach of integrating an OR model, experts, and an ML model is expected to promote the use of ML in other cases that require the generation of high-quality synthetic data for professional tasks, especially since the use of ML to automate and support professional tasks has become highly valuable.
  • Supplementary Material: zip
  • URL: https://github.com/pki663/dietkit/tree/master/samples
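
The abstract's notion of explicit (nutrition) requirements can be illustrated with a toy combinatorial search. The following is only a minimal sketch under stated assumptions, not the authors' OR model and not the dietkit API: the menu names, nutrient values, bounds, and the function `feasible_diets` are all hypothetical, and brute-force enumeration stands in for whatever optimization method the paper actually uses.

```python
from itertools import combinations

# Hypothetical menus: name -> (energy in kcal, protein in g).
# These values are invented for illustration and are not taken from MIND.
MENUS = {
    "rice": (300, 6),
    "soup": (120, 8),
    "fish": (200, 22),
    "kimchi": (30, 2),
    "fruit": (80, 1),
}


def feasible_diets(menus, size, energy_range, min_protein):
    """Enumerate every `size`-menu combination that meets the nutrient bounds."""
    low, high = energy_range
    result = []
    # Sorting the names makes the enumeration order deterministic.
    for combo in combinations(sorted(menus), size):
        energy = sum(menus[name][0] for name in combo)
        protein = sum(menus[name][1] for name in combo)
        if low <= energy <= high and protein >= min_protein:
            result.append(combo)
    return result


# Hypothetical explicit requirements: 500-700 kcal and at least 30 g protein.
diets = feasible_diets(MENUS, size=3, energy_range=(500, 700), min_protein=30)
for diet in diets:
    print(diet)
```

A search like this only enforces the explicit constraints. Whether a feasible combination is a realistic diet is the implicit (composition) requirement that the abstract assigns to domain experts.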