Towards Deep Viticultural Representations: Joint Region and Grape Variety Embeddings

23 Sept 2023 (modified: 11 Feb 2024) Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: self-supervised representation learning, joint embeddings, joint variational autoencoders, viticulture, grape, wine
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Self-supervised learning of embeddings for grape varieties and grape-growing regions using joint variational autoencoders.
Abstract: Creating embeddings, representations, or features for abstract or non-numeric variables is a prerequisite for using these variables in machine learning models; viticulture (growing grapes for wine) is no exception. Deep representations are currently not available for viticultural regions and grape varieties. Regions can be partly described by their approximate longitude and latitude, average elevation, or averages of climate variables. Each of these 'raw' features contributes valuable information about a region, but together they do not readily define a metric for agro-ecological proximity between regions. Grape varieties have far fewer 'raw' features; one example is their genetic markers, which, however, are still categorical in nature. Lineage analysis is possible, but it does not necessarily provide useful features for viticulturists, because attributes such as dominant wine style or suitability for a particular region cannot necessarily be inferred from lineage. We therefore present a self-supervised approach to learning joint regional and varietal embeddings using joint variational autoencoder (VAE) networks. The approach rests on the assumption that regions growing similar proportions of similar grape varieties are more similar to each other than regions that do not, or that grape varieties that often occur together may share viticultural characteristics (e.g. climate requirements, aromas, disease resistance). We thereby overcome the lack of detailed data and create deep embeddings for 1557 grape varieties (e.g. Merlot, Riesling, Chardonnay) and 595 viticulturally important regions (e.g. Piemonte, Bourgogne, Mosel). We examine the embeddings, their usability for downstream tasks, and whether the joint autoencoder network can be used as a varietal suitability ranking system.
We show that our embeddings outperform 'raw' features on downstream tasks, and our results indicate the potential of the autoencoder networks as data-based recommender systems. To our knowledge, this is also the first work to apply joint VAEs to purely categorical data.
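The similarity assumption stated in the abstract can be illustrated with a small sketch. This is not the authors' joint VAE; it only shows, on invented toy data, how regions described by their planting proportions can be compared directly. The region and variety names, the share values, and the `cosine_similarity` helper are all hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical planting shares per region (fraction of vineyard area per variety).
# All names and numbers are invented for illustration; they are not from the paper.
planting_shares = {
    "region_a": {"variety_1": 0.6, "variety_2": 0.3, "variety_3": 0.1},
    "region_b": {"variety_1": 0.5, "variety_2": 0.4, "variety_3": 0.1},
    "region_c": {"variety_4": 0.7, "variety_3": 0.3},
}


def cosine_similarity(a, b):
    """Cosine similarity between two sparse share vectors stored as dicts."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


# Regions a and b grow the same varieties in similar proportions;
# region c shares only a minor variety with region a.
sim_ab = cosine_similarity(planting_shares["region_a"], planting_shares["region_b"])
sim_ac = cosine_similarity(planting_shares["region_a"], planting_shares["region_c"])

print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

Under this toy setup, regions with overlapping plantings score higher than regions with little overlap. A learned embedding, such as the joint VAE the abstract describes, would aim to capture this kind of structure in a compact latent space rather than in the raw proportion vectors.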
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8247