Learning Tabular Embeddings at Web Scale

Published: 01 Jan 2021, Last Modified: 05 Oct 2023. IEEE BigData 2021.
Abstract: Contextual embeddings, such as ELMo and BERT [21], [49], assign each word a representation based on its context [44]. This research builds on the observation that the context of structured data can be encoded very differently from the traditional sentence- and text-based context. This means that embeddings for structured data can be constructed on different principles, and, if properly optimized, they can be used to improve performance on ML- and AI-based tasks over structured data. Here we present several new types of tabular embeddings that take into consideration the structure of columns and rows and the presence of metadata. We demonstrate that properly optimized embeddings, in combination with ML and DL models, yield significant improvements on important tasks such as tabular column and tuple recognition.
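To make the idea concrete, here is a minimal, hypothetical sketch of a column-level embedding: each column is represented by pooling vectors for its cell values together with a vector for its header (metadata). The hash-based `token_vector` is a stand-in for learned embeddings, and all names and the pooling scheme are illustrative assumptions, not the paper's actual method.

```python
import hashlib
import numpy as np

DIM = 64  # embedding dimensionality (illustrative choice)

def token_vector(token: str, dim: int = DIM) -> np.ndarray:
    # Deterministic pseudo-embedding derived from a hash; in the real
    # setting this would be a learned (e.g. contextual) embedding.
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "little"))
    return rng.standard_normal(dim)

def column_embedding(cells, header=None) -> np.ndarray:
    # Column context = cell values plus optional header metadata,
    # mean-pooled and L2-normalized.
    vecs = [token_vector(str(c)) for c in cells]
    if header is not None:
        vecs.append(token_vector(header))
    v = np.mean(vecs, axis=0)
    return v / np.linalg.norm(v)

# Toy table: two columns with distinct value distributions.
cities = column_embedding(["Boston", "Austin", "Denver"], header="city")
prices = column_embedding(["12.99", "8.50", "21.00"], header="price")

# A "city" column from another table should land closer to the
# city embedding than to the price embedding (cosine similarity).
cities2 = column_embedding(["Seattle", "Boston", "Miami"], header="city")
sim_same = float(cities @ cities2)
sim_diff = float(prices @ cities2)
```

Because the two city columns share the header token and one cell value, their pooled vectors overlap, so `sim_same` exceeds `sim_diff`; a downstream column-type classifier can exploit exactly this kind of geometry.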