Multimodal AutoML for Image, Text and Tabular Data

Nick Erickson, Xingjian Shi, James Sharpnack, Alexander J. Smola

2022 (modified: 18 Apr 2023) | KDD 2022 | Readers: Everyone

Abstract: Automated machine learning (AutoML) offers the promise of translating raw data into accurate predictions without the need for significant human effort, expertise, and manual experimentation. In this lecture-style tutorial, we demonstrate the fundamental techniques that power multimodal AutoML. Unlike most AutoML systems, which focus on tabular tasks with categorical and numerical features, we consider supervised learning tasks on various types of data, including tabular features, text, and images, as well as their combinations. Rather than giving technical descriptions of how individual ML models work, we emphasize how to best use models within an overall ML pipeline that takes in raw training data and outputs predictions for test data. A major focus of our tutorial is on automatically building and training deep learning models, which are powerful yet cumbersome to manage manually. Hardly any educational material describes their successful automation. Each topic covered in the tutorial is accompanied by a hands-on Jupyter notebook that implements best practices (which will be available on GitHub before and after the tutorial). Most of the code is adapted from AutoGluon (https://auto.gluon.ai/), a recent open-source AutoML toolkit that is both state-of-the-art and easy to use.
