Keywords: deep learning, data augmentation, automated data augmentation, latent space
Abstract: Data augmentation is an efficient way to expand a training dataset by creating additional artificial data. Although data augmentation has been found effective at improving the generalization of models across various machine learning tasks, the underlying augmentation methods are usually designed by hand and carefully evaluated for each data modality separately, such as image processing functions for image data and word-replacement rules for text data. In this work, we propose an automated data augmentation approach called MODALS (Modality-agnostic Automated Data Augmentation in the Latent Space) that augments data of any modality in a generic way. MODALS uses automated data augmentation to fine-tune four universal data transformation operations in the latent space so that the transformations adapt to data of different modalities. Through comprehensive experiments, we demonstrate the effectiveness of MODALS on multiple datasets covering the text, tabular, time-series, and image modalities.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
One-sentence Summary: MODALS is an automated data augmentation framework that fine-tunes four universal data transformation operations in the latent space to augment data of different modalities.
Code: [jamestszhim/modals](https://github.com/jamestszhim/modals)
Data: [HAR](https://paperswithcode.com/dataset/har), [SST](https://paperswithcode.com/dataset/sst)
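
Illustrative sketch: the abstract refers to four universal transformation operations applied in the latent space but does not name them. The minimal NumPy sketch below assumes they are interpolation, extrapolation, additive Gaussian noise, and a difference transform, each acting on latent vectors produced by some encoder. All function and parameter names here (`interpolate`, `extrapolate`, `add_gaussian_noise`, `difference_transform`, `lam`, `sigma`) are hypothetical and are not taken from the linked repository. The automated search that tunes the magnitude of each operation is not shown.

```python
import numpy as np


def interpolate(z, z_other, lam):
    """Move z toward another latent vector (assumed: same class)."""
    return z + lam * (z_other - z)


def extrapolate(z, center, lam):
    """Push z away from a reference point such as a class mean (assumption)."""
    return z + lam * (z - center)


def add_gaussian_noise(z, sigma, lam, rng):
    """Perturb z with zero-mean Gaussian noise scaled by lam."""
    return z + lam * rng.normal(0.0, sigma, size=z.shape)


def difference_transform(z, z_a, z_b, lam):
    """Shift z by the difference between two other latent vectors."""
    return z + lam * (z_a - z_b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = np.array([1.0, 1.0])          # latent vector to augment
    neighbour = np.array([3.0, 0.0])  # another sample, assumed same class
    centre = np.array([0.0, 0.0])     # assumed class centre

    print(interpolate(z, neighbour, 0.5))
    print(extrapolate(z, centre, 0.5))
    print(add_gaussian_noise(z, 0.1, 1.0, rng))
    print(difference_transform(z, neighbour, centre, 1.0))
```

Because every operation works on plain vectors, the same functions apply regardless of whether the encoder consumed text, tabular rows, time series, or images, which is the sense in which such a scheme can be modality-agnostic.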