Learning Tree-Structured Composition of Data Augmentation

TMLR Paper2396 Authors

20 Mar 2024 (modified: 24 Apr 2024) · Under review for TMLR · CC BY-SA 4.0
Abstract: Data augmentation is widely used in settings where one must learn a neural network from very little labeled data. Typically, a composition of several transformations is applied sequentially to transform a given sample. Existing approaches for finding a composition either rely on domain expertise or involve solving a complex optimization problem. The key challenge is that, given a list of $k$ transformation functions, finding a composition of length $d$ requires searching a space of size $k^d$. In this paper, we focus on designing algorithms with much lower running time. We propose a top-down recursive algorithm that searches the space of tree-structured compositions of the $k$ transformations, where each tree node corresponds to one transformation. The tree structure can be viewed as a generalization of existing augmentation methods, such as the one constructed by SimCLR (Chen et al., 2020). Our algorithm runs in time $O(2^d k)$, which is much faster than the worst-case complexity of $O(k^d)$ as soon as $k$ grows beyond 2. We extend the algorithm to handle data distributions with heterogeneous subpopulations by finding one tree per subpopulation and then learning a weighted combination of the trees. We validate the proposed algorithms on several graph and image datasets, including a multi-label graph classification dataset we collected. The dataset exhibits significant variation in graph sizes and average degrees, making it well suited for studying data augmentation. On the graph classification dataset, our algorithms reduce computation by 43% relative to several recent augmentation search methods while improving performance by 4.3%. In addition, extensive experiments in contrastive learning further validate the benefit of our algorithm. The tree structures also let one interpret the relative role of each augmentation, for example identifying which transformations matter on small versus large graphs.
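To make the abstract's complexity claim concrete, the following sketch compares the two search-space sizes it quotes: $k^d$ for brute-force enumeration of sequential compositions versus $2^d k$ for the proposed tree-structured search. The function names and example values of $k$ and $d$ are illustrative, not from the paper.

```python
def brute_force_size(k: int, d: int) -> int:
    # Number of length-d sequential compositions drawn from k transformations.
    return k ** d

def tree_search_size(k: int, d: int) -> int:
    # Cost quoted for the tree-structured search: O(2^d * k).
    return (2 ** d) * k

# Illustrative values: k = 10 transformations, composition depth d = 5.
k, d = 10, 5
print(brute_force_size(k, d))  # 100000
print(tree_search_size(k, d))  # 320
```

Even at this modest scale the gap is large (100000 vs. 320 candidate evaluations), and it widens rapidly as $k$ grows past 2, matching the abstract's claim.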
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Olgica_Milenkovic1
Submission Number: 2396