Evaluation of data augmentation techniques on subjective tasks

Luis Gonzalez-Naharro, M. Julia Flores, Jesus Martínez-Gómez, José Miguel Puerta

Published: 2024, Last Modified: 19 Jan 2025. Mach. Vis. Appl. 2024. CC BY-SA 4.0

Abstract: Data augmentation is widely used in computer vision to artificially enlarge a dataset by transforming the original data. It is applied to small datasets to prevent overfitting, and to problems where labelling is difficult. However, data augmentation assumes that transformations preserve ground-truth labels. This assumption does not hold for subjective problems such as aesthetic quality assessment, where transforming an image can alter its ground-truth aesthetic quality. In this work, we study how data augmentation affects subjective problems. We train a series of models, varying both the probability of augmenting an image and the intensity of the augmentations. We train models on AVA for quality prediction, on Photozilla for photo style prediction, and on subjective and objective labels of CelebA. Results show that, with traditional augmentation techniques, subjective tasks obtain worse results than objective tasks, and that the degradation depends on the specific type of subjectivity.
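
The two factors varied in the study, the probability of augmenting an image and the intensity of the augmentation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the brightness shift, the nested-list image representation, and the names `augment_brightness`, `maybe_augment`, and `augmented_fraction` are all assumptions chosen only to make the two knobs concrete.

```python
import random

def augment_brightness(image, intensity, rng):
    """Hypothetical transform: add one random offset in [-intensity, intensity]
    to every pixel, then clip to [0, 1]. Pixels are floats in [0, 1]."""
    shift = rng.uniform(-intensity, intensity)
    return [[min(1.0, max(0.0, px + shift)) for px in row] for row in image]

def maybe_augment(image, p, intensity, rng):
    """Apply the transform with probability p; otherwise return the image as is."""
    if rng.random() < p:
        return augment_brightness(image, intensity, rng)
    return image

def augmented_fraction(images, p, intensity, seed=0):
    """Fraction of images that were actually transformed for a given p."""
    rng = random.Random(seed)
    changed = sum(maybe_augment(img, p, intensity, rng) is not img for img in images)
    return changed / len(images)

# A toy "dataset" of 2x2 grayscale images (illustrative values only).
images = [[[0.2, 0.4], [0.6, 0.8]] for _ in range(100)]

# Sweep both knobs, as one would when training a model per configuration.
for p in (0.0, 0.5, 1.0):
    for intensity in (0.1, 0.3):
        frac = augmented_fraction(images, p, intensity)
        print(f"p={p}, intensity={intensity}: {frac:.2f} of images augmented")
```

In a sweep like this, the label attached to each image stays fixed while the image changes. That is harmless if the label is objective, but for a subjective label such as aesthetic quality a larger `intensity` makes it more likely that the unchanged label no longer describes the transformed image.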