Abstract: This study presents a robust framework that leverages advanced deep-learning techniques for ear-based human recognition. Faced with the challenge of limited dataset sizes, our approach builds on a generative adversarial network (GAN) method, namely Pix2Pix, to augment the dataset. We demonstrate that this approach can produce complementary images for ear recognition. More specifically, the Pix2Pix GAN is employed to generate the missing side in ear image pairs (i.e., creating corresponding left-ear images for right-ear images and vice versa). This augmentation can substantially increase the dataset size, making it more diverse and considerably more useful for training purposes. The employed dataset contained several images of the right ear but only one left-ear image per individual. A series of corresponding synthetic left-ear images was generated using the Pix2Pix GAN to augment the available data and mitigate the dataset's shortage of left-ear images. The experimental framework used the EarNet model and conducted comparative evaluations on the AMI Ear dataset before and after Pix2Pix GAN augmentation. By employing the Pix2Pix GAN, the proposed approach can effectively double the size of a dataset and, in the process, make that data significantly more useful in real-world application scenarios. The resulting accuracy reaches 98% on the AMI dataset, demonstrating that this technique can improve model performance for ear-based human recognition.
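As a rough illustration of the augmentation step described in the abstract, the sketch below shows how a Pix2Pix-style generator could be applied to right-ear images to synthesise the missing left-ear counterparts. It assumes a generator already trained on right/left ear pairs; the checkpoint name, 256x256 input size, normalisation, and directory layout are hypothetical choices for illustration, not details taken from the paper.

```python
# Minimal augmentation sketch, assuming a Pix2Pix generator trained to map
# right-ear photos to synthetic left-ear photos. Paths and the checkpoint
# name are hypothetical placeholders.
import os
import torch
from PIL import Image
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical: a U-Net-style Pix2Pix generator exported as a TorchScript module.
generator = torch.jit.load("pix2pix_right2left_generator.pt", map_location=device)
generator.eval()

to_tensor = transforms.Compose([
    transforms.Resize((256, 256)),               # Pix2Pix commonly uses 256x256 inputs
    transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3),  # scale pixels to [-1, 1]
])
to_image = transforms.ToPILImage()

right_dir, left_dir = "ami/right", "ami/synthetic_left"  # hypothetical paths
os.makedirs(left_dir, exist_ok=True)

with torch.no_grad():
    for name in sorted(os.listdir(right_dir)):
        right = Image.open(os.path.join(right_dir, name)).convert("RGB")
        x = to_tensor(right).unsqueeze(0).to(device)
        fake_left = generator(x)                          # translate right ear -> left ear
        fake_left = (fake_left.squeeze(0).cpu() + 1) / 2  # map back from [-1, 1] to [0, 1]
        to_image(fake_left.clamp(0, 1)).save(os.path.join(left_dir, f"left_{name}"))
```

The synthetic left-ear images produced this way would be merged with the original right-ear images, effectively doubling the training set before the recognition model (EarNet in the paper) is trained and evaluated.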