Keywords: fairness, face anonymisation, computer vision, decentralisation, demographic bias, data collection, face recognition, clustering
TL;DR: A decentralised tool for creating and collecting face datasets, aimed at improving fairness and reducing bias in the face detection algorithms used for face anonymisation.
Abstract: Recent developments in machine learning have shown that successful models rely not only on large amounts of data but also on the right kind of data. In this paper we show how this data-centric approach can be facilitated in a decentralised manner to enable efficient data collection for machine learning algorithms. Face detectors are a class of models that suffer heavily from bias because they have to work across a large variety of data. We propose a face detection and anonymisation approach based on a hybrid Multi-Task Cascaded CNN with FaceNet embeddings, which we use to benchmark multiple datasets and to describe and evaluate the bias of the models towards different ethnicities, genders and age groups. We also describe ways to improve fairness through a decentralised system in which users label, correct and verify data, creating a robust pipeline for model retraining.