What is Real Anymore? An AI/ML Image Dataset Using Authenticity Validation and Traceable Origins for Every Data Instance
This project addresses the growing challenge of detecting AI-generated images by creating a novel dataset titled “What Is Real Anymore?” (WIRA). WIRA comprises two subsets. The first contains over 2000 photographs sourced from Flickr, each validated as authentic against a set criterion. The second contains a hyper-realistic AI-generated counterpart for each validated Flickr image, produced through the Leonardo.AI commercial API. All validated Flickr images in WIRA are credited to their respective photographers and retain their associated rights. Commercial use of the dataset requires permission from the photographers or adherence to the copyright terms that apply to each validated Flickr image used. This document details the rationale for image authentication, the image categories and the motive for their selection, the authenticity validation criterion, the methodology used to create the dataset, the computational resources used, a review of the decision records for included and excluded images, and potential enhancements to expand WIRA.