Dataset Construction Using Item Response Theory for Educational Machine Learning Competitions

Takeaki Sakabe, Yuko Sakurai, Emiko Tsutsumi, Satoshi Oyama

Published: 01 Jan 2025, Last Modified: 22 Sept 2025, Venue: IEEE Access 2025, License: CC BY-SA 4.0

Abstract: Machine learning (ML) has been integrated into numerous applications and has emerged as one of the most transformative technologies in our daily lives. In recent years, the number of individuals studying ML has grown substantially, leading to the emergence of numerous educational competitions focused on building ML expertise. In these competitions, participants are tasked with constructing ML models. However, the dataset used to compare the performance of competing models is often selected arbitrarily, causing a mismatch between the dataset and the participants’ skill levels. This can result in competition outcomes that fail to accurately reflect the participants’ abilities. We have developed a framework for generating image datasets that enable the abilities of competition participants to be accurately assessed. Specifically, we use item response theory (IRT), commonly applied in test creation and ability assessment, to estimate parameters such as item discrimination and difficulty for each image in existing datasets. Additionally, we use a conditional variational autoencoder (CVAE) that generates images with specified parameter values. These parameter values are generated based on the ability distribution of the competition participants, so that the resulting dataset is aligned with that distribution. To evaluate the effectiveness of the proposed framework, we conducted experiments using 810 ML models created automatically by varying six parameters over multiple values. A comparison of their performance on the original and generated datasets showed that the generated dataset was more effective in differentiating model performance. Unlike conventional IRT-based methods, which require human effort for dataset generation, our proposed framework fully automates the dataset generation process.
By automating dataset generation, our approach streamlines the organization of ML competitions and ensures that datasets are well-suited to participants’ skill levels. This automation reduces the challenges of hosting competitions, promoting their broader adoption in educational settings.
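
The role that item discrimination and difficulty play in differentiating participants can be illustrated with a minimal sketch. It assumes the standard two-parameter logistic (2PL) IRT model, in which those two parameters appear; the abstract does not state which IRT variant the paper uses, and the function names and numeric values below are hypothetical, chosen only for illustration.

```python
import math


def p_correct(theta, a, b):
    """2PL item response function (assumed model, not confirmed by the abstract).

    theta: ability of the solver (here, an ML model acting as a test taker)
    a:     item discrimination
    b:     item difficulty
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def spread(abilities, a, b):
    """Gap between the highest and lowest success probability over a group."""
    probs = [p_correct(t, a, b) for t in abilities]
    return max(probs) - min(probs)


# Hypothetical ability values and item parameters (not taken from the paper).
abilities = [-1.0, 0.0, 1.0]
matched_item = (1.5, 0.0)    # difficulty close to the centre of the ability range
too_easy_item = (1.5, -4.0)  # difficulty far below every participant's ability

print("matched item spread: ", round(spread(abilities, *matched_item), 3))
print("too-easy item spread:", round(spread(abilities, *too_easy_item), 3))
```

Under these assumptions, an item whose difficulty lies near the participants' abilities separates their success probabilities far more than an item that everyone solves, which is the intuition behind aligning the generated dataset with the ability distribution.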

External IDs: dblp:journals/access/SakabeSTO25