DOI: 10.64028/gimd125385
Keywords: predictive capacity, ignorability, identification bounds
Abstract: Tests play a central role in psychology and education, particularly in selection processes such as university admissions, where their predictive capacity—how well they forecast future performance—is typically assessed using regression models. However, a critical challenge arises in such analyses: while test scores are available for all applicants, outcome data (e.g., grade point averages) are only observed for selected individuals. Consequently, neither the regression model nor its parameters are identifiable from the sampling process.
Traditional approaches to this problem rely on strong, often unrealistic assumptions about the missing data mechanism. Recent work has turned to partial identification, which provides bounds for regression functions and their parameters rather than point estimates. Building on Stoye (2007), this paper extends that approach by applying narrower, more informative identification bounds. Using real data from the Chilean university admissions system and contextually grounded assumptions, we demonstrate how these refined bounds make it possible to model the complexities of the selection process. Our results underscore the value of partial identification in predictive capacity studies, offering a data-driven methodology to better understand the relationship between test scores and performance in selection settings.
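The identification problem described above can be written out as a short sketch. The notation is an assumption made here for illustration and is not taken from the paper: $Y$ is the outcome (e.g., grade point average), $X$ the test score, $Z = 1$ if the applicant was selected, and $Y$ is assumed to lie in a known interval $[y_0, y_1]$. The bounds shown are the standard worst-case bounds that use no assumption about the missing data mechanism; they are not the refined bounds developed in the paper.

```latex
% Law of total expectation, split by selection status Z (hypothetical notation)
\begin{align}
E[Y \mid X = x]
  &= E[Y \mid X = x, Z = 1]\, P(Z = 1 \mid X = x) \notag \\
  &\quad + E[Y \mid X = x, Z = 0]\, P(Z = 0 \mid X = x).
\end{align}
% E[Y | X = x, Z = 0] is never observed, so the left-hand side is not
% point identified. Assuming y_0 <= Y <= y_1, replacing the unobserved
% term by its extremes gives worst-case bounds:
\begin{align}
E[Y \mid X = x, Z = 1]\, P(Z = 1 \mid X = x) + y_0\, P(Z = 0 \mid X = x)
  &\le E[Y \mid X = x] \notag \\
  &\le E[Y \mid X = x, Z = 1]\, P(Z = 1 \mid X = x) + y_1\, P(Z = 0 \mid X = x).
\end{align}
% Width of the interval: (y_1 - y_0) * P(Z = 0 | X = x)
```

The width of this interval is $(y_1 - y_0)\, P(Z = 0 \mid X = x)$, so it grows with the share of non-selected applicants at score $x$. Narrower bounds require additional assumptions about the selection process.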
Submission Number: 19