Keywords: Active feature acquisition, active sensing, missing data, reinforcement learning, distribution-shift, off-environment policy evaluation, AFA graph, AFAPE, AFAIS, MCAR, MAR, MNAR, causal inference
TL;DR: Evaluation of active feature acquisition methods under missing data corresponds to a distribution shift which requires adjustment
Abstract: Machine learning (ML) methods generally assume the full set of features are available at no cost. If the acquisition of a certain feature is costly at run-time, one might want to balance the acquisition cost and the predictive value of the feature for the ML task. The task of training an AI agent to decide which features are necessary to be acquired is called active feature acquisition (AFA). Current AFA methods, however, are challenged when the AFA agent has to be trained/tested with datasets that contain missing data. We formulate, for the first time, the problem of active feature acquisition performance evaluation (AFAPE) under missing data, i.e. the problem of adjusting for the inevitable missingness distribution shift between train/test time and run-time. We first propose a new causal graph, the AFA graph, that characterizes the AFAPE problem as an intervention on the reinforcement learning environment used to train AFA agents. Here, we discuss that for handling missing data in AFAPE, the conventional approaches (off-policy reinforcement learning, blocked feature acquisitions, imputation and inverse probability weighting (IPW)) often lead to biased results or are data inefficient. We then propose active feature acquisition importance sampling (AFAIS), a novel estimator that is more data efficient than IPW. We demonstrate the detrimental conclusions to which biased estimators can lead as well as the high data efficiency of AFAIS in multiple experiments using simulated and real-world data under induced MCAR, MAR and MNAR missingness.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Reinforcement Learning (eg, decision and control, planning, hierarchical RL, robotics)