Abstract: In various application domains (e.g., health psychology), experts use Bayesian networks to represent relationships among variables. In practice, however, these variables are not directly observable; they must instead be inferred from noisy yet costly features. Herein, we study the problem of datum-wise feature selection and classification in the case where the label of each data instance is described by a known Bayesian network and features are available at a cost. The goal is to classify each data instance accurately while keeping the feature acquisition cost minimal. To this end, we first propose a forward pass algorithm that sequentially acquires features to infer the label of each variable in the Bayesian network, using both the acquired features and the Bayesian network relationships. To further improve classification accuracy, we also devise a backward pass algorithm that exploits the Bayesian network relationships together with the accumulated evidence. We discuss the computational complexity of both algorithms and experimentally assess their performance on 11 datasets. We observe that the forward pass algorithm achieves higher accuracy than the state of the art while using only a small fraction of the features, and that the backward pass algorithm further improves accuracy without acquiring additional features.
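To make the cost-versus-accuracy trade-off concrete, the sketch below illustrates one generic form of datum-wise sequential feature acquisition: greedily acquire the feature whose expected reduction in label uncertainty outweighs its (scaled) cost, then classify from the resulting posterior. This is a minimal illustration, not the paper's forward pass algorithm: the Gaussian naive Bayes feature model, the entropy-minus-cost criterion, the `trade_off` parameter, and all numeric values are assumptions made for demonstration, and the single fixed label prior stands in for the richer Bayesian network relationships the paper exploits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed per-class Gaussian feature model: means[class][feature], unit variance.
means = np.array([[0.0, 0.0, 0.0],
                  [1.5, 0.5, 2.0]])
costs = np.array([1.0, 0.2, 3.0])   # assumed acquisition cost per feature
prior = np.array([0.5, 0.5])        # label prior (stand-in for BN-derived belief)

def posterior(acquired, values, prior):
    """Posterior over the binary label given acquired feature values (naive Bayes)."""
    log_p = np.log(prior).copy()
    for j, x in zip(acquired, values):
        log_p += -0.5 * (x - means[:, j]) ** 2   # Gaussian log-likelihood, unit variance
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

def expected_entropy_after(j, acquired, values, prior, n_samples=200):
    """Monte Carlo estimate of expected posterior entropy if feature j were acquired."""
    p = posterior(acquired, values, prior)
    total = 0.0
    for _ in range(n_samples):
        y = rng.choice(2, p=p)                    # sample a label from the current belief
        x = rng.normal(means[y, j], 1.0)          # sample the feature under that label
        q = posterior(acquired + [j], values + [x], prior)
        total += -(q * np.log(q + 1e-12)).sum()
    return total / n_samples

def forward_pass(instance, prior, trade_off=0.1):
    """Greedily acquire features while the expected entropy reduction exceeds
    trade_off * cost; then classify from the posterior."""
    acquired, values = [], []
    while True:
        p = posterior(acquired, values, prior)
        h = -(p * np.log(p + 1e-12)).sum()
        best_j, best_gain = None, 0.0
        for j in range(len(costs)):
            if j in acquired:
                continue
            gain = h - expected_entropy_after(j, acquired, values, prior) - trade_off * costs[j]
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:                        # no feature is worth its cost: stop
            break
        acquired.append(best_j)
        values.append(instance[best_j])           # "acquire" = reveal the true feature value
    p = posterior(acquired, values, prior)
    return int(p.argmax()), acquired, p

label, used, p = forward_pass(instance=np.array([1.4, 0.3, 2.1]), prior=prior)
print(f"predicted label={label}, features acquired={used}, posterior={p.round(3)}")
```

In this toy setting the cheap, mildly informative feature is typically acquired first, and the expensive one only when the belief remains uncertain, which mirrors the intended behavior of acquiring a small fraction of features per instance.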