Minimizing data consumption with sequential online feature selection

Thomas Rückstieß, Christian Osendorfer, Patrick van der Smagt

2013 (modified: 08 Nov 2022), Int. J. Machine Learning & Cybernetics 2013. Readers: Everyone

Abstract: In most real-world information processing problems, data is not a free resource; its acquisition is often expensive and time-consuming. We investigate how such cost factors can be included in supervised classification tasks by formulating classification as a sequential decision process, which makes it accessible to reinforcement learning. Depending on the previously selected features and the internal belief of the classifier, the next feature is chosen by a sequential online feature selection procedure that learns which features are most informative at each time step. Experiments on toy datasets and a handwritten digit classification task show a significant reduction in the data required for correct classification, while a medical diabetes prediction task illustrates minimization of variable feature costs as a further property of our algorithm.
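
The decision loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract states that the selection policy is learned with reinforcement learning, whereas the sketch below swaps in a hand-written greedy rule (expected entropy reduction per unit cost) and a naive Bayes belief over binary features. All names (`posterior_update`, `greedy_policy`, `classify_sequentially`), the likelihood table, the costs, and the confidence threshold are assumptions made for illustration only.

```python
import math


def posterior_update(belief, likelihoods, feature, value):
    """Bayes update of the class belief after observing a binary feature.

    likelihoods[c][f] is the assumed P(feature f = 1 | class c), strictly in (0, 1).
    """
    unnormalized = []
    for c, p_c in enumerate(belief):
        p_one = likelihoods[c][feature]
        unnormalized.append(p_c * (p_one if value == 1 else 1.0 - p_one))
    total = sum(unnormalized)
    return [p / total for p in unnormalized]


def entropy(belief):
    """Shannon entropy (in nats) of a discrete belief."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)


def expected_entropy(belief, likelihoods, feature):
    """Expected posterior entropy if `feature` were acquired next."""
    result = 0.0
    for value in (0, 1):
        p_value = sum(
            p_c * (likelihoods[c][feature] if value == 1 else 1.0 - likelihoods[c][feature])
            for c, p_c in enumerate(belief)
        )
        if p_value > 0.0:
            updated = posterior_update(belief, likelihoods, feature, value)
            result += p_value * entropy(updated)
    return result


def greedy_policy(belief, unseen, likelihoods, costs):
    """Stand-in for a learned policy: largest expected entropy drop per unit cost."""
    current = entropy(belief)
    return max(
        unseen,
        key=lambda f: (current - expected_entropy(belief, likelihoods, f)) / costs[f],
    )


def classify_sequentially(sample, prior, likelihoods, costs, threshold=0.9):
    """Acquire features one at a time until the belief is confident enough."""
    belief = list(prior)
    unseen = set(range(len(sample)))
    acquired, total_cost = [], 0.0
    while unseen and max(belief) < threshold:
        f = greedy_policy(belief, sorted(unseen), likelihoods, costs)
        unseen.remove(f)
        acquired.append(f)
        total_cost += costs[f]                      # pay for the feature
        belief = posterior_update(belief, likelihoods, f, sample[f])
    label = max(range(len(belief)), key=belief.__getitem__)
    return label, acquired, total_cost


# Hypothetical two-class problem with three binary features.
likelihoods = [[0.95, 0.5, 0.6],   # P(feature = 1 | class 0)
               [0.05, 0.5, 0.4]]   # P(feature = 1 | class 1)
label, acquired, total_cost = classify_sequentially(
    sample=[1, 0, 1], prior=[0.5, 0.5], likelihoods=likelihoods,
    costs=[1.0, 1.0, 1.0], threshold=0.9,
)
print(label, acquired, total_cost)
```

In this toy setup the first feature is far more informative than the other two, so the loop stops after a single acquisition instead of reading the whole sample. Raising the entry for a feature in `costs` makes the greedy rule less likely to request it, which loosely mirrors the variable feature cost setting mentioned for the diabetes task.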
