Abstract: Given a data matrix, unsupervised column subset selection
refers to the problem of identifying a subset of columns that
can be used to linearly approximate the original data matrix.
This problem has many applications, such as feature selection
and representative selection, but solving it optimally is known
to be NP-hard. We consider multi-view unsupervised column
subset selection, which extends the concept of (single-view)
column subset selection to data represented in multiple views
or modalities. We introduce a combinatorial search algorithm
for this generalized problem. One variant of the algorithm is
guaranteed to compute an optimal solution in a setting similar
to the classical A∗ algorithm. Other suboptimal variants, in a
setting similar to the weighted A∗ algorithm, are much faster
and provide a solution along with a bound on its quality.
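
To make the single-view objective concrete, the following minimal sketch scores a column subset by the squared Frobenius error left after projecting every column of the data matrix onto the span of the selected columns. The names `css_error` and `best_subset` are hypothetical and chosen only for illustration. The exhaustive search shown here is a naive baseline for tiny inputs; it is not the combinatorial search algorithm introduced in this work, and it does not cover the multi-view case.

```python
from itertools import combinations
from math import sqrt


def column(X, j):
    """Return column j of a matrix stored as a list of rows."""
    return [row[j] for row in X]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Modified Gram-Schmidt; linearly dependent vectors are skipped."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(q, w)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def css_error(X, selected):
    """Squared Frobenius error of approximating all columns of X
    by their projections onto the span of the selected columns."""
    basis = orthonormal_basis([column(X, j) for j in selected])
    total = 0.0
    for j in range(len(X[0])):
        x = column(X, j)
        for q in basis:
            c = dot(q, x)
            x = [xi - c * qi for xi, qi in zip(x, q)]
        total += dot(x, x)  # squared norm of the residual
    return total


def best_subset(X, k):
    """Exhaustive search over all k-subsets (illustration only;
    the number of subsets grows combinatorially with the matrix width)."""
    n = len(X[0])
    return min(combinations(range(n), k), key=lambda s: css_error(X, s))


# Toy matrix (assumed data): columns 2 and 3 are linear
# combinations of columns 0 and 1.
X = [
    [1.0, 0.0, 1.0, 2.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
subset = best_subset(X, 2)
print(subset, css_error(X, subset))
```

On this toy matrix two columns suffice to reproduce the whole matrix, while any single column leaves a nonzero residual. The cost of enumerating every subset is what motivates search strategies that prune the space of subsets.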