Abstract: This paper aims at discovering meaningful subsets of related images
from large image collections without annotations. We search for groups
of images related at different semantic levels, i.e., either instances
or visual classes. While k-means is usually considered the gold
standard for this task, we evaluate and show the value of diffusion
methods that have been neglected by the state of the art, such as
the Markov Clustering algorithm.
We report results on ImageNet and the Paris500k instance
dataset, both enlarged with images from YFCC100M. We evaluate
our methods with a labelling cost that reflects how much effort a
human would require to correct generated clusters.
Our analysis highlights several properties. First, when powered
by an efficient GPU implementation, the cost of the discovery
process is small compared to computing the image descriptors, even
for collections as large as 100 million images. Second, we show that
descriptions selected for instance search improve the discovery of
object classes. Third, the Markov Clustering technique consistently
outperforms other methods; to our knowledge it has never been
considered in this large scale scenario.
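
For readers unfamiliar with the Markov Clustering algorithm mentioned above, the following is a minimal sketch of its textbook form (alternating expansion and inflation on a column-stochastic matrix). It is not the GPU implementation evaluated in the paper. The function name `markov_clustering`, its parameters and default values, and the dense list-of-lists graph representation are all assumptions made for illustration; at the scale discussed here the graph would presumably be a sparse neighbor graph built from image descriptors.

```python
def _normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def _matmul(a, b):
    """Plain dense matrix product, adequate for a toy example."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_clustering(adjacency, expansion=2, inflation=2.0,
                      iterations=50, tol=1e-9):
    """Hypothetical helper: cluster an undirected graph with textbook MCL.

    adjacency is a square list of lists of non-negative edge weights.
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adjacency)
    # Self-loops are the usual MCL convention; they also keep column sums > 0.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    m = _normalize_columns(m)

    for _ in range(iterations):
        prev = m
        # Expansion: raise the matrix to an integer power (random-walk steps).
        for _ in range(expansion - 1):
            m = _matmul(m, prev)
        # Inflation: element-wise power, then renormalize the columns.
        m = _normalize_columns([[x ** inflation for x in row] for row in m])
        # Stop once the matrix no longer changes.
        change = max(abs(m[i][j] - prev[i][j])
                     for i in range(n) for j in range(n))
        if change < tol:
            break

    # Each non-empty row lists the nodes attracted to that row's node.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two disconnected triangles, nodes 0-2 and 3-5.
triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_clustering(triangles))  # [[0, 1, 2], [3, 4, 5]]
```

In this sketch the inflation exponent controls granularity: larger values tend to produce more, smaller clusters. Unlike k-means, the number of clusters is not fixed in advance, and on some graphs the resulting clusters may overlap.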