Abstract: Images have become the most popular type of multimedia in the Big Data era. Widely used applications like CBIR underscore the importance of image understanding, especially in terms of semantically meaningful information. Typically, high dimensional image descriptors are embedded to a subspace using a simple linear projection. However, semantic information has a complex distribution in feature space that requires a non-linear projection. We first estimate an intrinsic dimensionality of image data. Next we build a measure of visual information in embedded subspace. We compare several linear and non-linear projection methods. We use multiple image databases towards a comprehensive evaluation in terms of information content, consequent recognition rates, and computational cost. This paper is relevant for researchers interested in dimensionality reduction for large scale image understanding that preserves semantically relevant information.
Loading