Heidi matrix: nearest neighbor driven high dimensional data visualization

2009 (modified: 12 Nov 2022)
KDD Workshop on Visual Analytics and Knowledge Discovery 2009
Abstract: Identifying patterns in large high dimensional data sets is a challenge. As the number of dimensions increases, the patterns in the data sets tend to be more prominent in the subspaces than in the original dimensional space. A system that facilitates the presentation of such subspace oriented patterns in high dimensional data sets is required to understand the data. Heidi is a high dimensional data visualization system that captures and visualizes the closeness of points across various subspaces of the dimensions, thus helping to understand the data. The core concept behind Heidi is based on the prominence of patterns within the nearest neighbor relations between pairs of points across the subspaces. Given a d-dimensional data set as input, the Heidi system generates a 2-D matrix represented as a color image. This representation gives insight into (i) how the clusters are placed with respect to each other, (ii) characteristics of the placement of points within a cluster in all the subspaces and (iii) characteristics of overlapping clusters in various subspaces. A sample of results displayed and discussed in this paper illustrates how the Heidi visualization can be interpreted.
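The abstract does not spell out how the matrix is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes Euclidean distance, a fixed neighbor count `k`, enumeration of every non-empty subspace (feasible only for small d), and one bit per subspace in an integer code for each pair of points. The names `knn_mask` and `subspace_knn_matrix` are hypothetical.

```python
from itertools import combinations

import numpy as np


def knn_mask(points, k):
    """Boolean n x n matrix: entry (i, j) is True when j is one of the
    k nearest neighbors of i (Euclidean distance, assumed here)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    mask = np.zeros(dist.shape, dtype=bool)
    rows = np.arange(points.shape[0])[:, None]
    mask[rows, nearest] = True
    return mask


def subspace_knn_matrix(data, k):
    """Return an n x n integer matrix whose bit b is set at (i, j) when
    j is a k-nearest neighbor of i in subspace number b.

    Enumerating all 2^d - 1 subspaces is an assumption of this sketch
    and only works for small d (the codes are stored as 64-bit integers).
    """
    n, d = data.shape
    subspaces = [
        dims
        for size in range(1, d + 1)
        for dims in combinations(range(d), size)
    ]
    codes = np.zeros((n, n), dtype=np.int64)
    for bit, dims in enumerate(subspaces):
        mask = knn_mask(data[:, list(dims)], k)
        codes |= mask.astype(np.int64) << bit
    return codes, subspaces


# Two well-separated pairs of points in 2-D.
data = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 5.1]])
codes, subspaces = subspace_knn_matrix(data, k=1)
```

Each distinct code could then be mapped to a color to render the matrix as an image. Ordering the rows and columns so that points of the same cluster are adjacent is a further assumption that would make block patterns easier to read.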