Abstract: MicroCluster can mine different types of arbitrarily positioned and overlapping clusters of genetic data to find interesting patterns. Our approach has four key features. First, we mine only the maximal biclusters satisfying certain homogeneity criteria. Second, the clusters can be positioned anywhere in the input data matrix, and they can overlap arbitrarily. Third, MicroCluster uses a flexible definition of a cluster that lets it mine several types of biclusters (which were previously studied independently). Finally, MicroCluster can delete or merge biclusters that have large overlaps. As a result, it can tolerate some noise in the data set and let users focus on the most important clusters. We've developed a set of metrics to evaluate the clustering quality and have tested MicroCluster's effectiveness on several synthetic and real data sets.
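To make the fourth feature concrete, here is a minimal sketch of deleting heavily overlapping biclusters. It is not MicroCluster's actual procedure: the abstract does not define how overlap is measured, so the overlap ratio (shared cells divided by the smaller bicluster's cells), the `max_overlap` threshold, and the names `Bicluster`, `area`, `shared_area`, and `prune_overlapping` are all assumptions made for illustration. Merging is not shown.

```python
from typing import FrozenSet, List, Tuple

# A bicluster is modeled here as (row indices, column indices) of the data matrix.
# This representation is an assumption for illustration only.
Bicluster = Tuple[FrozenSet[int], FrozenSet[int]]


def area(b: Bicluster) -> int:
    """Number of matrix cells covered by the bicluster."""
    rows, cols = b
    return len(rows) * len(cols)


def shared_area(a: Bicluster, b: Bicluster) -> int:
    """Number of matrix cells covered by both biclusters."""
    return len(a[0] & b[0]) * len(a[1] & b[1])


def prune_overlapping(biclusters: List[Bicluster],
                      max_overlap: float = 0.8) -> List[Bicluster]:
    """Keep larger biclusters first; drop any bicluster whose cells are
    covered by an already-kept one at a ratio of max_overlap or more.

    The ratio and the default threshold are hypothetical choices.
    """
    kept: List[Bicluster] = []
    # Visit candidates from largest to smallest so the larger one survives.
    for b in sorted(biclusters, key=area, reverse=True):
        if area(b) == 0:
            continue  # ignore empty biclusters
        if all(shared_area(b, k) / area(b) < max_overlap for k in kept):
            kept.append(b)
    return kept


if __name__ == "__main__":
    a = (frozenset({0, 1, 2, 3}), frozenset({0, 1, 2}))
    b = (frozenset({1, 2, 3}), frozenset({0, 1, 2}))  # lies entirely inside a
    c = (frozenset({7, 8}), frozenset({4, 5}))        # disjoint from a and b
    print(prune_overlapping([a, b, c]))
```

In this toy input, `b` is fully contained in `a` and is dropped, while the disjoint bicluster `c` is kept. Lowering `max_overlap` prunes more aggressively.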