Abstract: The maximal clique, the most cohesive structure in a graph, has a broad range of applications, e.g., community detection, bioinformatics, anomaly detection, and graph visualization. However, the sheer number of maximal cliques makes it challenging to examine them all. In addition, the omnipresent overlaps between cliques imply that it may not be necessary to process every maximal clique, since many vertices are shared by multiple cliques. For example, in commercial advertising, a small group of individuals who participate in different communities can help spread an advertisement across all of those communities. Inspired by this observation, we study the problem of finding a τ-cover, which is a subset of vertices in a graph that overlaps with each maximal clique by no less than τ, where τ is a threshold reflecting the user's requirement. We prove that finding a minimum τ-cover is NP-hard and that the problem is non-submodular. Consequently, to find a small τ-cover on a best-effort basis, we propose three methods: MCCb, MCC, and EMCC. MCCb is a baseline that adds vertices to the cover during clique enumeration until the coverage requirement is satisfied. MCC decides more cautiously whether to add a vertex by evaluating the increment of a coverage lower bound in O(1) time. EMCC is a randomized algorithm built on an elegant adaptive sampling scheme, which makes the cover more concise by relaxing the coverage requirement in a statistical manner. Extensive experiments show that MCC is 1.3× to 2.5× faster and produces a cover half the size of MCCb's, and that EMCC is 2× to 5× faster and on average produces a cover one order of magnitude smaller than MCCb's.
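To make the τ-cover definition concrete, here is a minimal Python sketch. It rests on assumptions not stated in the abstract: τ is read as a fraction of each clique's size, i.e. a set S is a τ-cover when |S ∩ C| ≥ τ·|C| for every maximal clique C, and the graph is given as a dictionary of neighbor sets. The names `maximal_cliques`, `is_tau_cover`, and `naive_cover` are hypothetical. `naive_cover` only mirrors the general idea of adding vertices during enumeration; it is not the authors' MCCb, MCC, or EMCC.

```python
def maximal_cliques(adj):
    """Yield each maximal clique of an undirected graph as a frozenset.

    adj maps every vertex to the set of its neighbors.
    Uses the classic Bron-Kerbosch recursion with pivoting.
    """
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


def is_tau_cover(adj, cover, tau):
    """Check the assumed condition |cover ∩ C| >= tau * |C| for every maximal clique C."""
    cover = set(cover)
    return all(len(c & cover) >= tau * len(c) for c in maximal_cliques(adj))


def naive_cover(adj, tau):
    """Illustrative greedy: top up the cover clique by clique during enumeration."""
    cover = set()
    for c in maximal_cliques(adj):
        for v in sorted(c - cover):
            if len(c & cover) >= tau * len(c):
                break  # this clique is already covered enough
            cover.add(v)
    return cover


# Two triangles that share vertex 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(is_tau_cover(adj, {3}, 0.3))   # the shared vertex touches both cliques
print(is_tau_cover(adj, {1}, 0.3))   # vertex 1 misses the clique {3, 4, 5}
print(naive_cover(adj, 0.3))
```

On this toy graph the shared vertex alone satisfies the assumed coverage condition, which is the overlap effect the abstract describes. Because the greedy sketch never removes vertices, every clique it has already processed stays covered, but the result depends on enumeration order and can be far from minimum.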