Out-of-Core Coherent Closed Quasi-Clique Mining from Large Dense Graph Databases

Zhiping Zeng, Jianyong Wang, Lizhu Zhou, and George Karypis
ACM Transactions on Database Systems, Vol 32, Issue 2, 2007
Because graphs can represent more generic and more complicated relationships among objects, graph mining has played a significant role in data mining and has attracted increasing attention in the data mining community. In addition, frequent coherent subgraphs can provide valuable knowledge about the underlying internal structure of a graph database, and mining frequently occurring coherent subgraphs from large dense graph databases has found several applications and received considerable attention in the graph mining community recently. In this article, we study how to efficiently mine the complete set of coherent closed quasi-cliques from large dense graph databases, an especially challenging task because the downward-closure property no longer holds. By fully exploring some properties of quasi-cliques, we propose several novel optimization techniques that effectively prune unpromising and redundant subsearch spaces. Meanwhile, we devise an efficient closure checking scheme to facilitate the discovery of closed quasi-cliques only. Since large databases cannot be held in main memory, we also design an out-of-core solution with efficient index structures for mining coherent closed quasi-cliques from large dense graph databases. We call this Cocain*. A thorough performance study shows that Cocain* is very efficient and scalable for large dense graph databases.
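To make the remark about downward closure concrete, here is a minimal Python sketch. It is not the Cocain* algorithm. It assumes the common degree-based definition, under which a vertex set S is a γ-quasi-clique if every vertex of S has at least ⌈γ(|S| − 1)⌉ neighbours inside S. The function name `is_quasi_clique` and the 5-cycle example are illustrative choices, not taken from the paper.

```python
from math import ceil


def is_quasi_clique(adj, vertices, gamma):
    """Check the (assumed) degree-based gamma-quasi-clique condition.

    adj maps each vertex to the set of its neighbours in an undirected graph.
    """
    vs = set(vertices)
    if not vs:
        return False
    # Minimum number of neighbours each vertex needs inside the set.
    need = ceil(gamma * (len(vs) - 1))
    return all(len(adj[v] & vs) >= need for v in vs)


# Hypothetical example graph: a cycle on 5 vertices (0-1-2-3-4-0).
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

# All 5 vertices: each has 2 neighbours and needs ceil(0.5 * 4) = 2.
whole = is_quasi_clique(cycle5, range(5), 0.5)

# Dropping vertex 4 leaves the path 0-1-2-3: the endpoints have only
# 1 neighbour but need ceil(0.5 * 3) = 2, so the subset fails.
subset = is_quasi_clique(cycle5, [0, 1, 2, 3], 0.5)
```

In this example the full vertex set satisfies the condition at γ = 0.5 while one of its subsets does not, so a miner cannot discard a candidate simply because one of its subsets fails, as Apriori-style pruning would.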
Research topics: Data mining | Graph mining