Finding Frequent Patterns in a Large Sparse Graph

Michihiro Kuramochi and George Karypis
SIAM International Conference on Data Mining, 2004
This paper presents two algorithms, based on the horizontal and vertical pattern discovery paradigms, that find the connected subgraphs with a sufficient number of edge-disjoint embeddings in a single large, undirected, labeled, sparse graph. The algorithms use three different methods, based on approximate and exact maximum independent set computations, to determine the number of edge-disjoint embeddings of a subgraph, and they use this number to prune infrequent subgraphs. Experimental evaluation on real datasets from various domains shows that both algorithms achieve good performance, scale well to sparse input graphs with more than 100,000 vertices and around 200,000 edges, and significantly outperform previously developed algorithms.
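To illustrate how counting edge-disjoint embeddings relates to a maximum independent set (MIS), the following is a minimal sketch and not the paper's implementation. It assumes each embedding is given as a set of edges, builds an overlap graph in which two embeddings are adjacent when they share an edge, and then computes an approximate (greedy) and an exact MIS. The function names `overlap_graph`, `greedy_mis`, and `exact_mis_size`, and the toy path graph, are hypothetical choices made for this example.

```python
from itertools import combinations


def overlap_graph(embeddings):
    """Build the overlap graph: one vertex per embedding, and an edge
    between two embeddings whenever they share at least one graph edge."""
    adj = {i: set() for i in range(len(embeddings))}
    for i, j in combinations(range(len(embeddings)), 2):
        if embeddings[i] & embeddings[j]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_mis(adj):
    """Approximate MIS: repeatedly pick a minimum-degree vertex, then
    discard it and its neighbors. The result is always an independent
    set, so its size is a lower bound on the true MIS size."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    chosen = []
    while remaining:
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        chosen.append(v)
        removed = remaining[v] | {v}
        for u in removed:
            remaining.pop(u, None)
        for u in remaining:
            remaining[u] -= removed
    return chosen


def exact_mis_size(adj):
    """Exact MIS size by branching on one vertex at a time.
    Exponential in the worst case; suitable only for small inputs."""
    def solve(vertices):
        if not vertices:
            return 0
        v = next(iter(vertices))
        without_v = solve(vertices - {v})
        with_v = 1 + solve(vertices - {v} - adj[v])
        return max(without_v, with_v)

    return solve(frozenset(adj))


# Toy input: the path a-b-c-d-e and the three embeddings of a
# two-edge path in it, each represented by its set of edges.
embeddings = [
    frozenset({("a", "b"), ("b", "c")}),
    frozenset({("b", "c"), ("c", "d")}),
    frozenset({("c", "d"), ("d", "e")}),
]
adj = overlap_graph(embeddings)
print(len(greedy_mis(adj)), exact_mis_size(adj))
```

Under these assumptions, a subgraph would be kept only if the computed count meets a chosen frequency threshold. Because the greedy result never exceeds the exact one, using it alone may prune a subgraph that the exact computation would retain.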
Research topics: Data mining | Graph mining | Pattern discovery