A Universal Formulation of Sequential Patterns

Mahesh Joshi, George Karypis, and Vipin Kumar
UMN CS 99-021, 1999
Abstract
This report outlines a more general formulation of sequential patterns that unifies the generalized patterns proposed by Srikant and Agrawal [SA96] and the episode discovery approach taken by Mannila et al. [MTV97]. We show that, just by varying the values of the timing constraint parameters and the counting method, our formulation can be made identical to either of these. Furthermore, our approach defines several other counting methods that could be suitable for various applications. The algorithm used to discover these universal sequential patterns is a modification of the GSP algorithm proposed in [SA96]. Some of the modifications handle the newly introduced timing constraints and pattern restrictions, whereas others are made for performance reasons. Finally, we present an application that illustrates the deficiencies of current approaches that the proposed universal formulation can overcome.
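As a rough illustration of how the choice of timing constraints and counting method changes the support of one and the same pattern, the following Python sketch counts a pattern of single items in two ways: once per object (in the spirit of [SA96]) and once per sliding time window (in the spirit of [MTV97]). This is not the report's algorithm. The function names, the `max_gap` and `max_span` parameters, the integer timestamps, and the window definition are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Event = Tuple[int, str]  # (integer timestamp, item); hypothetical representation


def occurs(events: List[Event], pattern: List[str],
           max_gap: Optional[int] = None,
           max_span: Optional[int] = None) -> bool:
    """True if `pattern` occurs in time-sorted `events` under the constraints.

    max_gap  (assumed name): largest allowed time between consecutive matches.
    max_span (assumed name): largest allowed time between first and last match.
    """
    def search(idx: int, prev: Optional[int], first: Optional[int], k: int) -> bool:
        if k == len(pattern):
            return True
        for j in range(idx, len(events)):
            t, item = events[j]
            if prev is not None:
                if t <= prev:
                    continue  # the next element must be strictly later
                if max_gap is not None and t - prev > max_gap:
                    break  # events are sorted, so later ones are too far as well
            if first is not None and max_span is not None and t - first > max_span:
                break
            if item == pattern[k]:
                # Backtracking: try this match, fall through to later ones if it fails.
                if search(j + 1, t, t if first is None else first, k + 1):
                    return True
        return False

    return search(0, None, None, 0)


def count_objects(db: Dict[str, List[Event]], pattern: List[str], **constraints) -> int:
    """Per-object counting: each object contributes at most 1."""
    return sum(occurs(ev, pattern, **constraints) for ev in db.values())


def count_windows(events: List[Event], pattern: List[str], width: int) -> int:
    """Window counting: number of windows [t, t + width) that contain the pattern.

    Windows start at every integer t between the first and last timestamp,
    a simplification chosen for this sketch.
    """
    if not events:
        return 0
    total = 0
    for t in range(events[0][0], events[-1][0] + 1):
        window = [e for e in events if t <= e[0] < t + width]
        if occurs(window, pattern):
            total += 1
    return total


# Toy data, invented for illustration only.
db = {
    "a": [(1, "A"), (2, "B"), (5, "A"), (6, "B")],
    "b": [(1, "A"), (9, "B")],
    "c": [(1, "B"), (2, "A")],
}
pattern = ["A", "B"]

unconstrained = count_objects(db, pattern)            # objects "a" and "b"
gap_limited = count_objects(db, pattern, max_gap=3)   # only object "a"
windows_in_a = count_windows(db["a"], pattern, width=2)
print(unconstrained, gap_limited, windows_in_a)
```

On this toy data, tightening `max_gap` drops object "b" from the per-object count, while window counting on object "a" alone registers the pattern more than once. The same pattern therefore receives a different support depending on which parameters and counting method are selected.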
Research topics: Data mining | Pattern discovery