The Set Classification Problem and Solution Methods

Xia Ning and George Karypis
SIAM Data Mining, pp. 847-858, 2009
Abstract
This paper develops classification algorithms for problems in which the class must be predicted from multiple observations (examples) of the same phenomenon (class). These problems give rise to a new classification problem, referred to as set classification, that requires predicting the class of a set of instances given the prior knowledge that all the instances in the set belong to the same unknown class. This problem falls under the general class of problems whose instances have class label dependencies. Four methods for solving the set classification problem are developed and studied. The first is a straightforward extension of the traditional classification paradigm, whereas the other three are designed to explicitly take into account the known dependencies among the instances of the unlabeled set during learning or classification. A comprehensive experimental evaluation of the various methods and their underlying parameters shows that some of them lead to significant gains in performance.
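To make the problem setting concrete, the following is a minimal sketch of one way a traditional per-instance classifier could be extended to label a whole set. It is not the paper's method: the abstract does not specify how any of the four methods work, so the nearest-centroid classifier, the majority-vote aggregation, and all names here (fit_centroids, predict_instance, predict_set) are assumptions chosen purely for illustration.

```python
from collections import Counter
from math import dist


def fit_centroids(X, y):
    """Compute one mean vector per class from labeled training instances."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_instance(centroids, x):
    """Traditional per-instance prediction: pick the nearest class centroid."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


def predict_set(centroids, instances):
    """Assumed aggregation: all instances share one unknown class, so classify
    each one independently and return the majority label for the whole set."""
    votes = Counter(predict_instance(centroids, x) for x in instances)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two classes in a 2-D feature space.
X_train = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
y_train = ["a", "a", "b", "b"]
centroids = fit_centroids(X_train, y_train)

# An unlabeled set whose instances are known to belong to the same class.
unlabeled_set = [[0.1, 0.0], [0.3, 0.2], [0.7, 0.6]]
set_label = predict_set(centroids, unlabeled_set)
```

In this toy example the third instance is closer to the other class when classified alone, but the set-level vote overrides it. That is the kind of dependency among instances that the set classification setting is meant to exploit.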
Research topics: Classification | Data mining