Topic-Driven Clustering for Document Datasets
Ying Zhao and George Karypis
SIAM International Conference on Data Mining, pp. 358-369, 2005
Abstract: In this paper, we define the problem of topic-driven clustering, which organizes a document collection according to a given set of topics. We propose three topic-driven schemes that simultaneously consider the similarity between documents and topics and the relationships among the documents themselves. We present a comprehensive experimental evaluation of the proposed schemes on five datasets. Our experimental results show that the schemes are efficient and effective with topic prototypes of different levels of specificity.
Research topics: Classification | Clustering | CLUTO | Data mining | Information retrieval | Text mining