Automated Approaches for Classifying Structures

Mukund Deshpande, Michihiro Kuramochi, and George Karypis
SIGKDD Workshop on Bioinformatics, BIOKDD, 2002
In this paper we study the problem of classifying the compounds in chemical datasets. We present an algorithm that first mines the dataset to discover discriminating substructures, which are then used as features to build a classifier. The advantage of this technique is that it requires very little domain knowledge and can easily handle large chemical datasets. We evaluated the classifier on two widely available chemical compound datasets and found that it gives good results.
Research topics: Bioinformatics | Cheminformatics | Classification | Data mining
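
The two-stage pipeline described in the abstract (mine discriminating substructures, then use them as features for a classifier) can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: each compound is already reduced to a set of hypothetical fragment labels (a real system would mine subgraphs from molecular graphs), "discriminating" is taken to mean a large gap in per-class support, and a nearest-centroid rule stands in for the classifier, which the abstract does not name. All function names, thresholds, and data below are illustrative.

```python
from collections import Counter


def mine_discriminating(compounds, labels, min_support=0.6, min_gap=0.6):
    """Select fragments that are frequent in some class and whose
    support differs between classes by at least min_gap.
    (Assumed selection criterion, chosen for illustration.)"""
    classes = sorted(set(labels))
    support = {}
    for c in classes:
        members = [s for s, y in zip(compounds, labels) if y == c]
        # Each compound is a set, so a fragment counts once per compound.
        counts = Counter(f for s in members for f in s)
        support[c] = {f: n / len(members) for f, n in counts.items()}
    selected = []
    for f in sorted(set().union(*compounds)):
        sups = [support[c].get(f, 0.0) for c in classes]
        if max(sups) >= min_support and max(sups) - min(sups) >= min_gap:
            selected.append(f)
    return selected


def featurize(compound, features):
    """Binary vector: 1 if the compound contains the fragment, else 0."""
    return [1 if f in compound else 0 for f in features]


def train_centroids(vectors, labels):
    """Mean feature vector per class (stand-in for a real classifier)."""
    centroids = {}
    for c in set(labels):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(vector, centroids):
    """Return the class whose centroid is closest in squared distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(vector, centroids[c]))
    return min(sorted(centroids), key=dist)


# Toy data: fragment labels and class names are made up.
compounds = [
    {"C-N", "N=O", "C-C"},
    {"N=O", "C-Cl", "C-C"},
    {"C-O", "C-C"},
    {"C-O", "C-H", "C-C"},
]
labels = ["toxic", "toxic", "safe", "safe"]

features = mine_discriminating(compounds, labels)
centroids = train_centroids([featurize(s, features) for s in compounds], labels)
print(features)
print(predict(featurize({"N=O", "C-H"}, features), centroids))
```

In this toy run the fragment "C-C" occurs in every compound and is therefore dropped as non-discriminating, while rare fragments fall below the support threshold. Only the fragments that separate the two classes survive as features.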