Prediction of Contact Maps Using Support Vector Machines

Ying Zhao and George Karypis
3rd IEEE International Conference on Bioinformatics and Bioengineering (BIBE), pp. 26 - 33, 2003
Abstract
Contact map prediction is of great interest for its application in fold recognition and protein 3D structure determination. In this paper we present a contact-map prediction algorithm that employs Support Vector Machines as the machine learning tool and incorporates various features such as sequence profiles and their conservation, correlated mutation analysis (CMA) based on various amino acid physicochemical properties, and secondary structure. In addition, we evaluated the effectiveness of different features on contact map prediction for different fold classes. On average, our predictor achieved a prediction accuracy of 0.2238, an improvement over a random predictor by a factor of 11.7, which is better than the results reported in previous studies. Our study showed that predicted secondary structure features play an important role for proteins containing beta structures. Models based on secondary structure features and models based on CMA features produce different sets of predictions. Our study also suggests that models learned separately for different protein fold families may achieve better performance than a unified model.
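The abstract reports an accuracy and an improvement factor over a random predictor without defining either. The sketch below illustrates one common way such figures are computed. It is not taken from the paper. It assumes that accuracy is the fraction of predicted contacts that are true contacts, and that a random predictor's expected accuracy equals the contact density (true contacts divided by all candidate residue pairs). The 8 Å distance threshold, the `min_separation` parameter, and all function names are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import dist


def contact_map(coords, threshold=8.0, min_separation=1):
    """Return the set of residue pairs (i, j), i < j, closer than `threshold`.

    `coords` is a list of (x, y, z) positions, one per residue.
    The 8.0 threshold and `min_separation` are illustrative assumptions.
    """
    contacts = set()
    for i, j in combinations(range(len(coords)), 2):
        if j - i >= min_separation and dist(coords[i], coords[j]) < threshold:
            contacts.add((i, j))
    return contacts


def accuracy(predicted, true_contacts):
    """Fraction of predicted pairs that are true contacts (assumed definition)."""
    if not predicted:
        return 0.0
    return len(predicted & true_contacts) / len(predicted)


def improvement_over_random(predicted, true_contacts, n_residues, min_separation=1):
    """Accuracy divided by the expected accuracy of a random predictor.

    The random baseline is taken to be the contact density: the number of
    true contacts divided by the number of candidate residue pairs.
    """
    n_pairs = sum(
        1 for i, j in combinations(range(n_residues), 2) if j - i >= min_separation
    )
    random_accuracy = len(true_contacts) / n_pairs
    return accuracy(predicted, true_contacts) / random_accuracy


# Toy example: four residues on a line, 5 units apart.
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
toy_true = contact_map(toy_coords)
toy_predicted = {(0, 1), (1, 2)}
toy_factor = improvement_over_random(toy_predicted, toy_true, len(toy_coords))
```

In the toy example, only adjacent residues are in contact, so the contact density is 3 of 6 pairs. Both predicted pairs are correct, giving an improvement factor of 2 over random. Under the same assumed definitions, the reported accuracy of 0.2238 and factor of 11.7 would correspond to a random baseline of roughly 0.019.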
Research topics: Bioinformatics | Classification | Data mining | Protein structure prediction