fRMSDPred: Predicting local RMSD between structural fragments using sequence information

Huzefa Rangwala and George Karypis
Computational Systems Biology (CSB), pp. 311-322, 2007
The effectiveness of comparative modeling approaches for protein structure prediction can
be substantially improved by incorporating predicted structural information in the initial
sequence-structure alignment. Motivated by the approaches used to align protein
structures, this paper focuses on developing machine learning approaches for estimating
the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values
can be used to construct the alignment, assess the quality of an alignment, and identify
high-quality alignment segments.
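
For readers unfamiliar with the quantity being predicted, the following is a minimal sketch of how the RMSD of a pair of equal-length fragments is commonly computed from known coordinates, assuming the usual convention of measuring it after optimal rigid-body superposition (the Kabsch algorithm). The function name `fragment_rmsd` and the use of NumPy are illustrative assumptions and are not taken from the paper, which predicts this value from sequence information rather than computing it from structure.

```python
import numpy as np


def fragment_rmsd(P, Q):
    """RMSD between two equal-length fragments after optimal superposition.

    P and Q are (n, 3) arrays of atom coordinates (for example C-alpha atoms),
    with row i of P paired with row i of Q. Hypothetical helper for illustration.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.ndim != 2 or P.shape[1] != 3 or P.shape != Q.shape:
        raise ValueError("P and Q must both have shape (n, 3)")

    # Remove translation by centering both fragments on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Kabsch algorithm: SVD of the 3x3 covariance matrix.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the residual deviation.
    diff = (R @ Pc.T).T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Under this convention, two fragments that differ only by a rotation and translation have an RMSD of zero, and larger values indicate greater local structural dissimilarity.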

We present algorithms to solve this fragment-level RMSD prediction problem using a
supervised learning framework based on support vector regression and classification that
incorporates protein profiles, predicted secondary structure, effective information encoding
schemes, and novel second-order pairwise exponential kernel functions. Our comprehensive
empirical study shows that these methods outperform profile-to-profile scoring schemes.

Research topics: Bioinformatics | Classification | Data mining | Protein Structure