Improved estimation of structure predictor quality

Kevin W. Deronne and George Karypis
BMC Structural Biology, 9:41, 2009
Download Paper
Abstract
Background: Methods that can automatically assess the quality of computationally predicted protein structures are important, as they enable the selection of the most accurate structure from an ensemble of predictions. Assessment methods that determine the quality of a predicted structure by comparing it against the various structures predicted by different servers have been shown to outperform approaches that rely on the intrinsic characteristics of the structure itself.

Results: We examined techniques to estimate the quality of a predicted protein structure based on prediction consensus. LGA is used to align the structure in question to the structures for the same protein predicted by different servers. We examine both static (e.g. averaging) and dynamic (e.g. support vector machine) methods for aggregating these distances on two datasets.

Conclusion: We find that a constrained regression approach shows consistently good performance. Although it is not always the absolute best performing scheme, it is always performs on par with the best schemes across multiple datasets. The work presented here provides the basis for the construction of a regression model trained on data from existing structure prediction servers.

Research topics: Bioinformatics