Abstract:
Missense mutations may cause structural alterations in a protein and lead to a loss or gain of
activity and stability. Studies of missense mutations in proteins are important for
understanding protein structure-function relationships, analyzing the function of gene
variations, and designing new proteins. In this dissertation, we have developed
computational mutagenesis models to predict the changes of stability and activity of
protein mutants using the four-body statistical potential derived from Delaunay
tessellations of protein structures. First, our results show that a strong correlation exists
between the mean residual scores of mutants and the change of mutant stability in 18
proteins extracted from the ProTherm database. Second, we developed robust and accurate
machine-learning models based on the residual score profiles of protein mutants to
predict the sign of mutant stability change. Third, we have demonstrated a correlation
between changes in the four-body statistical potential and activity alteration in human p53
and rabbit sarcoendoplasmic reticulum calcium-ATPase (SERCA1) mutants. Supervised
machine-learning models based on the residual score profiles of protein
mutants were also developed to predict the activity changes in p53 and SERCA1 mutants.
Fourth, a highly significant correlation was observed between changes in the four-body
statistical potential and the conservation of amino-acid substitutions. Finally, a novel statistical
matrix based on the mean residual scores of all 380 types of mutations in 700 proteins
was developed, and a statistically significant correlation was revealed between this
matrix and PAM/BLOSUM matrices. Overall, these results support our hypothesis
that computational mutagenesis models using four-body statistical potential present a
powerful approach for predicting the changes of activity and stability in protein mutants.
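
As a rough illustration of how a substitution matrix of mean residual scores might be assembled, the sketch below enumerates the 380 substitution types (20 wild-type residues times 19 possible replacements) and averages per-mutant scores within each type. The function name, the input layout, and the toy scores are assumptions made for illustration only; the abstract does not specify how residual scores are computed, and this is not the dissertation's implementation.

```python
from collections import defaultdict
from itertools import permutations

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AMINO_ACID_SET = set(AMINO_ACIDS)

# All ordered (wild-type, mutant) pairs with distinct residues: 20 * 19 = 380.
SUBSTITUTION_TYPES = list(permutations(AMINO_ACIDS, 2))


def mean_score_matrix(observations):
    """Average residual scores per substitution type.

    `observations` is a hypothetical input: an iterable of
    (wild_type, mutant, residual_score) tuples, one per single-point mutant.
    Returns a dict mapping (wild_type, mutant) to the mean score; substitution
    types with no observations are simply absent.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for wild_type, mutant, score in observations:
        if wild_type not in AMINO_ACID_SET or mutant not in AMINO_ACID_SET:
            raise ValueError(f"unknown residue in {wild_type}->{mutant}")
        if wild_type == mutant:
            raise ValueError("a missense mutation must change the residue")
        sums[(wild_type, mutant)] += score
        counts[(wild_type, mutant)] += 1
    return {pair: sums[pair] / counts[pair] for pair in counts}


# Toy scores, invented purely to exercise the function.
toy_observations = [("A", "V", -1.0), ("A", "V", -3.0), ("G", "P", -4.5)]
toy_matrix = mean_score_matrix(toy_observations)
```

The matrix is keyed by ordered pairs because a substitution and its reverse (for example A to V versus V to A) are distinct mutation types and need not share a mean score.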