Biophysical Inference of Epistasis and the Effects of Mutations on Protein Stability and Function.


Understanding the relationship between protein sequence, function, and stability is a fundamental problem in biology. The essential function of many proteins that fold into a specific structure is their ability to bind to a ligand, which can be assayed for thousands of mutated variants. However, binding assays do not distinguish whether mutations affect the stability of the binding interface or the overall fold. Here, we introduce a statistical method to infer a detailed energy landscape of how a protein folds and binds to a ligand by combining information from many mutated variants. We fit a thermodynamic model describing the bound, unbound, and unfolded states to high-quality data of protein G domain B1 binding to IgG-Fc. We infer distinct folding and binding energies for each mutation, providing a detailed view of how mutations affect binding and stability across the protein. We accurately infer the folding energy of each variant in physical units, validated by independent data, whereas previous high-throughput methods could only measure indirect changes in stability. While we assume an additive sequence-energy relationship, the binding fraction is epistatic due to its nonlinear relation to energy. Despite having no epistasis in energy, our model explains much of the observed epistasis in binding fraction, with the remaining epistasis identifying conformationally dynamic regions.
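
The point that additive energies can still yield epistasis in the binding fraction can be illustrated with a minimal sketch. It assumes a three-state Boltzmann model (bound, folded but unbound, unfolded) with energies in units of kT; the functional form, the sign conventions, the function names, and every numerical value below are illustrative assumptions, not the fitted model or parameters from the study.

```python
import math


def bound_fraction(e_fold, e_bind):
    """Equilibrium fraction of protein in the bound state.

    Assumed three-state model, energies in units of kT:
      bound state            weight 1
      folded, unbound state  weight exp(e_bind)
      unfolded state         weight exp(e_bind + e_fold)
    More negative e_fold means a more stable fold; more negative e_bind
    means tighter binding. These conventions are assumptions.
    """
    return 1.0 / (1.0 + math.exp(e_bind) * (1.0 + math.exp(e_fold)))


def log_epistasis(wt, dd_a, dd_b):
    """Pairwise epistasis in log binding fraction under additive energies.

    wt, dd_a, dd_b are (folding, binding) tuples: the wild-type energies
    and the energy changes caused by hypothetical mutations a and b.
    """
    def log_p(d_fold, d_bind):
        return math.log(bound_fraction(wt[0] + d_fold, wt[1] + d_bind))

    single_a = log_p(dd_a[0], dd_a[1])
    single_b = log_p(dd_b[0], dd_b[1])
    # Energies add exactly in the double mutant (no epistasis in energy).
    double = log_p(dd_a[0] + dd_b[0], dd_a[1] + dd_b[1])
    return double - single_a - single_b + log_p(0.0, 0.0)


# Hypothetical values: a stable wild type and two destabilizing mutations.
wild_type = (-3.0, -2.0)
mutation_a = (2.0, 0.0)
mutation_b = (2.0, 0.0)

if __name__ == "__main__":
    print("wild-type bound fraction:", bound_fraction(*wild_type))
    print("log epistasis:", log_epistasis(wild_type, mutation_a, mutation_b))
```

With these made-up numbers each single mutation costs little binding because the fold has stability to spare, while the double mutant loses more than the two single effects combined, so the epistasis in log binding fraction is negative even though the energies are strictly additive.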

Submission Details

ID: QbvNjup83

Submitter: Jakub Otwinowski

Submission Date: Jan. 10, 2019, 5:28 a.m.

Version: 1

Publication Details
Otwinowski J, Mol Biol Evol (2018) Biophysical Inference of Epistasis and the Effects of Mutations on Protein Stability and Function. PMID: 30085303

Relevant UniProtKB Entries

Percent Identity  Matching Chains  Protein                             Accession  Entry Name
100.0                              Immunoglobulin G-binding protein G  P06654     SPG1_STRSG
100.0                              Immunoglobulin G-binding protein G  P19909     SPG2_STRSG