Quantifying evolutionary importance of protein sites: A Tale of two measures

Authors: Avital Sharir-Ivry; Yu Xia
Affiliation: Department of Bioengineering, McGill University, Montreal, Quebec, Canada
Published in: Quantifying evolutionary importance of protein sites: A Tale of two measures. PLoS Genet 17(4): e1009476. doi:10.1371/journal.pgen.1009476
Category: Research Article
doi: 10.1371/journal.pgen.1009476


A key challenge in evolutionary biology is the accurate quantification of selective pressure on proteins and other biological macromolecules at single-site resolution. The evolutionary importance of a protein site under purifying selection is typically measured by the degree of conservation of the site itself. A possible alternative measure is the strength of the conservation gradient that the site induces in the rest of the protein structure. However, the quantitative relationship between these two measures remains unknown. Here, we show that despite major differences, there is a strong linear relationship between the two measures, such that more conserved protein sites also induce stronger conservation gradients in the rest of the protein. This linear relationship is universal, as it holds for different types of proteins and for different functional sites in proteins. Our results show that the strong selective pressure acting on a functional site generally percolates through the rest of the protein via residue-residue contacts. Surprisingly, however, catalytic sites in enzymes are the principal exception to this rule. Catalytic sites induce significantly stronger conservation gradients in the rest of the protein than expected from the degree of conservation of the site alone. The unique requirement for the active site to selectively stabilize the transition state of the catalyzed chemical reaction imposes additional selective constraints on the rest of the enzyme.
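The two measures described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the site-induced gradient is estimated as the least-squares slope of per-residue evolutionary rate against spatial distance from the focal site; the abstract does not specify the estimator. The function names `least_squares_slope` and `site_measures`, the coordinates, and the rate values are all hypothetical.

```python
import math


def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def site_measures(coords, rates, focal):
    """Return the two measures for one focal site.

    Measure 1: evolutionary rate of the focal site itself
               (a lower rate means a more conserved site).
    Measure 2: slope of rate versus distance to the focal site,
               computed over all other residues (assumed estimator).
    """
    dists, other_rates = [], []
    for i, (xyz, rate) in enumerate(zip(coords, rates)):
        if i == focal:
            continue
        dists.append(math.dist(xyz, coords[focal]))
        other_rates.append(rate)
    return rates[focal], least_squares_slope(dists, other_rates)


# Hypothetical toy protein: five residues on a line, 3 angstroms apart,
# with evolutionary rate rising steadily away from residue 0.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0),
          (9.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
rates = [0.1, 0.4, 0.7, 1.0, 1.3]

site_rate, gradient = site_measures(coords, rates, focal=0)
```

In this toy case the focal site is highly conserved (low rate) and the positive slope means that residues become less conserved with distance from it. Repeating the calculation for many sites and regressing the second measure on the first would be one way to probe the linear relationship the abstract reports.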

Keywords:

Enzymes – Evolutionary rate – Fungal evolution – Molecular evolution – Protein-protein interactions – Saccharomyces cerevisiae – Schizosaccharomyces pombe – Yeast


Article published in: PLOS Genetics, 2021, Issue 4