Genetic codes optimized as a traveling salesman problem

Authors: Oliver Attie aff001; Brian Sulkow aff001; Chong Di aff001; Weigang Qiu aff001
Author affiliations: Department of Biological Sciences, Hunter College, City University of New York, New York, United States of America aff001; Graduate Center, City University of New York, New York, United States of America aff002; Department of Physiology and Biophysics & Institute for Computational Biomedicine, Weill Cornell Medical College, New York, New York, United States of America aff003
Published in: PLoS ONE 14(10)
Category: Research Article


The Standard Genetic Code (SGC) is robust to mutational errors: the most frequent mutations alter the physicochemical properties of amino acids only minimally. The apparent correlation between the evolutionary distances among codons and the physicochemical distances among their cognate amino acids suggests that codons and amino acids co-diversified early. Here we formulated the co-minimization of evolutionary distances between codons and physicochemical distances between amino acids as a Traveling Salesman Problem (TSP) and solved it with a Hopfield neural network. In this unsupervised learning algorithm, macromolecules that associate codons with amino acids (e.g., tRNAs and aminoacyl-tRNA synthetases) were considered biological analogs of Hopfield neurons, which associate “tour cities” with “tour positions”. The Hopfield network efficiently yielded an abundance of genetic codes that were more error-minimizing than the SGC and could thus be used to design artificial genetic codes. We further argue that, as a self-optimization algorithm, the Hopfield neural network provides a model for the origin of the SGC and of other adaptive molecular systems through evolutionary learning.
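To make the TSP formulation concrete, the following is a minimal sketch of a Hopfield-Tank style network on a toy problem. It is not the authors' implementation: the function names, the penalty weights A, B, C, D, the gain u0, the step size, and the random 2-D "cities" are all assumptions chosen for illustration, and the greedy decoding step is an added convenience that forces a valid tour even when the network has not fully converged.

```python
import numpy as np

def hopfield_tank_tsp(dist, steps=3000, dt=1e-5, A=500.0, B=500.0,
                      C=200.0, D=300.0, u0=0.02, seed=0):
    """Run continuous Hopfield-Tank dynamics; V[x, i] ~ 'city x is at position i'.

    All parameter values are illustrative assumptions, not tuned settings.
    """
    n = dist.shape[0]
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, 0.001, size=(n, n))  # small random initial inputs
    for _ in range(steps):
        V = 0.5 * (1.0 + np.tanh(u / u0))            # neuron outputs in (0, 1)
        row = V.sum(axis=1, keepdims=True) - V        # same city, other positions
        col = V.sum(axis=0, keepdims=True) - V        # same position, other cities
        # Activity of every city at the next and previous tour positions (cyclic).
        neigh = np.roll(V, -1, axis=1) + np.roll(V, 1, axis=1)
        du = (-u
              - A * row                 # penalize one city in several positions
              - B * col                 # penalize several cities in one position
              - C * (V.sum() - n)       # push total activity toward n
              - D * (dist @ neigh))     # penalize long hops between neighbors
        u += dt * du                    # explicit Euler step
    return 0.5 * (1.0 + np.tanh(u / u0))

def decode_tour(V):
    """Greedily turn the activity matrix into a permutation (city per position)."""
    V = V.astype(float).copy()
    n = V.shape[0]
    tour = [-1] * n
    for _ in range(n):
        x, i = np.unravel_index(np.argmax(V), V.shape)
        tour[int(i)] = int(x)
        V[x, :] = -np.inf   # city x is now used
        V[:, i] = -np.inf   # position i is now filled
    return tour

def tour_length(dist, tour):
    """Length of the closed tour visiting cities in the given order."""
    n = len(tour)
    return float(sum(dist[tour[k], tour[(k + 1) % n]] for k in range(n)))

# Toy usage: six hypothetical "cities" at random 2-D coordinates.
rng = np.random.default_rng(1)
points = rng.random((6, 2))
dist = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1))
dist /= dist.max()                      # scale distances to [0, 1]
V = hopfield_tank_tsp(dist)
tour = decode_tour(V)
length = tour_length(dist, tour)
print(tour, length)
```

In the setting described above, the distance matrix would be built from codon and amino acid distances rather than random coordinates; how those two distances are combined is not specified here, so the sketch leaves the matrix as a plain input. Hopfield-Tank networks are sensitive to the penalty weights, so different settings may give noticeably different tours.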

Keywords:

Evolutionary genetics – Genetic networks – Learning – Natural selection – Neural networks – Neurons – Transfer RNA – Genetic code


This article appeared in issue 10 of 2019.