Tracking human population structure through time from whole genome sequences

Autoři: Ke Wang aff001;  Iain Mathieson aff002;  Jared O’Connell aff003;  Stephan Schiffels aff001
Působiště autorů: Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany aff001;  Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United State of America aff002;  Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America aff002;  23andMe Inc., Mountain View, California, United State of America aff003;  23andMe Inc., Mountain View, California, United States of America aff003
Vyšlo v časopise: Tracking human population structure through time from whole genome sequences. PLoS Genet 16(3): e32767. doi:10.1371/journal.pgen.1008552
Kategorie: Research Article
doi: 10.1371/journal.pgen.1008552


The genetic diversity of humans, like many species, has been shaped by a complex pattern of population separations followed by isolation and subsequent admixture. This pattern, reaching at least as far back as the appearance of our species in the paleontological record, has left its traces in our genomes. Reconstructing a population’s history from these traces is a challenging problem. Here we present a novel approach based on the Multiple Sequentially Markovian Coalescent (MSMC) to analyze the separation history between populations. Our approach, called MSMC-IM, uses an improved implementation of the MSMC (MSMC2) to estimate coalescence rates within and across pairs of populations, and then fits a continuous Isolation-Migration model to these rates to obtain a time-dependent estimate of gene flow. We show, using simulations, that our method can identify complex demographic scenarios involving post-split admixture or archaic introgression. We apply MSMC-IM to whole genome sequences from 15 worldwide populations, tracking the process of human genetic diversification. We detect traces of extremely deep ancestry between some African populations, with around 1% of ancestry dating to divergences older than a million years ago.

Klíčová slova:

DNA recombination – Gene flow – Genomic libraries – Haplotypes – Human genomics – Introgression – Population size – Simulation and modeling


1. McVean GAT, Cardin NJ. Approximating the coalescent with recombination. Philos Trans R Soc Lond B Biol Sci. 2005;360: 1387–1393. doi: 10.1098/rstb.2005.1673 16048782

2. Marjoram P, Wall JD. Fast “coalescent” simulation. BMC Genet. 2006;7: 16. doi: 10.1186/1471-2156-7-16 16539698

3. Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475: 493–496. doi: 10.1038/nature10231 21753753

4. Schiffels S, Durbin R. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 2014;46: 919–925. doi: 10.1038/ng.3015 24952747

5. Steinrücken M, Kamm JA, Song YS. Inference of complex population histories using whole-genome sequences from multiple populations. Cold Spring Harbor Labs Journals; 2015 Sep. Available:

6. Sheehan S, Harris K, Song YS. Estimating variable effective population sizes from multiple genomes: a sequentially markov conditional sampling distribution approach. 2013;194: 647–662. doi: 10.1534/genetics.112.149096 23608192

7. Terhorst J, Kamm JA, Song YS. Robust and scalable inference of population history from hundreds of unphased whole genomes. Nat Genet. 2017;49: 303–309. doi: 10.1038/ng.3748 28024154

8. Kamm JA, Terhorst J, Song YS. Efficient computation of the joint sample frequency spectra for multiple populations. J Comput Graph Stat. 2017;26: 182–194. doi: 10.1080/10618600.2016.1159212 28239248

9. Kamm J, Terhorst J, Durbin R, Song YS. Efficiently Inferring the Demographic History of Many Populations With Allele Count Data. J Am Stat Assoc. 2019; 1–16. doi: 10.1080/01621459.2019.1635482

10. Excoffier L, Foll M. fastsimcoal: a continuous-time coalescent simulator of genomic diversity under arbitrarily complex evolutionary scenarios. Bioinformatics. 2011;27: 1332–1334. doi: 10.1093/bioinformatics/btr124 21398675

11. Schiffels S, Haak W, Paajanen P, Llamas B, Popescu E, Loe L, et al. Iron Age and Anglo-Saxon genomes from East England reveal British migration history. Nat Commun. 2016;7: 10408. doi: 10.1038/ncomms10408 26783965

12. Mallick S, Li H, Lipson M, Mathieson I, Gymrek M, Racimo F, et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature. 2016;538: 201–206. doi: 10.1038/nature18964 27654912

13. Malaspinas A-S, Westaway MC, Muller C, Sousa VC, Lao O, Alves I, et al. A genomic history of Aboriginal Australia. Nature. 2016;538: 207–214. doi: 10.1038/nature18299 27654914

14. Prüfer K, Racimo F, Patterson N, Jay F, Sankararaman S, Sawyer S, et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature. 2014;505: 43–49. doi: 10.1038/nature12886 24352235

15. Plagnol V, Wall JD. Possible ancestral structure in human populations. PLoS Genet. 2006;2: e105. doi: 10.1371/journal.pgen.0020105 16895447

16. Durvasula A, Sankararaman S. Recovering signals of ghost archaic admixture in the genomes of present-day Africans. bioRxiv. 2018. p. 285734. doi: 10.1101/285734

17. Skoglund P, Thompson JC, Prendergast ME, Mittnik A, Sirak K, Hajdinjak M, et al. Reconstructing Prehistoric African Population Structure. Cell. 2017;171: 59–71.e21. doi: 10.1016/j.cell.2017.08.049 28938123

18. Sankararaman S, Mallick S, Dannemann M, Prüfer K, Kelso J, Pääbo S, et al. The genomic landscape of Neanderthal ancestry in present-day humans. Nature. 2014;507: 354–357. doi: 10.1038/nature12961 24476815

19. Sankararaman S, Mallick S, Patterson N, Reich D. The Combined Landscape of Denisovan and Neanderthal Ancestry in Present-Day Humans. Curr Biol. 2016;26: 1241–1247. doi: 10.1016/j.cub.2016.03.037 27032491

20. Browning SR, Browning BL, Zhou Y, Tucci S, Akey JM. Analysis of Human Sequence Data Reveals Two Pulses of Archaic Denisovan Admixture. Cell. 2018;173: 53–61.e9. doi: 10.1016/j.cell.2018.02.031 29551270

21. Pagani L, Lawson DJ, Jagoda E, Mörseburg A, Eriksson A, Mitt M, et al. Genomic analyses inform on migration events during the peopling of Eurasia. Nature. 2016;538: 238–242. doi: 10.1038/nature19792 27654910

22. Meyer M, Kircher M, Gansauge M-T, Li H, Racimo F, Mallick S, et al. A high-coverage genome sequence from an archaic Denisovan individual. Science. 2012;338: 222–226. doi: 10.1126/science.1224344 22936568

23. Raghavan M, Skoglund P, Graf KE, Metspalu M, Albrechtsen A, Moltke I, et al. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature. 2014;505: 87–91. doi: 10.1038/nature12736 24256729

24. Delaneau O, Zagury J-F, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods. 2013;10: 5–6. doi: 10.1038/nmeth.2307 23269371

25. Browning SR, Browning BL. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 2007;81: 1084–1097. doi: 10.1086/521987 17924348

26. Loh P-R, Danecek P, Palamara PF, Fuchsberger C, A Reshef Y, K Finucane H, et al. Reference-based phasing using the Haplotype Reference Consortium panel. Nat Genet. 2016;48: 1443–1448. doi: 10.1038/ng.3679 27694958

27. Delaneau O, Howie B, Cox AJ, Zagury J-F, Marchini J. Haplotype estimation using sequencing reads. Am J Hum Genet. 2013;93: 687–696. doi: 10.1016/j.ajhg.2013.09.002 24094745

28. Choi Y, Chan AP, Kirkness E, Telenti A, Schork NJ. Comparison of phasing strategies for whole human genomes. PLoS Genet. 2018;14: e1007308. doi: 10.1371/journal.pgen.1007308 29621242

29. Song S, Sliwerska E, Emery S, Kidd JM. Modeling Human Population Separation History Using Physically Phased Genomes. Genetics. 2017;205: 385–395. doi: 10.1534/genetics.116.192963 28049708

30. Pickrell JK, Patterson N, Barbieri C, Berthold F, Gerlach L, Güldemann T, et al. The genetic prehistory of southern Africa. Nat Commun. 2012;3: 1143. doi: 10.1038/ncomms2140 23072811

31. Tishkoff SA, Gonder MK, Henn BM, Mortensen H, Knight A, Gignoux C, et al. History of click-speaking populations of Africa inferred from mtDNA and Y chromosome genetic variation. Mol Biol Evol. 2007;24: 2180–2195. doi: 10.1093/molbev/msm155 17656633

32. Knight A, Underhill PA, Mortensen HM, Zhivotovsky LA, Lin AA, Henn BM, et al. African Y chromosome and mtDNA divergence provides insight into the history of click languages. Curr Biol. 2003;13: 464–473. doi: 10.1016/s0960-9822(03)00130-1 12646128

33. Schlebusch CM, Skoglund P, Sjödin P, Gattepaille LM, Hernandez D, Jay F, et al. Genomic variation in seven Khoe-San groups reveals adaptation and complex African history. Science. 2012;338: 374–379. doi: 10.1126/science.1227721 22997136

34. Schlebusch CM, Jakobsson M. Tales of Human Migration, Admixture, and Selection in Africa. Annu Rev Genomics Hum Genet. 2018. doi: 10.1146/annurev-genom-083117-021759 29727585

35. McDougall I, Brown FH, Fleagle JG. Stratigraphic placement and age of modern humans from Kibish, Ethiopia. Nature. 2005;433: 733–736. doi: 10.1038/nature03258 15716951

36. White TD, Asfaw B, DeGusta D, Gilbert H, Richards GD, Suwa G, et al. Pleistocene Homo sapiens from Middle Awash, Ethiopia. Nature. 2003;423: 742–747. doi: 10.1038/nature01669 12802332

37. Richter D, Grün R, Joannes-Boyau R, Steele TE, Amani F, Rué M, et al. The age of the hominin fossils from Jebel Irhoud, Morocco, and the origins of the Middle Stone Age. Nature. 2017;546: 293–296. doi: 10.1038/nature22335 28593967

38. Patterson N, Moorjani P, Luo Y, Mallick S, Rohland N, Zhan Y, et al. Ancient admixture in human history. Genetics. 2012;192: 1065–1093. doi: 10.1534/genetics.112.145037 22960212

39. Hobolth A, Andersen LN, Mailund T. On computing the coalescence time density in an isolation-with migration model with few samples. Genetics. 2011. pp. 1241–1243. doi: 10.1534/genetics.110.124164 21321131

40. Kelleher J, Etheridge AM, McVean G. Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes. PLoS Comput Biol. 2016;12: e1004842. doi: 10.1371/journal.pcbi.1004842 27145223

Článek vyšel v časopise

PLOS Genetics

2020 Číslo 3
Nejčtenější tento týden
Nejčtenější v tomto čísle
Kurzy Podcasty Doporučená témata Časopisy
Zapomenuté heslo

Nemáte účet?  Registrujte se

Zapomenuté heslo

Zadejte e-mailovou adresu, se kterou jste vytvářel(a) účet, budou Vám na ni zaslány informace k nastavení nového hesla.


Nemáte účet?  Registrujte se