Increased ultra-rare variant load in an isolated Scottish population impacts exonic and regulatory regions

Authors: Mihail Halachev aff001;  Alison Meynert aff001;  Martin S. Taylor aff001;  Veronique Vitart aff001;  Shona M. Kerr aff001;  Lucija Klaric aff001;  Timothy J. Aitman aff002;  Chris S. Haley aff001;  James G. Prendergast aff003;  Carys Pugh aff004;  David A. Hume aff005;  Sarah E. Harris aff006;  David C. Liewald aff006;  Ian J. Deary aff006;  Colin A. Semple aff001;  James F. Wilson aff001
Author affiliations: MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom aff001;  Centre for Genomic and Experimental Medicine, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom aff002;  The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, United Kingdom aff003;  Centre for Clinical Brain Sciences, Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, United Kingdom aff004;  Mater Research Institute, University of Queensland, Woolloongabba, Australia aff005;  Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, George Square, Edinburgh, United Kingdom aff006;  Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Teviot Place, Edinburgh, United Kingdom aff007
Published in: Increased ultra-rare variant load in an isolated Scottish population impacts exonic and regulatory regions. PLoS Genet 15(11): e1008480. doi:10.1371/journal.pgen.1008480
Category: Research Article
doi: 10.1371/journal.pgen.1008480


Human population isolates provide a snapshot of the impact of historical demographic processes on population genetics. Such data facilitate studies of the functional impact of rare sequence variants on biomedical phenotypes, as strong genetic drift can result in higher frequencies of variants that are otherwise rare. We present the first whole genome sequencing (WGS) study of the VIKING cohort, a representative collection of samples from the isolated Shetland population in northern Scotland, and explore how its genetic characteristics compare to a mainland Scottish population. Our analyses reveal the strong roles played by the founder effect and genetic drift in shaping genomic variation in the VIKING cohort. About one tenth of all high-quality variants discovered are unique to the VIKING cohort or are seen at frequencies at least tenfold higher than in more cosmopolitan control populations. Multiple lines of evidence also suggest relaxation of purifying selection during the evolutionary history of the Shetland isolate. We demonstrate enrichment of ultra-rare VIKING variants in exonic regions and, for the first time, show that ultra-rare variants are also enriched within regulatory regions, particularly promoters, suggesting that gene expression patterns may diverge relatively rapidly in human isolates.
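The "unique or at least tenfold higher" criterion mentioned above can be illustrated with a minimal sketch. It assumes each variant has an allele frequency in the isolate and, where observed, in a control population; the function name `classify_variant`, the category labels, and the example frequencies are hypothetical and are not taken from the study's pipeline, whose exact thresholds and control sets are not described in this abstract.

```python
from typing import Optional


def classify_variant(isolate_af: float,
                     control_af: Optional[float],
                     fold: float = 10.0) -> str:
    """Label a variant by comparing isolate and control allele frequencies.

    control_af is None (or 0) when the variant is not seen in the controls.
    The default 10x threshold mirrors the wording of the abstract; it is an
    illustrative assumption, not the authors' implementation.
    """
    if isolate_af <= 0:
        return "absent"      # not observed in the isolate at all
    if control_af is None or control_af == 0:
        return "unique"      # seen only in the isolate
    if isolate_af >= fold * control_af:
        return "enriched"    # at least `fold` times more frequent
    return "shared"          # comparable frequency in both groups


# Hypothetical frequencies, invented purely for illustration.
examples = {
    "var_a": (0.02, None),   # absent from controls
    "var_b": (0.05, 0.004),  # more than tenfold higher in the isolate
    "var_c": (0.10, 0.08),   # similar in both populations
}
labels = {name: classify_variant(iso, ctl)
          for name, (iso, ctl) in examples.items()}
print(labels)
```

Treating a missing control frequency as "unique" is a simplification: in practice, absence from a control panel also depends on the panel's sample size and sequencing depth.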

Keywords:

Alleles – Europe – Chromatin – Molecular genetics – Population genetics – Promoter regions – Genetic drift – Computer-aided drug design




Article published in the journal: PLOS Genetics, 2019, Issue 11
