Increased ultra-rare variant load in an isolated Scottish population impacts exonic and regulatory regions
Authors:
Mihail Halachev aff001; Alison Meynert aff001; Martin S. Taylor aff001; Veronique Vitart aff001; Shona M. Kerr aff001; Lucija Klaric aff001; Timothy J. Aitman aff002; Chris S. Haley aff001; James G. Prendergast aff003; Carys Pugh aff004; David A. Hume aff005; Sarah E. Harris aff006; David C. Liewald aff006; Ian J. Deary aff006; Colin A. Semple aff001; James F. Wilson aff001
Author affiliations:
aff001: MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
aff002: Centre for Genomic and Experimental Medicine, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
aff003: The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, United Kingdom
aff004: Centre for Clinical Brain Sciences, Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, United Kingdom
aff005: Mater Research Institute, University of Queensland, Woolloongabba, Australia
aff006: Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, George Square, Edinburgh, United Kingdom
aff007: Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Teviot Place, Edinburgh, United Kingdom
Published in:
Increased ultra-rare variant load in an isolated Scottish population impacts exonic and regulatory regions. PLoS Genet 15(11): e1008480. doi:10.1371/journal.pgen.1008480
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1008480
Summary
Human population isolates provide a snapshot of the impact of historical demographic processes on population genetics. Such data facilitate studies of the functional impact of rare sequence variants on biomedical phenotypes, as strong genetic drift can result in higher frequencies of variants that are otherwise rare. We present the first whole genome sequencing (WGS) study of the VIKING cohort, a representative collection of samples from the isolated Shetland population in northern Scotland, and explore how its genetic characteristics compare to a mainland Scottish population. Our analyses reveal the strong contributions made by the founder effect and genetic drift in shaping genomic variation in the VIKING cohort. About one tenth of all high-quality variants discovered are unique to the VIKING cohort or are seen at frequencies at least tenfold higher than in more cosmopolitan control populations. Multiple lines of evidence also suggest relaxation of purifying selection during the evolutionary history of the Shetland isolate. We demonstrate enrichment of ultra-rare VIKING variants in exonic regions and, for the first time, we also show that ultra-rare variants are enriched within regulatory regions, particularly promoters, suggesting that gene expression patterns may diverge relatively rapidly in human isolates.
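The "unique or at least tenfold higher frequency" criterion in the summary can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_variant`, the labels, and the allele frequencies are hypothetical and chosen only to show the comparison between an isolate and a control population.

```python
def classify_variant(isolate_af, control_af, fold=10.0):
    """Label a variant by comparing its allele frequency in an isolate
    with its frequency in a control population (illustrative only)."""
    if isolate_af <= 0:
        return "absent"      # not observed in the isolate
    if control_af == 0:
        return "unique"      # observed only in the isolate
    if isolate_af / control_af >= fold:
        return "enriched"    # at least `fold` times more frequent
    return "shared"          # similar frequency in both populations


# Hypothetical (isolate, control) allele frequencies, not real data.
examples = {
    "var_a": (0.02, 0.0),
    "var_b": (0.05, 0.004),
    "var_c": (0.10, 0.08),
}

labels = {name: classify_variant(iso, ctl) for name, (iso, ctl) in examples.items()}
print(labels)
```

The threshold is a parameter (`fold`) so that the same comparison could be run with a different cut-off; a real analysis would also need to account for sample size and genotype quality, which this sketch ignores.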
Keywords:
Chromatin – Molecular genetics – Europe – Alleles – Population genetics – Promoter regions – Genetic drift – Computer-aided drug design