Distinguishing between Selective Sweeps from Standing Variation and from a De Novo Mutation
An outstanding question in human genetics has been the degree to which adaptation occurs from standing genetic variation or from de novo mutations. Here, we combine several common statistics used to detect selection in an Approximate Bayesian Computation (ABC) framework, with the goal of discriminating between models of selection and providing estimates of the age of selected alleles and the selection coefficients acting on them. We use simulations to assess the power and accuracy of our method and apply it to seven of the strongest sweeps currently known in humans. We identify two genes, ASPM and PSCA, that are most likely affected by selection on standing variation; and we find three genes, ADH1B, LCT, and EDAR, in which the adaptive alleles seem to have swept from a new mutation. We also confirm evidence of selection for one further gene, TRPV6. In one gene, G6PD, neither neutral models nor models of selective sweeps fit the data, presumably because this locus has been subject to balancing selection.
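The model-choice step described above can be illustrated with a minimal ABC rejection sketch. This is not the authors' pipeline: the two toy models "A" and "B", the Gaussian simulator, the uniform prior, and the function names `simulate` and `abc_model_choice` are all assumptions made for illustration, standing in for the selection models, sweep simulations, and selection statistics used in the study. The sketch only shows the general logic: simulate summary statistics under each candidate model, keep the simulations closest to the observed statistics, and read model support and parameter estimates from the retained set.

```python
import math
import random
import statistics


def simulate(model, rng, n=50):
    """Toy simulator (hypothetical, NOT a population-genetic model).

    Draws one parameter from a uniform prior, simulates n data points,
    and returns the parameter with two summary statistics.
    """
    theta = rng.uniform(0.0, 5.0)          # prior on the single parameter
    spread = 1.0 if model == "A" else 3.0  # the models differ only in spread
    data = [rng.gauss(theta, spread) for _ in range(n)]
    return theta, (statistics.fmean(data), statistics.stdev(data))


def abc_model_choice(observed, n_sims=5000, keep=100, seed=1):
    """ABC rejection: keep the `keep` simulations closest to `observed`."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        model = rng.choice(["A", "B"])     # uniform prior over models
        theta, stats = simulate(model, rng)
        sims.append((math.dist(stats, observed), model, theta))

    # Retain the simulations with the smallest Euclidean distance.
    accepted = sorted(sims, key=lambda s: s[0])[:keep]

    # Approximate posterior model probabilities: share of accepted sims.
    posterior = {
        m: sum(1 for _, model, _ in accepted if model == m) / keep
        for m in ("A", "B")
    }
    best = max(posterior, key=posterior.get)

    # Point estimate of the parameter under the favoured model.
    estimate = statistics.fmean(t for _, m, t in accepted if m == best)
    return posterior, best, estimate


if __name__ == "__main__":
    posterior, best, estimate = abc_model_choice(observed=(2.0, 1.0))
    print(posterior, best, round(estimate, 2))
```

In practice the summary statistics are usually rescaled or transformed before distances are computed, and the accepted parameters are often adjusted by regression; both steps are left out here to keep the sketch short.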
Published in:
Distinguishing between Selective Sweeps from Standing Variation and from a De Novo Mutation. PLoS Genet 8(10): e1003011. doi:10.1371/journal.pgen.1003011
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1003011
Tags
Genetics, Reproductive medicine
Article published in the journal
PLOS Genetics
2012, Issue 10