Natural Selection Affects Multiple Aspects of Genetic Variation at Putatively Neutral Sites across the Human Genome
A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also correlated with recombination rate and whether these correlations can be explained solely by negative selection against deleterious mutations or whether positive selection acting on favorable alleles is also required. Here we attempt to address these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations. However, models with strong positive selection on nonsynonymous mutations and little negative selection predict a stronger negative correlation between neutral diversity and nonsynonymous divergence than observed in the actual data, supporting the importance of negative, rather than positive, selection throughout the genome. Further, we show that the widespread presence of weakly deleterious alleles, rather than a small number of strongly positively selected mutations, is responsible for the correlation between neutral genetic diversity and recombination rate. This work suggests that natural selection has affected multiple aspects of linked neutral variation throughout the human genome and that positive selection is not required to explain these observations.
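The summaries named in the abstract (diversity, average minor allele frequency, and their correlation with recombination rate) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the window layout, the toy allele counts, the sample size of 20 chromosomes, and the function names `window_stats`, `ranks`, and `spearman` are all hypothetical, and a Spearman rank correlation is assumed here only as one reasonable choice of statistic.

```python
from statistics import mean


def window_stats(alt_counts, n_chrom, window_len):
    """Return (diversity per site, mean minor allele frequency) for one window.

    alt_counts: alternate-allele count at each segregating site (hypothetical input)
    n_chrom:    number of sampled chromosomes
    window_len: number of callable sites in the window
    """
    freqs = [c / n_chrom for c in alt_counts]
    # Expected heterozygosity per SNP, with the n/(n-1) small-sample correction.
    het = [2 * p * (1 - p) * n_chrom / (n_chrom - 1) for p in freqs]
    pi = sum(het) / window_len
    maf = mean(min(p, 1 - p) for p in freqs) if freqs else float("nan")
    return pi, maf


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Toy windows (made-up numbers): recombination rate and alternate-allele counts.
windows = [
    {"rec": 0.1, "alt": [1, 1, 2], "len": 1000},
    {"rec": 0.8, "alt": [3, 5, 2, 8], "len": 1000},
    {"rec": 2.5, "alt": [10, 6, 9, 4, 7], "len": 1000},
]
n = 20  # assumed number of sampled chromosomes
stats = [window_stats(w["alt"], n, w["len"]) for w in windows]
rho_pi = spearman([w["rec"] for w in windows], [s[0] for s in stats])
rho_maf = spearman([w["rec"] for w in windows], [s[1] for s in stats])
print(rho_pi, rho_maf)
```

In this toy input both statistics rise with recombination rate, so both rank correlations come out positive. Real data would also need filtering to putatively neutral sites and some control for mutation-rate variation (for example, through human-chimp divergence), which the sketch omits.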
Published in:
Natural Selection Affects Multiple Aspects of Genetic Variation at Putatively Neutral Sites across the Human Genome. PLoS Genet 7(10): e1002326. doi:10.1371/journal.pgen.1002326
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1002326
Tags:
Genetics, Reproductive medicine
Article published in:
PLOS Genetics
2011, Issue 10