A Comprehensive Map of Mobile Element Insertion Polymorphisms in Humans
As a consequence of the accumulation of insertion events over evolutionary time, mobile elements now comprise nearly half of the human genome. The Alu, L1, and SVA mobile element families are still duplicating, generating variation between individual genomes. Mobile element insertions (MEI) have been identified as causes for genetic diseases, including hemophilia, neurofibromatosis, and various cancers. Here we present a comprehensive map of 7,380 MEI polymorphisms from the 1000 Genomes Project whole-genome sequencing data of 185 samples in three major populations detected with two detection methods. This catalog enables us to systematically study mutation rates, population segregation, genomic distribution, and functional properties of MEI polymorphisms and to compare MEI to SNP variation from the same individuals. Population allele frequencies of MEI and SNPs are described, broadly, by the same neutral ancestral processes despite vastly different mutation mechanisms and rates, except in coding regions where MEI are virtually absent, presumably due to strong negative selection. A direct comparison of MEI and SNP diversity levels suggests a differential mobile element insertion rate among populations.
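The closing comparison of MEI and SNP diversity can be made concrete with a small sketch. It assumes diversity is summarized by Watterson's estimator, theta_W = S / a_n, where S is the number of segregating sites and a_n is the sum of 1/i for i from 1 to n-1 over n sampled chromosomes. The abstract does not say which estimator the authors used, so this choice is an assumption. Under a simple neutral model theta = 4 * N_e * mu, so within one population the MEI-to-SNP ratio of theta cancels N_e and reflects the ratio of mutation rates. A ratio that differs between populations is the kind of signal that would hint at a differential insertion rate. All function names and counts below are hypothetical and are not taken from the study.

```python
def harmonic_coefficient(n_chromosomes: int) -> float:
    """Return a_n = sum_{i=1}^{n-1} 1/i for a sample of n chromosomes."""
    if n_chromosomes < 2:
        raise ValueError("need at least two sampled chromosomes")
    return sum(1.0 / i for i in range(1, n_chromosomes))


def watterson_theta(segregating_sites: int, n_chromosomes: int) -> float:
    """Watterson's estimator: segregating sites divided by a_n."""
    return segregating_sites / harmonic_coefficient(n_chromosomes)


def mei_to_snp_ratio(mei_sites: int, snp_sites: int, n_chromosomes: int) -> float:
    """Ratio of MEI to SNP diversity for one population sample.

    Under theta = 4 * N_e * mu the effective population size cancels,
    leaving an estimate of mu_MEI / mu_SNP (both per genome per generation).
    """
    theta_mei = watterson_theta(mei_sites, n_chromosomes)
    theta_snp = watterson_theta(snp_sites, n_chromosomes)
    return theta_mei / theta_snp


if __name__ == "__main__":
    # Hypothetical counts for two populations; NOT values from the paper.
    hypothetical_samples = {
        "population_A": {"mei": 4000, "snp": 8_000_000, "n": 120},
        "population_B": {"mei": 3000, "snp": 5_000_000, "n": 120},
    }
    for name, counts in hypothetical_samples.items():
        ratio = mei_to_snp_ratio(counts["mei"], counts["snp"], counts["n"])
        print(f"{name}: theta_MEI / theta_SNP = {ratio:.3e}")
```

Because both estimates share the same a_n, the ratio reduces to the ratio of segregating-site counts whenever MEI and SNPs are called in the same samples. The per-population theta values are still useful on their own when sample sizes differ between populations.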
Published in:
A Comprehensive Map of Mobile Element Insertion Polymorphisms in Humans. PLoS Genet 7(8): e1002236. doi:10.1371/journal.pgen.1002236
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1002236
Tags
Genetics, Reproductive medicine
Article published in
PLOS Genetics
2011, Issue 8