The Characterization of Twenty Sequenced Human Genomes
We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten “case” genomes from individuals with severe hemophilia A and ten “control” genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome appears to stabilize at about 144,000 once the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop-loss variants in genes representing a diverse set of pathways.
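The per-genome count of novel SNVs mentioned above can be illustrated with a minimal sketch. It assumes each genome is represented as a set of variant keys and that genomes are processed in a fixed order; the function name `novel_variants_per_genome`, the key layout, and the toy data are hypothetical and are not taken from the study's pipeline.

```python
from typing import Hashable, Iterable, List, Set


def novel_variants_per_genome(genomes: Iterable[Set[Hashable]]) -> List[int]:
    """For each genome, in order, count variants not seen in any earlier genome."""
    seen: Set[Hashable] = set()
    counts: List[int] = []
    for variants in genomes:
        novel = variants - seen      # variants first observed in this genome
        counts.append(len(novel))
        seen |= variants             # add this genome's variants to the running catalogue
    return counts


# Toy data (assumed layout): each variant is (chromosome, position, alternate allele).
toy_genomes = [
    {("chr1", 100, "A"), ("chr1", 200, "T"), ("chrX", 50, "G")},
    {("chr1", 100, "A"), ("chr2", 300, "C")},
    {("chr1", 200, "T"), ("chr2", 300, "C"), ("chr3", 10, "G")},
]

print(novel_variants_per_genome(toy_genomes))  # [3, 1, 1]
```

With real data the counts depend on the order in which genomes are added, so a curve of this kind would typically be averaged over several orderings before judging where it levels off.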
Published in:
The Characterization of Twenty Sequenced Human Genomes. PLoS Genet 6(9): e1001111. doi:10.1371/journal.pgen.1001111
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1001111
Tags
Genetics, Reproductive Medicine
Article published in
PLOS Genetics
2010, Issue 9