Phased Whole-Genome Genetic Risk in a Family Quartet Using a Major Allele Reference Sequence
Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (<1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.
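The idea of a major allele reference can be illustrated with a minimal sketch. It assumes per-site population allele frequencies are already available for single-nucleotide variants; the function name `major_allele_reference`, the 0-based positions, and the toy data are hypothetical and are not taken from the article's pipeline.

```python
from typing import Dict


def major_allele_reference(reference: str,
                           allele_freqs: Dict[int, Dict[str, float]]) -> str:
    """Return a copy of `reference` in which every listed site carries
    its most frequent (major) allele.

    Assumptions (illustrative only): positions are 0-based, and only
    single-base alleles are substituted; other sites are left unchanged.
    """
    bases = list(reference)
    for pos, freqs in allele_freqs.items():
        if not 0 <= pos < len(bases):
            raise IndexError(f"position {pos} is outside the reference")
        # Pick the allele with the highest population frequency.
        # On an exact tie, the allele listed first is kept.
        major = max(freqs, key=freqs.get)
        if len(major) == 1:
            bases[pos] = major
    return "".join(bases)


# Toy example: the reference carries the minor allele at position 1
# and already carries the major allele at position 5.
toy_reference = "ACGTACGT"
toy_freqs = {
    1: {"C": 0.2, "T": 0.8},
    5: {"C": 0.9, "G": 0.1},
}
print(major_allele_reference(toy_reference, toy_freqs))
```

In this sketch only sites where the reference base is the minor allele change, which is why genotype calls at those loci would no longer be biased toward a rare reference base.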
Published in:
Phased Whole-Genome Genetic Risk in a Family Quartet Using a Major Allele Reference Sequence. PLoS Genet 7(9): e1002280. doi:10.1371/journal.pgen.1002280
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1002280
Tags
Genetics, Reproductive medicine
Article published in the journal
PLOS Genetics
2011, Issue 9