
GPA: A Statistical Approach to Prioritizing GWAS Results by Integrating Pleiotropy and Annotation


In the past 10 years, many genome-wide association studies (GWAS) have been conducted to identify the genetic bases of complex human traits. As of January 2014, more than 12,000 single-nucleotide polymorphisms (SNPs) had been reported to be significantly associated with at least one complex trait or disease. On the one hand, about 85% of identified risk variants are located in non-coding regions, which motivates a systematic understanding of the function of non-coding variants in regulatory elements of the human genome. On the other hand, complex diseases are often affected by many genetic variants with small or moderate effects. To address these issues, we propose a statistical approach, GPA, for integrating information from multiple GWAS datasets and functional annotation. Notably, our approach requires only marker-wise p-values as input, making it especially useful when only summary statistics, rather than the full genotype and phenotype data, are available. We applied GPA to GWAS datasets of five psychiatric disorders and bladder cancer, using central nervous system genes, eQTLs from the Genotype-Tissue Expression (GTEx) project, and ENCODE DNase-seq data from 125 cell lines as functional annotation. The results suggest that GPA is an effective method for integrative data analysis in the post-GWAS era.
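The abstract does not spell out the model behind GPA, so the following is only an illustrative sketch of how marker-wise p-values alone can drive a prioritization. It assumes a two-groups mixture in which null p-values are Uniform(0, 1) and non-null p-values follow a Beta(alpha, 1) distribution with 0 < alpha < 1, fitted by an EM algorithm. The function names, starting values, and simulated data are hypothetical, and the sketch covers a single GWAS without annotation; it is not the GPA implementation.

```python
import math
import random


def non_null_posterior(log_p, pi1, alpha):
    """Posterior probability that a marker with log p-value `log_p` is non-null."""
    f1 = pi1 * alpha * math.exp((alpha - 1.0) * log_p)  # pi1 * Beta(alpha, 1) density
    return f1 / ((1.0 - pi1) + f1)                      # Uniform(0, 1) density is 1


def local_fdr(p, pi1, alpha):
    """Posterior probability that a marker with p-value `p` is null."""
    return 1.0 - non_null_posterior(math.log(p), pi1, alpha)


def fit_two_groups(pvalues, max_iter=500, tol=1e-6):
    """EM fit of the assumed mixture (1 - pi1) * Uniform(0, 1) + pi1 * Beta(alpha, 1)."""
    # Clamp p-values away from 0 so that the logarithm is always defined.
    log_p = [math.log(min(max(p, 1e-300), 1.0)) for p in pvalues]
    n = len(log_p)
    pi1, alpha = 0.5, 0.5   # arbitrary starting values (an assumption)
    history = []            # observed-data log-likelihood per iteration
    for _ in range(max_iter):
        # E-step: non-null posterior for every marker, plus the log-likelihood.
        post = []
        loglik = 0.0
        for lp in log_p:
            f1 = pi1 * alpha * math.exp((alpha - 1.0) * lp)
            total = (1.0 - pi1) + f1
            post.append(f1 / total)
            loglik += math.log(total)
        history.append(loglik)
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        # M-step: closed-form weighted maximum-likelihood updates.
        weight = sum(post)
        pi1 = weight / n
        alpha = -weight / sum(z * lp for z, lp in zip(post, log_p))
        alpha = min(max(alpha, 1e-6), 1.0 - 1e-6)  # keep alpha inside (0, 1)
    return pi1, alpha, history


# Simulated p-values (hypothetical): 10% non-null markers with alpha = 0.2.
random.seed(1)
true_pi1, true_alpha = 0.1, 0.2
pvals = [
    random.random() ** (1.0 / true_alpha)  # inverse-CDF draw from Beta(alpha, 1)
    if random.random() < true_pi1
    else random.random()                   # null markers: Uniform(0, 1)
    for _ in range(10000)
]

pi1_hat, alpha_hat, history = fit_two_groups(pvals)
print(f"estimated pi1 = {pi1_hat:.3f}, estimated alpha = {alpha_hat:.3f}")
print(f"local fdr at p = 1e-8: {local_fdr(1e-8, pi1_hat, alpha_hat):.2e}")
```

Ranking markers by `local_fdr` rather than by raw p-value is what would let additional evidence, such as a second GWAS or an annotation indicator, shift the ordering once the mixture is extended to model it.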


Published in: GPA: A Statistical Approach to Prioritizing GWAS Results by Integrating Pleiotropy and Annotation. PLoS Genet 10(11): e1004787. doi:10.1371/journal.pgen.1004787
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1004787





Published in the journal PLOS Genetics, 2014, Issue 11