Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies


Genome-wide association studies (GWAS) have successfully identified numerous regions in the genome that harbor genetic variants that increase risk for various complex traits and diseases. However, GWAS risk variants generally do not causally affect the trait themselves; rather, they are correlated with the true causal variant through linkage disequilibrium (LD). Plausible causal variants are identified in fine-mapping studies through targeted sequencing followed by prioritization of variants for functional validation. In this work, we propose methods that leverage two independent sources of information, the association strength and genomic functional location, to prioritize causal variants. We demonstrate in simulations and empirical data that our approach reduces the number of SNPs that need to be selected for follow-up to identify the true causal variants at GWAS risk loci.
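To make the idea of combining association strength with functional location concrete, the following Python sketch ranks SNPs at a single locus. It is a simplified illustration under stated assumptions, not the method proposed in the paper: it assumes exactly one causal variant per locus, a single binary annotation, a fixed (not estimated) log-enrichment value, and a normal prior on the causal effect. The function names and the parameters `enrichment` and `prior_variance` are hypothetical.

```python
import math


def posterior_probabilities(z_scores, annotations, enrichment, prior_variance=25.0):
    """Posterior probability that each SNP is the single causal variant.

    Assumptions (illustrative only): one causal SNP per locus;
    z ~ N(0, 1) for a non-causal SNP and z ~ N(0, 1 + prior_variance)
    for the causal SNP; prior weight of SNP i proportional to
    exp(enrichment * annotation_i).
    """
    shrink = prior_variance / (1.0 + prior_variance)
    # Log Bayes factor comparing N(0, 1 + W) against N(0, 1) at each z-score.
    log_bf = [0.5 * math.log(1.0 - shrink) + 0.5 * shrink * z * z for z in z_scores]
    # Functional prior: annotated SNPs are up-weighted by exp(enrichment).
    log_prior = [enrichment * a for a in annotations]
    log_w = [b + p for b, p in zip(log_bf, log_prior)]
    # Normalize in log space for numerical stability.
    m = max(log_w)
    weights = [math.exp(x - m) for x in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(posteriors, coverage=0.9):
    """Smallest set of SNP indices whose cumulative posterior reaches `coverage`."""
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += posteriors[i]
        if cumulative >= coverage:
            break
    return chosen


# Made-up example: three SNPs in tight LD with similar z-scores, one annotated.
z = [4.0, 3.9, 3.8, 1.0]
annot = [0, 0, 1, 0]
with_annotation = posterior_probabilities(z, annot, enrichment=2.0)
without_annotation = posterior_probabilities(z, annot, enrichment=0.0)
print(credible_set(with_annotation, coverage=0.5))
print(credible_set(without_annotation, coverage=0.5))
```

In this toy setting the annotated SNP overtakes SNPs with slightly larger z-scores, so fewer SNPs are needed to reach a given cumulative posterior. That is the kind of reduction in follow-up burden the abstract describes, although the numbers here are invented for illustration.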


Published in: Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies. PLoS Genet 10(10): e32767. doi:10.1371/journal.pgen.1004722
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1004722


Article published in: PLOS Genetics, 2014, Issue 10