Comparison of Methods to Account for Relatedness in Genome-Wide Association Studies with Family-Based Data
Recently, statistical approaches known as linear mixed models (LMMs) have become popular for analysing data from genome-wide association studies. In the last few years, a bewildering variety of different LMM methods/software packages have been developed, but it has not always been clear how (or indeed whether) any newly-proposed method differs from previously-proposed implementations. Here we compare the performance of several different LMM approaches (and software implementations) via their application to a genome-wide association study of visceral leishmaniasis in 348 Brazilian families comprising 3626 individuals. We also compare the LMM results to those obtained using alternative analysis methods. Overall, we find strong concordance between the results from the different LMM approaches and high correlation between the results from LMMs and most alternative approaches. We conclude that LMM approaches perform well in comparison to competing approaches and, in most applications, the precise LMM implementation will not be too important, and can be chosen on the basis of speed or convenience.
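For orientation, the model class compared above can be written in its generic textbook form. This is a sketch only: the notation below is assumed for illustration and is not taken from the article, and individual software packages differ in how they estimate the variance components and construct the relatedness matrix.

```latex
% Generic linear mixed model for single-SNP association testing.
% All symbols are illustrative assumptions, not the article's notation.
\begin{align}
  y &= X\beta + g + \varepsilon, \\
  g &\sim \mathcal{N}\!\left(0,\; \sigma_g^2 K\right), \qquad
  \varepsilon \sim \mathcal{N}\!\left(0,\; \sigma_e^2 I\right), \\
  \operatorname{Var}(y) &= \sigma_g^2 K + \sigma_e^2 I .
\end{align}
% y : n-vector of phenotypes
% X : design matrix (intercept, covariates, and the genotype at the tested SNP)
% K : n x n relatedness matrix, e.g. twice the pedigree kinship matrix or a
%     genetic relationship matrix estimated from genome-wide SNPs
% The association test at a SNP is a test of H_0: \beta_{\mathrm{SNP}} = 0.
```

Under this formulation, relatedness between individuals enters only through the covariance term, which is why implementations that share the same relatedness matrix would be expected to give similar test statistics.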
Published in:
Comparison of Methods to Account for Relatedness in Genome-Wide Association Studies with Family-Based Data. PLoS Genet 10(7): e1004445. doi:10.1371/journal.pgen.1004445
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1004445
Article published in: PLOS Genetics, 2014, Issue 7