A Flexible Approach for the Analysis of Rare Variants Allowing for a Mixture of Effects on Binary or Quantitative Traits
Multiple rare variants, either within or across genes, have been hypothesised to collectively influence complex human traits. The increasing availability of high-throughput sequencing technologies offers the opportunity to study the effect of rare variants on these traits. However, appropriate and computationally efficient analytical methods are required to account for collections of rare variants that display a combination of protective, deleterious and null effects on the trait. We have developed a novel method for the analysis of rare genetic variation in a gene, region or pathway that, by simply aggregating summary statistics at each variant, can: (i) test for the presence of a mixture of effects on a trait; (ii) be applied to both binary and quantitative traits in population-based and family-based data; (iii) adjust for covariates to allow for non-genetic risk factors; and (iv) incorporate imputed genetic variation. In addition, for preliminary identification of promising genes, the method can be applied to association summary statistics (available, for example, from meta-analysis of published data) without the need for individual-level genotype data. Through simulation, we show that our method is unaffected by the presence of bi-directional effects, with no apparent loss in power across a range of different mixtures, and can achieve greater power than existing approaches as long as summary statistics at each variant are robust. We apply our method to investigate association of type 1 diabetes with imputed rare variants within genes in the major histocompatibility complex using genotype data from the Wellcome Trust Case Control Consortium.
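To illustrate why aggregating per-variant summary statistics can remain informative when protective and deleterious effects are mixed, here is a minimal sketch. It is not the authors' method: it assumes hypothetical per-variant z-scores that are independent and standard normal under the null, and it compares a signed (burden-style) sum with a sum of squared z-scores. The function names and the example z-scores are invented for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper tail P(X > x) of a chi-square variable with df degrees of freedom.

    Uses the series expansion of the regularized lower incomplete gamma
    function; adequate for a sketch, not tuned for extreme arguments.
    """
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    term, total, n = 1.0, 1.0, 0
    while term > 1e-15 * total:
        n += 1
        term *= y / (a + n)
        total += term
    log_prefix = a * math.log(y) - y - math.lgamma(a + 1.0)
    return max(0.0, 1.0 - math.exp(log_prefix) * total)


def signed_sum_test(z_scores):
    """Burden-style test: opposite-direction effects cancel in the sum."""
    k = len(z_scores)
    stat = sum(z_scores) ** 2 / k  # chi-square with 1 df under the null
    return stat, chi2_sf(stat, 1)


def quadratic_sum_test(z_scores):
    """Direction-agnostic test: squares discard the sign of each effect."""
    stat = sum(z * z for z in z_scores)  # chi-square with K df under the null
    return stat, chi2_sf(stat, len(z_scores))


# Hypothetical region: two deleterious, two protective and two null variants.
z = [3.0, -3.0, 2.5, -2.5, 0.0, 0.0]
print("signed sum:   ", signed_sum_test(z))
print("quadratic sum:", quadratic_sum_test(z))
```

In this toy example the signed sum is exactly zero because the effects cancel, whereas the quadratic sum still yields a small p-value. With correlated variants (linkage disequilibrium) the chi-square reference distribution assumed here would no longer hold.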
Published in:
A Flexible Approach for the Analysis of Rare Variants Allowing for a Mixture of Effects on Binary or Quantitative Traits. PLoS Genet 9(8): e1003694. doi:10.1371/journal.pgen.1003694
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1003694
Tags
Genetics, Reproductive medicine
Article published in
PLOS Genetics, 2013, Issue 8