
An Evolutionary Framework for Association Testing in Resequencing Studies


Sequencing technologies are becoming cheap enough to apply to large numbers of study participants and promise to provide new insights into human phenotypes by bringing to light rare and previously unknown genetic variants. We develop a new framework for the analysis of sequence data that incorporates all of the major features of previously proposed approaches, including those focused on allele counts and allele burden, but is both more general and more powerful. We harness population genetic theory to provide prior information on effect sizes and to create a pooling strategy for information from rare variants. Our method, EMMPAT (Evolutionary Mixed Model for Pooled Association Testing), generates a single test per gene (substantially reducing multiple testing concerns), facilitates graphical summaries, and improves the interpretation of results by allowing calculation of attributable variance. Simulations show that, relative to previously used approaches, our method increases the power to detect genes that affect phenotype when natural selection has kept alleles with large effect sizes rare. We demonstrate our approach on a population-based re-sequencing study of association between serum triglycerides and variation in ANGPTL4.
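To make the "allele burden" idea mentioned above concrete, here is a minimal sketch of a burden-style analysis, the kind of earlier approach the abstract says the new framework generalizes. It is not EMMPAT itself. The function names, the 1% frequency cutoff, and the toy genotype and phenotype values are all assumptions chosen for illustration.

```python
from math import sqrt


def burden_scores(genotypes, freqs, maf_cutoff=0.01):
    """Count minor alleles per person at variants rarer than the cutoff.

    genotypes: one row per individual, one 0/1/2 allele count per variant.
    freqs: minor allele frequency of each variant (hypothetical values here).
    maf_cutoff: assumed threshold for calling a variant "rare".
    """
    rare = [j for j, f in enumerate(freqs) if f < maf_cutoff]
    return [sum(row[j] for j in rare) for row in genotypes]


def slope_and_t(x, y):
    """Ordinary least squares slope of y on x and its t statistic (n - 2 df)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("burden score does not vary across individuals")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    std_err = sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


# Toy data (invented): 8 individuals, 3 variants, the second one common.
freqs = [0.005, 0.3, 0.002]
genotypes = [
    [0, 1, 0], [0, 0, 0], [1, 2, 0], [0, 1, 0],
    [0, 0, 1], [0, 1, 0], [1, 0, 1], [0, 2, 0],
]
phenotype = [4.9, 5.1, 4.4, 5.0, 4.6, 5.2, 3.9, 4.8]  # e.g. a log-scale trait

scores = burden_scores(genotypes, freqs)
slope, t_stat = slope_and_t(scores, phenotype)
print(scores, round(slope, 4), round(t_stat, 2))
```

A burden score of this kind yields one test per gene, but it treats every rare allele as having the same direction and size of effect. The abstract describes the proposed framework as more general than that, drawing on population genetic theory for prior information on effect sizes.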


Published in: An Evolutionary Framework for Association Testing in Resequencing Studies. PLoS Genet 6(11): e1001202. doi:10.1371/journal.pgen.1001202
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1001202

