The Genetic Interpretation of Area under the ROC Curve in Genomic Profiling


Genome-wide association studies in human populations have facilitated the creation of genomic profiles that combine the effects of many associated genetic variants to predict risk of disease. The area under the receiver operating characteristic (ROC) curve is a well-established measure of how well a test classifies diseased and non-diseased individuals. We use quantitative genetics theory to provide insight into the genetic interpretation of the area under the ROC curve (AUC) when the test classifier is a predictor of genetic risk. Even when the test explains 100% of the genetic variance, there is a maximum value for AUC that depends on the genetic epidemiology of the disease, i.e. on disease prevalence together with either the heritability or the sibling recurrence risk. We derive an equation relating maximum AUC to heritability and disease prevalence. The expression can be reversed to calculate the proportion of genetic variance explained given AUC, disease prevalence, and heritability. We use published estimates of disease prevalence and sibling recurrence risk for 17 complex genetic diseases to calculate the proportion of genetic variance that a test must explain to achieve AUC = 0.75; this proportion varied from 0.10 to 0.74. We provide a genetic interpretation of AUC for use with predictors of genetic risk based on genomic profiles. We also provide a strategy, available as an online calculator, to estimate the proportion of genetic variance explained on the liability scale from estimates of AUC, disease prevalence, and heritability (or sibling recurrence risk).
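The relationship between maximum AUC, heritability, and prevalence summarized above can be illustrated numerically. The abstract does not reproduce the equation, so the sketch below is an assumption: it uses a standard liability threshold model in which genetic liability is normally distributed, approximates the genetic risk in cases and in controls as normal, and computes AUC from the difference in their means scaled by the summed variances. The function names (`auc_from_liability`, `prop_explained_for_auc`) and the bisection-based inversion are hypothetical and are not the authors' online calculator.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and variance 1


def auc_from_liability(prevalence, h2_liability, prop_explained=1.0):
    """Approximate AUC of a genetic risk predictor (illustrative sketch).

    prevalence      -- disease prevalence K, with 0 < K < 1
    h2_liability    -- heritability on the liability scale
    prop_explained  -- proportion of genetic variance the predictor explains;
                       1.0 gives the maximum AUC for the disease
    """
    h2x = prop_explained * h2_liability          # liability variance explained
    t = _STD_NORMAL.inv_cdf(1.0 - prevalence)    # liability threshold
    z = _STD_NORMAL.pdf(t)                       # normal density at threshold
    i = z / prevalence                           # mean liability of cases
    v = -z / (1.0 - prevalence)                  # mean liability of controls
    # Variance of predicted genetic risk within cases and within controls
    var_cases = h2x * (1.0 - h2x * i * (i - t))
    var_controls = h2x * (1.0 - h2x * v * (v - t))
    return _STD_NORMAL.cdf((i - v) * h2x / sqrt(var_cases + var_controls))


def prop_explained_for_auc(target_auc, prevalence, h2_liability, tol=1e-10):
    """Invert auc_from_liability by bisection.

    Returns the proportion of genetic variance a predictor must explain to
    reach target_auc, or None if the target exceeds the maximum AUC.
    """
    if not 0.5 < target_auc < 1.0:
        raise ValueError("target_auc must lie strictly between 0.5 and 1")
    if target_auc > auc_from_liability(prevalence, h2_liability, 1.0):
        return None
    lo, hi = 1e-12, 1.0  # lo > 0 avoids a 0/0 when nothing is explained
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if auc_from_liability(prevalence, h2_liability, mid) < target_auc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration
    K, h2 = 0.10, 0.50
    print("maximum AUC:", auc_from_liability(K, h2))
    print("proportion needed for AUC = 0.75:", prop_explained_for_auc(0.75, K, h2))
```

Bisection is used for the inversion because, under these assumptions, AUC increases with the proportion of variance explained; a closed-form inverse may exist but is not attempted here. The inputs in the usage block are illustrative values, not estimates for any of the 17 diseases in the study.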


Published in: The Genetic Interpretation of Area under the ROC Curve in Genomic Profiling. PLoS Genet 6(2): e1000864. doi:10.1371/journal.pgen.1000864
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1000864
