Genetic Variants and Their Interactions in the Prediction of Increased Pre-Clinical Carotid Atherosclerosis: The Cardiovascular Risk in Young Finns Study
The relative contribution of genetic risk factors to the progression of subclinical atherosclerosis is poorly understood. Multiple variants are likely implicated in the development of atherosclerosis, but the subtle genotypic and phenotypic differences are beyond the reach of the conventional case-control designs and statistical significance testing procedures used in most association studies. Our objective was to investigate whether an alternative approach, in which common disorders are treated as quantitative phenotypes that are continuously distributed over a population, can reveal predictive insights into early atherosclerosis, as assessed by ultrasound-based quantitative measurement of carotid artery intima-media thickness (IMT). Using our population-based follow-up study of atherosclerosis precursors as a basis for sampling subjects with gradually increasing IMT levels, we searched for the subsets of genetic variants and their interactions that are most predictive of the various risk classes, rather than using exclusively those variants that meet a stringent level of statistical significance. The area under the receiver operating characteristic curve (AUC) was used to evaluate the predictive value of the variants, and cross-validation was used to assess how well the predictive models generalize to other subsets of subjects. With our predictive modeling framework and its machine learning-based SNP selection, we improved the prediction of the extreme classes of atherosclerosis risk and progression over a 6-year period (average AUC 0.844 and 0.761), compared with using conventional cardiovascular risk factors alone (average AUC 0.741 and 0.629) or combined with the statistically significant variants (average AUC 0.762 and 0.651). The predictive accuracy remained relatively high in an independent validation set of subjects (average decrease of 0.043).
These results demonstrate that the modeling framework can utilize the “gray zone” of genetic variation in the classification of subjects with different degrees of risk of developing atherosclerosis.
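The evaluation scheme summarized above (AUC as the performance measure, cross-validation to gauge generalization, and variant selection as part of the modeling) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the function names, the 0/1/2 genotype coding, the mean-difference ranking, and the signed-sum scorer are all hypothetical stand-ins chosen to keep the example dependency-free. The point it illustrates is that variant selection is repeated inside each training fold, so the held-out subjects never influence which variants are chosen.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive subject
    receives a higher score than a randomly chosen negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_diff(X, y, j):
    """Difference in the mean of feature j between the two classes."""
    pos = [row[j] for row, lab in zip(X, y) if lab == 1]
    neg = [row[j] for row, lab in zip(X, y) if lab == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def cross_validated_auc(X, y, n_folds=5, top_k=2, seed=0):
    """Average test-fold AUC with feature selection nested in each fold.

    Assumes binary labels (0/1) and at least n_folds subjects per class.
    The ranking rule and scorer are illustrative placeholders only.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, lab in enumerate(y) if lab == 1]
    neg_idx = [i for i, lab in enumerate(y) if lab == 0]
    rng.shuffle(pos_idx)
    rng.shuffle(neg_idx)
    # Stratified folds: every fold contains subjects from both classes.
    folds = [pos_idx[f::n_folds] + neg_idx[f::n_folds]
             for f in range(n_folds)]
    fold_aucs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Select variants using the training subjects only.
        diffs = [mean_diff(X_tr, y_tr, j) for j in range(len(X[0]))]
        selected = sorted(range(len(diffs)),
                          key=lambda j: abs(diffs[j]), reverse=True)[:top_k]
        # Score held-out subjects with a signed sum over selected variants.
        scores = [sum((1 if diffs[j] > 0 else -1) * X[i][j] for j in selected)
                  for i in test_idx]
        fold_aucs.append(auc(scores, [y[i] for i in test_idx]))
    return sum(fold_aucs) / len(fold_aucs)


if __name__ == "__main__":
    # Synthetic genotypes (0/1/2 allele counts); only SNP 0 carries signal.
    rng = random.Random(1)
    y = [1] * 30 + [0] * 30
    X = [[rng.choice([1, 2]) if lab else rng.choice([0, 1])]
         + [rng.choice([0, 1, 2]) for _ in range(5)] for lab in y]
    print(round(cross_validated_auc(X, y), 3))
```

Selecting variants on the full data set before cross-validating would tend to give optimistic AUC estimates, which is why the selection step sits inside the fold loop in this sketch.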
Published in:
Genetic Variants and Their Interactions in the Prediction of Increased Pre-Clinical Carotid Atherosclerosis: The Cardiovascular Risk in Young Finns Study. PLoS Genet 6(9): e1001146. doi:10.1371/journal.pgen.1001146
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1001146
Article published in:
PLOS Genetics
2010, Issue 9