
Using Whole-Genome Sequence Data to Predict Quantitative Trait Phenotypes in


Predicting organismal phenotypes from genotype data is important for plant and animal breeding, medicine, and evolutionary biology. Genomic phenotype prediction has so far been applied to data from single-nucleotide polymorphism (SNP) genotyping platforms, but not to complete genome sequences. Here, we report genomic prediction for starvation stress resistance and startle response in Drosophila melanogaster, using ∼2.5 million SNPs determined by sequencing the Drosophila Genetic Reference Panel population of inbred lines. We constructed a genomic relationship matrix from the SNP data and used it in a genomic best linear unbiased prediction (GBLUP) model. We assessed predictive ability by cross-validation, as the correlation between predicted genetic values and observed phenotypes, and found a predictive ability of 0.239±0.008 for starvation resistance and 0.230±0.012 for startle response. The predictive ability of BayesB, a Bayesian method with internal SNP selection, was not greater than that of GBLUP. Selecting the 5% of SNPs with either the highest absolute effect or the highest variance explained did not improve predictive ability. Predictive ability decreased only when fewer than 150,000 SNPs were used to construct the genomic relationship matrix. We hypothesize that predictive power in this population stems from the SNP-based modeling of the subtle relationship structure caused by long-range linkage disequilibrium, and not from population structure or from SNPs in linkage disequilibrium with causal variants. We discuss the implications of these results for genomic prediction in other organisms.
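The pipeline summarized above (a genomic relationship matrix, a GBLUP model, and cross-validated predictive ability) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The relationship matrix follows a common VanRaden-style construction, with homozygous genotypes of inbred lines coded 0 or 2. The ratio of residual to genetic variance (`lam`) is fixed by hand rather than estimated from the data. All function names and the simulated data are hypothetical.

```python
import random
from statistics import mean


def genomic_relationship_matrix(genotypes):
    """VanRaden-style G from a lines x SNPs table coded 0/2 (assumed coding)."""
    n, m = len(genotypes), len(genotypes[0])
    # Allele frequency per SNP: mean genotype code divided by 2.
    p = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    # Center each SNP column by twice its allele frequency.
    z = [[row[j] - 2 * p[j] for j in range(m)] for row in genotypes]
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale
             for k in range(n)] for i in range(n)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - rest) / aug[i][i]
    return x


def gblup_predict(G, y, train, test, lam):
    """Predict test lines as mu + G[test, train] (G[train, train] + lam I)^-1 (y - mu)."""
    mu = mean(y[i] for i in train)
    a = [[G[i][k] + (lam if i == k else 0.0) for k in train] for i in train]
    alpha = solve(a, [y[i] - mu for i in train])
    return [mu + sum(G[t][i] * w for i, w in zip(train, alpha)) for t in test]


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cross_validated_ability(G, y, folds, lam, seed=0):
    """Mean correlation between predictions and phenotypes over the folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    cors = []
    for f in range(folds):
        test = idx[f::folds]
        train = [i for i in idx if i not in test]
        pred = gblup_predict(G, y, train, test, lam)
        cors.append(pearson(pred, [y[i] for i in test]))
    return mean(cors)


if __name__ == "__main__":
    # Simulated toy data (hypothetical): 40 inbred lines, 200 SNPs.
    rng = random.Random(1)
    geno = [[rng.choice((0, 2)) for _ in range(200)] for _ in range(40)]
    effects = [rng.gauss(0, 1) for _ in range(200)]
    pheno = [sum(g * e for g, e in zip(row, effects)) + rng.gauss(0, 10)
             for row in geno]
    G = genomic_relationship_matrix(geno)
    print(cross_validated_ability(G, pheno, folds=5, lam=1.0))
```

In a real analysis the variance ratio would be estimated from the training data (for example by restricted maximum likelihood) within each fold, and the genotype table would hold millions of SNPs, so an optimized linear algebra library would replace the pure-Python loops used here.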


Published in: Using Whole-Genome Sequence Data to Predict Quantitative Trait Phenotypes in. PLoS Genet 8(5): e1002685. doi:10.1371/journal.pgen.1002685
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1002685


Tags
Genetics, Reproductive medicine

Article published in the journal

PLOS Genetics

2012, Issue 5