Mining the Unknown: A Systems Approach to Metabolite Identification Combining Genetic and Metabolic Information
Recent genome-wide association studies (GWAS) with metabolomics data linked genetic variation in the human genome to differences in individual metabolite levels. A strong relevance of this metabolic individuality for biomedical and pharmaceutical research has been reported. However, a considerable number of the molecules currently quantified by modern metabolomics techniques are chemically unidentified. The identification of these “unknown metabolites” is still a demanding and intricate task, limiting their usability as functional markers of metabolic processes. As a consequence, previous GWAS largely ignored unknown metabolites as metabolic traits for the analysis. Here we present a systems-level approach that combines genome-wide association analysis and Gaussian graphical modeling with metabolomics to predict the identity of the unknown metabolites. We apply our method to original data of 517 metabolic traits, of which 225 are unknowns, and genotyping information on 655,658 genetic variants, measured in 1,768 human blood samples. We report previously undescribed genotype–metabotype associations for six distinct gene loci (SLC22A2, COMT, CYP3A5, CYP2C18, GBA3, UGT3A1) and one locus not related to any known gene (rs12413935). Overlaying the inferred genetic associations, metabolic networks, and knowledge-based pathway information, we derive testable hypotheses on the biochemical identities of 106 unknown metabolites. As a proof of principle, we experimentally confirm nine concrete predictions. We demonstrate the benefit of our method for the functional interpretation of previous metabolomics biomarker studies on liver detoxification, hypertension, and insulin resistance. Our approach is generic in nature and can be directly transferred to metabolomics data from different experimental platforms.
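Gaussian graphical models are built on partial correlations, which separate direct from indirect associations between metabolites. The following is a minimal sketch of that idea only, not the authors' pipeline: the three-metabolite pathway A -> B -> C, the noise levels, and all function names are hypothetical, and the three-variable formula stands in for the full model, which conditions on all other measured variables.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after conditioning on z (three-variable case)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical toy pathway A -> B -> C: each level depends only on its
# immediate precursor plus independent noise (simulated, not real data).
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(2000)]
b = [v + rng.gauss(0, 0.5) for v in a]
c = [v + rng.gauss(0, 0.5) for v in b]

r_ac = pearson(a, c)                  # marginal correlation of A and C
p_ac_given_b = partial_corr(a, c, b)  # A and C after conditioning on B
p_ab_given_c = partial_corr(a, b, c)  # A and B after conditioning on C

print(f"corr(A, C)      = {r_ac:.2f}")
print(f"pcorr(A, C | B) = {p_ac_given_b:.2f}")
print(f"pcorr(A, B | C) = {p_ab_given_c:.2f}")
```

In this toy setting A and C are strongly correlated overall, yet their partial correlation given B is close to zero, while the direct neighbours A and B keep a clear partial correlation. An edge between an unknown and a known metabolite in such a network therefore hints at a direct biochemical relationship rather than a shared upstream cause.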
Published in:
Mining the Unknown: A Systems Approach to Metabolite Identification Combining Genetic and Metabolic Information. PLoS Genet 8(10): e1003005. doi:10.1371/journal.pgen.1003005
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1003005
Tags
Genetics, Reproductive medicine
Article published in the journal
PLOS Genetics
2012, Issue 10