Multi-tissue Analysis of Co-expression Networks by Higher-Order Generalized Singular Value Decomposition Identifies Functionally Coherent Transcriptional Modules
Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data across multiple conditions (e.g., cell types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network modules simultaneously across multiple conditions. Our method can accommodate weighted and unweighted networks of any size and can use either co-expression or raw gene expression data as input, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, our method showed distinct advantages over existing methods: it accurately recovered both common and condition-specific network modules without the ad hoc input parameters that other approaches require. We applied our method to genome-scale, multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance that were not detected by other methods. In humans, we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interaction processes in ventricular zones during neocortex expansion, and we further uncovered previously unappreciated pathways related to the development of later cognitive functions in the cortical plate of the developing brain.
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed heat shock protein (Hsp) and cardiomyopathy genes (Bag3, Cryab, Kras, Emd, Plec), which was significantly replicated using separate failing heart and liver gene expression datasets in humans, thus revealing a conserved functional role for Hsp genes in cardiovascular disease.
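As a rough illustration of the decomposition the method builds on, the following is a minimal NumPy sketch of the Higher-Order Generalized Singular Value Decomposition in the formulation of Ponnapalli et al. (2011). It is not the authors' module-identification procedure. The function name `ho_gsvd`, the assumption that every input matrix has full column rank and the same number of columns, and the random test matrices are all illustrative assumptions.

```python
import numpy as np

def ho_gsvd(matrices):
    """Sketch of the HO GSVD: D_i = U_i diag(sigma_i) V^T with a shared V.

    Assumes each D_i is (m_i x n) with full column rank (hypothetical setup).
    """
    n_mats = len(matrices)
    grams = [d.T @ d for d in matrices]            # A_i = D_i^T D_i
    inverses = [np.linalg.inv(a) for a in grams]
    n = grams[0].shape[0]

    # S = 1/(N(N-1)) * sum over pairs i<j of (A_i A_j^-1 + A_j A_i^-1)
    s = np.zeros((n, n))
    for i in range(n_mats):
        for j in range(i + 1, n_mats):
            s += grams[i] @ inverses[j] + grams[j] @ inverses[i]
    s /= n_mats * (n_mats - 1)

    # The shared right basis V holds the eigenvectors of S (unit-norm columns).
    eigvals, v = np.linalg.eig(s)
    eigvals, v = eigvals.real, v.real
    v /= np.linalg.norm(v, axis=0)

    # Solve D_i = B_i V^T for B_i, then split B_i into U_i and sigma_i.
    v_inv_t = np.linalg.inv(v).T
    us, sigmas = [], []
    for d in matrices:
        b = d @ v_inv_t
        sigma = np.linalg.norm(b, axis=0)          # generalized singular values
        us.append(b / sigma)                       # unit-norm columns of U_i
        sigmas.append(sigma)
    return us, sigmas, v, eigvals

# Synthetic stand-ins for three conditions sharing four columns (not real data).
rng = np.random.default_rng(0)
mats = [rng.standard_normal((m, 4)) for m in (20, 15, 30)]
us, sigmas, v, eigvals = ho_gsvd(mats)
```

In this formulation, a column of V whose eigenvalue is close to 1 carries nearly equal weight in every input matrix, so it is a natural candidate for a pattern common to all conditions, whereas columns whose generalized singular values differ strongly between matrices point to condition-specific structure.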
Published in:
Multi-tissue Analysis of Co-expression Networks by Higher-Order Generalized Singular Value Decomposition Identifies Functionally Coherent Transcriptional Modules. PLoS Genet 10(1): e1004006. doi:10.1371/journal.pgen.1004006
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1004006
Tags
Genetics, Reproductive medicine
Article published in
PLOS Genetics
2014, Issue 1