Integrative Modeling of eQTLs and Cis-Regulatory Elements Suggests Mechanisms Underlying Cell Type Specificity of eQTLs


Genetic variants in cis-regulatory elements or trans-acting regulators frequently influence the quantity and spatiotemporal distribution of gene transcription. Recent interest in expression quantitative trait locus (eQTL) mapping has paralleled the adoption of genome-wide association studies (GWAS) for the analysis of complex traits and disease in humans. Under the hypothesis that many GWAS associations tag non-coding SNPs with small effects, and that these SNPs exert phenotypic control by modifying gene expression, it has become common to interpret GWAS associations using eQTL data. To fully exploit the mechanistic interpretability of eQTL-GWAS comparisons, an improved understanding of the genetic architecture and causal mechanisms of cell type specificity of eQTLs is required. We address this need by performing an eQTL analysis in three parts: first, we identified eQTLs from eleven studies on seven cell types; then, we integrated eQTL data with cis-regulatory element (CRE) data from the ENCODE project; finally, we built a set of classifiers to predict the cell type specificity of eQTLs. The cell type specificity of eQTLs is associated with eQTL SNP overlap with hundreds of cell type-specific CRE classes, including enhancer, promoter, and repressive chromatin marks, regions of open chromatin, and many classes of DNA binding proteins. These associations provide insight into the molecular mechanisms generating the cell type specificity of eQTLs and the mode of regulation of corresponding eQTLs. Using a random forest classifier with cell type-specific CRE-SNP overlap as features, we demonstrate the feasibility of predicting the cell type specificity of eQTLs. We then demonstrate that CREs from a trait-associated cell type can be used to annotate GWAS associations in the absence of eQTL data for that cell type. We anticipate that such integrative, predictive modeling of cell type specificity will improve our ability to understand the mechanistic basis of human complex phenotypic variation.
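
The CRE-SNP overlap features mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the track names, coordinates, and SNP identifiers are hypothetical, all positions are assumed to lie on a single chromosome, and intervals are assumed to be half-open. Each SNP is turned into a 0/1 vector with one entry per (cell type, CRE class) track, which is the kind of input a random forest classifier could take.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end); single chromosome assumed


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def build_index(tracks: Dict[str, List[Interval]]) -> Dict[str, Tuple[List[int], List[int]]]:
    """For each track, store sorted start and end lists of its merged intervals."""
    index = {}
    for name, intervals in tracks.items():
        merged = merge_intervals(intervals)
        index[name] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlap_features(position: int, index: Dict[str, Tuple[List[int], List[int]]]) -> Dict[str, int]:
    """Return 1 for every track whose intervals contain the SNP position, else 0."""
    features = {}
    for name, (starts, ends) in index.items():
        # Rightmost interval starting at or before the position, if any.
        i = bisect_right(starts, position) - 1
        features[name] = int(i >= 0 and position < ends[i])
    return features


# Hypothetical tracks keyed as "<cell type>:<CRE class>".
demo_tracks = {
    "LCL:open_chromatin": [(100, 200), (150, 300)],
    "liver:open_chromatin": [(1000, 1200)],
    "LCL:enhancer_mark": [(250, 400)],
}
# Hypothetical eQTL SNP positions.
demo_snps = {"snp_a": 180, "snp_b": 1100, "snp_c": 300, "snp_d": 5000}

demo_index = build_index(demo_tracks)
demo_matrix = {snp: overlap_features(pos, demo_index) for snp, pos in demo_snps.items()}

for snp, row in demo_matrix.items():
    print(snp, row)
```

Merging intervals once per track keeps each lookup to a single binary search. A real analysis would also need to handle multiple chromosomes and, most likely, SNPs in linkage disequilibrium with the lead eQTL SNP, neither of which this sketch attempts.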


Published in: Integrative Modeling of eQTLs and Cis-Regulatory Elements Suggests Mechanisms Underlying Cell Type Specificity of eQTLs. PLoS Genet 9(8): e1003649. doi:10.1371/journal.pgen.1003649
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1003649


Tags
Genetics, Reproductive medicine

Published in the journal

PLOS Genetics

2013, Issue 8