Quantitative Models of the Mechanisms That Control Genome-Wide Patterns of Transcription Factor Binding during Early Development
Transcription factors that drive complex patterns of gene expression during animal development bind to thousands of genomic regions, with quantitative differences in binding across bound regions mediating their activity. While we now have tools to characterize the DNA affinities of these proteins and to precisely measure their genome-wide distribution in vivo, our understanding of the forces that determine where, when, and to what extent they bind remains primitive. Here we use a thermodynamic model of transcription factor binding to evaluate the contribution of different biophysical forces to the binding of five regulators of early embryonic anterior-posterior patterning in Drosophila melanogaster. Predictions based on DNA sequence and in vitro protein-DNA affinities alone achieve a correlation of ∼0.4 with experimental measurements of in vivo binding. Incorporating cooperativity and competition among the five factors, and accounting for spatial patterning by modeling binding in every nucleus independently, had little effect on prediction accuracy. A major source of error was the prediction of binding events that do not occur in vivo, which we hypothesized reflected reduced accessibility of chromatin. To test this, we incorporated experimental measurements of genome-wide DNA accessibility into our model, effectively restricting predicted binding to regions of open chromatin. This dramatically improved our predictions to a correlation of 0.6–0.9 for various factors across known target genes. Finally, we used our model to quantify the roles of DNA sequence, accessibility, and binding competition and cooperativity. Our results show that, in regions of open chromatin, binding can be predicted almost exclusively by the sequence specificity of individual factors, with a minimal role for protein interactions. 
We suggest that a combination of experimentally determined chromatin accessibility data and simple computational models of transcription factor binding may be used to predict the binding landscape of any animal transcription factor with significant precision.
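To make the central idea concrete, the following is a minimal sketch of a sequence-plus-accessibility occupancy calculation. It is not the authors' model: it treats a single factor with independent sites and ignores cooperativity and competition. The `TAAT` consensus, the scores in `PWM`, the `concentration` parameter, and the function names `occupancy` and `pearson` are all hypothetical choices made for illustration.

```python
import math

# Hypothetical position weight matrix for an illustrative "TAAT" motif.
# Scores are made-up log-affinity-like values, not measured in vitro data.
CONSENSUS = "TAAT"
PWM = [{base: (1.5 if base == c else -1.0) for base in "ACGT"} for c in CONSENSUS]


def occupancy(seq, concentration=1.0, accessibility=None):
    """Return a binding probability for each window of `seq`.

    Each window gets a Boltzmann-like weight exp(score) scaled by factor
    concentration; the probability is weight / (1 + weight). If a per-base
    `accessibility` track (values in [0, 1]) is given, the weight is scaled
    by the mean accessibility of the window, so closed chromatin suppresses
    predicted binding.
    """
    width = len(PWM)
    probs = []
    for start in range(len(seq) - width + 1):
        score = sum(PWM[j][seq[start + j]] for j in range(width))
        weight = concentration * math.exp(score)
        if accessibility is not None:
            weight *= sum(accessibility[start:start + width]) / width
        probs.append(weight / (1.0 + weight))
    return probs


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


seq = "GCGCTAATGCGC"
open_chromatin = occupancy(seq)  # sequence-only prediction
closed = [1.0] * 4 + [0.0] * 4 + [1.0] * 4  # toy track: motif lies in closed chromatin
masked = occupancy(seq, accessibility=closed)  # accessibility-aware prediction
```

In this toy case the sequence-only prediction peaks at the `TAAT` window, while the accessibility-aware prediction drops that window to zero, which mirrors the kind of false-positive binding call the abstract attributes to closed chromatin. A function such as `pearson` could then compare either prediction against a measured binding profile.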
Published in:
Quantitative Models of the Mechanisms That Control Genome-Wide Patterns of Transcription Factor Binding during Early Development. PLoS Genet 7(2): e1001290. doi:10.1371/journal.pgen.1001290
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1001290
The article was published in PLOS Genetics, 2011, Issue 2.