Combining Natural Sequence Variation with High Throughput Mutational Data to Reveal Protein Interaction Sites

Interactions between proteins are essential for almost all biological processes. Many protein contact sites have evolved to maintain these interactions while using different sets of amino acid residues. As a result, the residues at a contact site in a protein from one species may not support the interaction when they are tested in a second species. This property underlies inter-species complementation assays, which test the effect of replacing protein segments from one species with their equivalents from another species. However, this approach has been severely limited in the number of changes that can be analyzed in a single study. Here, we present a novel approach that combines a high-throughput analysis of mutations in a single protein with the set of natural sequences corresponding to evolutionarily divergent variants of this protein. This integration allows us to map, at high resolution, sites of both inter-protein and intra-protein interaction. Our approach can be used with proteins that have limited functional and structural data, and it can be applied to improve the performance of computational tools that use sequence homology to predict function.
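The integration idea in the summary above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the score scale (1.0 taken to mean wild-type-like), the 0.5 cutoff, and the toy sequences are all assumptions made for illustration. The sketch assumes the homolog sequences are already aligned to an ungapped reference, and it flags residues that occur naturally in a homolog but score poorly when introduced into the reference protein.

```python
from collections import defaultdict


def incompatible_natural_substitutions(reference, homologs, dms_scores, threshold=0.5):
    """Flag homolog residues that are deleterious in the reference background.

    reference  -- reference protein sequence (assumed ungapped)
    homologs   -- homolog sequences aligned to the reference ('-' marks a gap)
    dms_scores -- hypothetical mapping {(position, amino_acid): score}, 1-based
                  positions, where about 1.0 is assumed to mean wild-type-like
    threshold  -- assumed cutoff below which a substitution counts as deleterious
    """
    hits = defaultdict(dict)
    for seq in homologs:
        if len(seq) != len(reference):
            raise ValueError("homologs must be aligned to the reference")
        for pos, (ref_aa, hom_aa) in enumerate(zip(reference, seq), start=1):
            # Skip identical residues and alignment gaps.
            if hom_aa == ref_aa or hom_aa == "-":
                continue
            score = dms_scores.get((pos, hom_aa))
            # Substitutions without a measured score are left unclassified.
            if score is not None and score < threshold:
                hits[pos][hom_aa] = score
    return dict(hits)


# Toy data, invented purely for illustration.
reference = "MKTAY"
homologs = ["MKSAY", "MRTAF", "MK-AY"]
dms_scores = {(2, "R"): 0.9, (3, "S"): 0.2, (5, "F"): 0.4}

candidate_sites = incompatible_natural_substitutions(reference, homologs, dms_scores)
print(candidate_sites)
```

Positions returned by such a filter would be candidates only: a residue that is tolerated in another species but not in the reference background may reflect an inter-protein contact, an intra-protein contact, or measurement noise, so any real analysis would need further evidence.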

Published in: Combining Natural Sequence Variation with High Throughput Mutational Data to Reveal Protein Interaction Sites. PLoS Genet 11(2): e1004918. doi:10.1371/journal.pgen.1004918
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1004918
