Correlated Evolution of Nearby Residues in Drosophilid Proteins
Here we investigate the correlations between coding sequence substitutions as a function of their separation along the protein sequence. We consider both substitutions between the reference genomes of several Drosophilids and polymorphisms in a population sample of Zimbabwean Drosophila melanogaster. We find that amino acid substitutions are “clustered” along the protein sequence; that is, the frequency of additional substitutions is strongly enhanced within ≈10 residues of a first such substitution. No such clustering is observed for synonymous substitutions, supporting a “correlation length” associated with selection on proteins as the causative mechanism. Clustering is stronger between substitutions that arose in the same lineage than between substitutions that arose in different lineages. We consider several possible origins of clustering, concluding that epistasis (interactions between amino acids within a protein that affect function) and positional heterogeneity in the strength of purifying selection are primarily responsible. The role of epistasis is directly supported by the tendency of nearby substitutions that arose on the same lineage to preserve the total charge of the residues within the correlation length and by the preferential cosegregation of neighboring derived alleles in our population sample. We interpret the observed length scale of clustering as a statistical reflection of the functional locality (or modularity) of proteins: amino acids that are near each other on the protein backbone are more likely to contribute to, and collaborate toward, a common subfunction.
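The clustering statistic described above can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a simple null model in which the substitutions in each protein are placed uniformly at random among its residues, and it reports, for each separation d, the ratio of observed to expected substitution pairs. The function name `separation_enrichment`, the input layout, and the `max_sep` parameter are hypothetical choices made for this illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def separation_enrichment(proteins, max_sep=30):
    """Ratio of observed to expected substitution pairs at each separation.

    `proteins` is an iterable of (length, positions) tuples, where
    `positions` are 0-based residue indices that carry a substitution.
    The expectation assumes uniform random placement of the substitutions
    within each protein (an assumption of this sketch, not of the paper).
    """
    observed = Counter()
    expected = Counter()
    for length, positions in proteins:
        sites = sorted(set(positions))
        k = len(sites)
        if k < 2 or length < 2:
            continue  # no pairs can be formed in this protein
        # Count observed pairs of substituted sites by their separation.
        for a, b in combinations(sites, 2):
            d = b - a
            if d <= max_sep:
                observed[d] += 1
        # Under uniform placement, a random pair of distinct sites has
        # separation d with probability (length - d) / C(length, 2).
        total_pairs = comb(k, 2)
        site_pairs = comb(length, 2)
        for d in range(1, min(max_sep, length - 1) + 1):
            expected[d] += total_pairs * (length - d) / site_pairs
    return {d: observed[d] / expected[d] for d in sorted(expected)}


if __name__ == "__main__":
    # Toy input: two short "proteins" with substituted positions.
    toy = [(50, [3, 5, 6, 40]), (80, [10, 12, 70])]
    for d, ratio in separation_enrichment(toy, max_sep=5).items():
        print(d, round(ratio, 2))
```

A ratio above 1 at small d, decaying toward 1 at larger d, would be the signature of clustering in this toy statistic. A uniform null of this kind cannot by itself separate epistasis from positional heterogeneity in constraint, which is why the comparison with synonymous substitutions and between lineages matters in the study.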
Published in:
Correlated Evolution of Nearby Residues in Drosophilid Proteins. PLoS Genet 7(2): e1001315. doi:10.1371/journal.pgen.1001315
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1001315
This article appeared in PLOS Genetics, 2011, Issue 2.