
Translational Selection Is Ubiquitous in Prokaryotes


Codon usage bias in prokaryotic genomes is largely a consequence of background substitution patterns in DNA, but highly expressed genes may show a preference towards codons that enable more efficient and/or accurate translation. We introduce a novel approach based on supervised machine learning that detects effects of translational selection on genes, while controlling for local variation in nucleotide substitution patterns represented as sequence composition of intergenic DNA. A cornerstone of our method is a Random Forest classifier that outperformed previous distance measure-based approaches, such as the codon adaptation index, in the task of discerning the (highly expressed) ribosomal protein genes by their codon frequencies. Unlike previous reports, we show evidence that translational selection in prokaryotes is practically universal: in 460 of 461 examined microbial genomes, we find that a subset of genes shows a higher codon usage similarity to the ribosomal proteins than would be expected from the local sequence composition. These genes constitute a substantial part of the genome—between 5% and 33%, depending on genome size—while also exhibiting higher experimentally measured mRNA abundances and tending toward codons that match tRNA anticodons by canonical base pairing. Certain gene functional categories are generally enriched with, or depleted of codon-optimized genes, the trends of enrichment/depletion being conserved between Archaea and Bacteria. Prominent exceptions from these trends might indicate genes with alternative physiological roles; we speculate on specific examples related to detoxication of oxygen radicals and ammonia and to possible misannotations of asparaginyl–tRNA synthetases. 
Since the presence of codon optimizations on genes is a valid proxy for expression levels in fully sequenced genomes, we provide an example of an “adaptome” by highlighting gene functions with expression levels elevated specifically in thermophilic Bacteria and Archaea.
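The classifier described above works on per-gene codon frequencies. As a minimal sketch of what such an input representation could look like, the snippet below computes, for one coding sequence, the relative frequency of each codon among the synonymous codons of its amino acid. The function name `codon_frequencies`, the use of the standard genetic code, and the choice to skip stop codons, Met, and Trp are illustrative assumptions, not details taken from the paper; the Random Forest classifier itself and the intergenic-composition control are not shown.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# organism uses the standard table; '*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_frequencies(seq):
    """Return {codon: frequency among synonymous codons} for one coding sequence.

    Hypothetical helper for illustration: stop codons and the single-codon
    amino acids (Met, Trp) carry no synonymous-choice information and are
    skipped, which leaves 59 features per gene.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Total occurrences of each amino acid, summed over its codons.
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n

    freqs = {}
    for codon, aa in CODON_TABLE.items():
        if aa in "*MW":
            continue
        # Amino acids absent from the gene get frequency 0.0 for all codons.
        freqs[codon] = counts[codon] / totals[aa] if totals[aa] else 0.0
    return freqs


if __name__ == "__main__":
    toy_gene = "GCTGCTGCCAAAAAG"  # Ala, Ala, Ala, Lys, Lys
    features = codon_frequencies(toy_gene)
    print(len(features), features["GCT"], features["AAA"])
```

A vector like this, computed once for each gene and once for the ribosomal protein genes used as positive examples, is the kind of input a supervised classifier could be trained on. How the paper encodes the surrounding intergenic composition is not specified in the abstract and is left out here.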


Published in: Translational Selection Is Ubiquitous in Prokaryotes. PLoS Genet 6(6): e1001004. doi:10.1371/journal.pgen.1001004
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1001004

