Hominoid-Specific Protein-Coding Genes Originating from Long Non-Coding RNAs
Tinkering with pre-existing genes has long been known as a major way to create new genes. Recently, however, motherless protein-coding genes have been found to have emerged de novo from ancestral non-coding DNAs. How these genes originated is not well addressed to date. Here we identified 24 hominoid-specific de novo protein-coding genes with precise origination timing in vertebrate phylogeny. Strand-specific RNA–Seq analyses were performed in five rhesus macaque tissues (liver, prefrontal cortex, skeletal muscle, adipose, and testis), which were then integrated with public transcriptome data from human, chimpanzee, and rhesus macaque. On the basis of comparing the RNA expression profiles in the three species, we found that most of the hominoid-specific de novo protein-coding genes encoded polyadenylated non-coding RNAs in rhesus macaque or chimpanzee with a similar transcript structure and correlated tissue expression profile. According to the rule of parsimony, the majority of these hominoid-specific de novo protein-coding genes appear to have acquired a regulated transcript structure and expression profile before acquiring coding potential. Interestingly, although the expression profile was largely correlated, the coding genes in human often showed higher transcriptional abundance than their non-coding counterparts in rhesus macaque. The major findings we report in this manuscript are robust and insensitive to the parameters used in the identification and analysis of de novo genes. Our results suggest that at least a portion of long non-coding RNAs, especially those with active and regulated transcription, may serve as a birth pool for protein-coding genes, which are then further optimized at the transcriptional level.
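The cross-species comparison of tissue expression profiles described above can be illustrated with a minimal sketch. It assumes a Pearson correlation on log-transformed expression values across the five tissues; the abstract does not state which correlation measure or normalization was used, and the RPKM-like numbers, the pseudocount, and the function names `pearson` and `log_profile` are hypothetical, chosen only for illustration.

```python
import math

# The five tissues named in the abstract.
TISSUES = ["liver", "prefrontal cortex", "skeletal muscle", "adipose", "testis"]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a flat profile
    return cov / math.sqrt(var_x * var_y)


def log_profile(values, pseudocount=1.0):
    """log2-transform expression values; the pseudocount is an assumption."""
    return [math.log2(v + pseudocount) for v in values]


# Hypothetical expression values (one per tissue, same order as TISSUES)
# for a human de novo gene and its non-coding counterpart in rhesus macaque.
human_rpkm = [0.5, 4.0, 0.2, 0.8, 12.0]
macaque_rpkm = [0.1, 1.0, 0.05, 0.2, 3.0]

# Similarity of the tissue profile between the two species.
r = pearson(log_profile(human_rpkm), log_profile(macaque_rpkm))

# Number of tissues in which the human transcript is more abundant.
higher_in_human = sum(h > m for h, m in zip(human_rpkm, macaque_rpkm))

print(f"profile correlation r = {r:.3f}")
print(f"tissues with higher human abundance: {higher_in_human}/{len(TISSUES)}")
```

In this toy setting, a high correlation together with consistently higher human values corresponds to the pattern the abstract reports: a conserved expression profile with increased transcriptional abundance in the coding lineage.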
Published in:
Hominoid-Specific Protein-Coding Genes Originating from Long Non-Coding RNAs. PLoS Genet 8(9): e1002942. doi:10.1371/journal.pgen.1002942
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1002942
Tags
Genetics, Reproductive medicine
Published in the journal
PLOS Genetics
2012, Issue 9