Noisy Splicing Drives mRNA Isoform Diversity in Human Cells
While the majority of multiexonic human genes show some evidence of alternative splicing, it is unclear what fraction of observed splice forms is functionally relevant. In this study, we examine the extent of alternative splicing in human cells using deep RNA sequencing and de novo identification of splice junctions. We demonstrate the existence of a large class of low abundance isoforms, encompassing approximately 150,000 previously unannotated splice junctions in our data. Newly-identified splice sites show little evidence of evolutionary conservation, suggesting that the majority are due to erroneous splice site choice. We show that sequence motifs involved in the recognition of exons are enriched in the vicinity of unconserved splice sites. We estimate that the average intron has a splicing error rate of approximately 0.7% and show that introns in highly expressed genes are spliced more accurately, likely due to their shorter length. These results implicate noisy splicing as an important property of genome evolution.
Published in:
Noisy Splicing Drives mRNA Isoform Diversity in Human Cells. PLoS Genet 6(12): e1001236. doi:10.1371/journal.pgen.1001236
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1001236
Tags
Genetics, Reproductive medicine
Article published in
PLOS Genetics
2010, Issue 12