U87MG Decoded: The Genomic Sequence of a Cytogenetically Aberrant Human Cancer Cell Line
U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. To comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30× genomic sequence coverage using a novel 50-base mate-paired strategy with a 1.4 kb mean insert library. A total of 1,014,984,286 mate-end and 120,691,623 single-end two-base encoded reads were generated from five slides. All data were aligned using a custom-designed tool called BFAST, allowing optimal color-space read alignment and accurate identification of DNA variants. The aligned sequence reads and mate-pair information identified 35 interchromosomal translocation events, 1,315 structural variations (>100 bp), 191,743 small (<21 bp) insertions and deletions (indels), and 2,384,470 single nucleotide variations (SNVs). Among these observations, the known homozygous mutation in PTEN was robustly identified, and genes involved in cell adhesion were overrepresented in the mutated gene list. Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by an Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. Protein-coding sequences in this cancer cell line were disrupted predominantly by small indels, large deletions, and translocations. In total, 512 genes were homozygously mutated, including 154 by SNVs, 178 by small indels, 145 by large microdeletions, and 35 by interchromosomal translocations, revealing a highly mutated cell line genome. Of the small homozygously mutated variants, 8 SNVs and 99 indels were novel events not present in dbSNP. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers.
The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date.
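The read counts and gene tallies quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' pipeline. It rests on two assumptions that the abstract does not state: that each mate-end read denotes a pair of 50-base tags, and that the human genome is roughly 3.1 Gb. The result is a raw (pre-alignment) yield, so it is only an upper bound on mapped coverage.

```python
# Figures quoted in the abstract.
MATE_END_READS = 1_014_984_286
SINGLE_END_READS = 120_691_623
READ_LENGTH = 50  # bases per tag

# Assumptions, not stated in the abstract.
TAGS_PER_MATE_END_READ = 2      # assumed: one mate-end read = two 50-base tags
GENOME_SIZE = 3_100_000_000     # assumed: approximate human genome size in bases

# Homozygously mutated genes by mutation class, as listed in the abstract.
HOMOZYGOUS_GENES = {
    "SNVs": 154,
    "small indels": 178,
    "large microdeletions": 145,
    "interchromosomal translocations": 35,
}


def raw_bases() -> int:
    """Total sequenced bases before alignment or filtering."""
    tags = MATE_END_READS * TAGS_PER_MATE_END_READ + SINGLE_END_READS
    return tags * READ_LENGTH


def raw_fold_coverage() -> float:
    """Raw bases divided by the assumed genome size."""
    return raw_bases() / GENOME_SIZE


def total_homozygous_genes() -> int:
    """Sum of the per-class gene counts."""
    return sum(HOMOZYGOUS_GENES.values())


if __name__ == "__main__":
    print(f"raw yield: {raw_bases() / 1e9:.1f} Gb")
    print(f"raw fold coverage: {raw_fold_coverage():.1f}x")
    print(f"homozygously mutated genes: {total_homozygous_genes()}")
```

Under these assumptions the four mutation classes sum to the 512 genes reported, and the raw yield is consistent with the stated coverage of greater than 30×.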
Published in:
U87MG Decoded: The Genomic Sequence of a Cytogenetically Aberrant Human Cancer Cell Line. PLoS Genet 6(1): e1000832. doi:10.1371/journal.pgen.1000832
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1000832
Tags
Genetics, Reproductive Medicine
Article published in
PLOS Genetics
2010, Issue 1