
Digital Genotyping of Macrosatellites and Multicopy Genes Reveals Novel Biological Functions Associated with Copy Number Variation of Large Tandem Repeats


Here we use NanoString digital assays and demonstrate their utility for estimating the copy number of 186 multicopy genes and tandem repeats. By analyzing patterns of single nucleotide variation around these loci, we show that copy number variation at the vast majority of tandem repeats is not effectively tagged by nearby SNPs, and thus standard genome-wide association studies that focus on SNPs provide little or no information about such variants. By comparing tandem repeat copy number with variation in local gene expression and DNA methylation, we also identify extensive effects on local genome function. These include a non-coding macrosatellite repeat whose expansion exerts a repressive effect on a nearby gene, accompanied by accumulation of local DNA methylation. Finally, comparison of diverse human populations with a number of primate genomes shows that many of these sequences have undergone extreme changes in copy number during recent human and primate evolution, and show signatures that suggest possible selective effects. Overall, we conclude that multicopy genes and macrosatellites represent a highly variable fraction of the genome with important functional effects that has been systematically ignored by previous studies.
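The notion of a variant being "tagged" by nearby SNPs can be illustrated with a small sketch. The code below is not the authors' pipeline; it assumes, for illustration only, that tagging is measured as the squared Pearson correlation (r²) between per-sample SNP allele dosage (0, 1 or 2) and per-sample repeat copy number. The function names `pearson_r2` and `best_tag_snp`, the SNP identifiers and the toy data are all hypothetical.

```python
def pearson_r2(xs, ys):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A monomorphic SNP (or invariant copy number) carries no tagging information.
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def best_tag_snp(copy_numbers, snp_dosages):
    """Return (snp_id, r2) for the flanking SNP most correlated with copy number.

    copy_numbers: per-sample copy number estimates for one repeat locus.
    snp_dosages:  dict mapping a SNP id to per-sample allele dosages (0/1/2),
                  in the same sample order as copy_numbers.
    """
    best_id, best_r2 = None, 0.0
    for snp_id, dosages in snp_dosages.items():
        r2 = pearson_r2(dosages, copy_numbers)
        if best_id is None or r2 > best_r2:
            best_id, best_r2 = snp_id, r2
    return best_id, best_r2


# Toy data (hypothetical): six samples, two flanking SNPs.
copy_numbers = [10, 12, 14, 10, 12, 14]
snp_dosages = {
    "snp_a": [0, 1, 2, 0, 1, 2],  # tracks copy number perfectly
    "snp_b": [0, 0, 1, 1, 2, 2],  # only weakly related to copy number
}
tag_id, tag_r2 = best_tag_snp(copy_numbers, snp_dosages)
```

Under this assumed definition, a repeat would count as poorly tagged when even its best flanking SNP has a low r²; a cutoff such as r² ≥ 0.8 is a common convention in association studies, but the threshold used here would be a choice, not something stated in the abstract.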


Published in: Digital Genotyping of Macrosatellites and Multicopy Genes Reveals Novel Biological Functions Associated with Copy Number Variation of Large Tandem Repeats. PLoS Genet 10(6): e1004418. doi:10.1371/journal.pgen.1004418
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1004418




Tags
Genetics, Reproductive Medicine

Article published in the journal

PLOS Genetics


2014, Issue 6