Systematic Detection of Epistatic Interactions Based on Allele Pair Frequencies
Epistatic genetic interactions are key to understanding the genetic contribution to complex traits. Epistasis is always defined with respect to some trait, such as growth rate or fitness. Whereas most existing epistasis screens explicitly test for a trait, it is also possible to test implicitly for fitness traits by searching for the over- or under-representation of allele pairs in a given population. Such analysis of imbalanced allele pair frequencies at distant loci has not yet been exploited on a genome-wide scale, mostly because of statistical difficulties such as the multiple testing problem. We propose a new approach, called Imbalanced Allele Pair frequencies (ImAP), for inferring epistatic interactions that is based exclusively on DNA sequence information. Our approach uses genome-wide SNP data sampled from a population with known family structure. We make use of genotype information from parent-child trios and inspect 3×3 contingency tables to detect pairs of alleles from different genomic positions that are over- or under-represented in the population. We also developed a simulation setup that mimics the pedigree structure while assuming independence of the markers. When applied to mouse SNP data, our method detected 168 imbalanced allele pairs, substantially more than in simulations assuming no interactions. We could validate a significant number of the interactions with external data, and we found that interacting loci are enriched for genes involved in developmental processes.
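To make the idea of a 3×3 genotype-pair table concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the ImAP procedure itself: it uses a plain Pearson chi-square test of independence, it ignores the family structure and trio information that the approach described above relies on, and it applies no multiple testing correction. The genotype coding (0, 1, 2 copies of one allele) and the function names `contingency_table` and `chi_square_independence` are assumptions made for this example.

```python
import math
from collections import Counter

# Assumed coding: genotype = number of copies of one allele (0, 1 or 2).
GENOTYPES = (0, 1, 2)


def contingency_table(locus_a, locus_b):
    """Count genotype pairs observed at two loci into a 3x3 table.

    locus_a[k] and locus_b[k] are the genotypes of individual k.
    """
    counts = Counter(zip(locus_a, locus_b))
    return [[counts[(a, b)] for b in GENOTYPES] for a in GENOTYPES]


def chi_square_independence(table):
    """Pearson chi-square statistic and p-value for a 3x3 table.

    Assumes every genotype occurs at least once at each locus, so that
    the test has (3 - 1) * (3 - 1) = 4 degrees of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("each genotype must be observed at both loci")
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two loci were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # Closed-form survival function of the chi-square distribution, df = 4.
    p_value = math.exp(-stat / 2) * (1 + stat / 2)
    return stat, p_value


if __name__ == "__main__":
    # Toy data (invented for illustration): perfectly coupled genotypes.
    a = [0, 0, 1, 1, 2, 2, 0, 1, 2]
    b = [0, 0, 1, 1, 2, 2, 0, 1, 2]
    table = contingency_table(a, b)
    print(table, chi_square_independence(table))
```

In a genome-wide screen, a test like this would be run for a very large number of locus pairs, which is where the multiple testing problem mentioned above arises. With samples this small the chi-square approximation is unreliable, so the toy data only show the mechanics.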
Published in:
Systematic Detection of Epistatic Interactions Based on Allele Pair Frequencies. PLoS Genet 8(2): e1002463. doi:10.1371/journal.pgen.1002463
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1002463
Tags
Genetics, Reproductive medicine
Article published in the journal
PLOS Genetics
2012, Issue 2