A General Approach for Haplotype Phasing across the Full Spectrum of Relatedness


Every individual carries two copies of each chromosome (haplotypes), one inherited from each parent, each consisting of a long sequence of alleles. Modern genotyping technologies do not measure haplotypes directly; they measure only the combined sum (or genotype) of the alleles at each site. Statistical methods are therefore needed to infer (or phase) the haplotypes from the observed genotypes. Haplotype estimation is a key first step in many disease and population genetic studies. Much recent work in this area has focused on phasing cohorts of nominally unrelated individuals. So-called ‘long range phasing’ is a relatively recent concept for phasing individuals with intermediate levels of relatedness, such as cohorts taken from population isolates. Methods also exist for phasing genotypes of individuals within explicit pedigrees. Whilst high-quality phasing techniques are available for each of these demographic scenarios, to date no single method is applicable to all three. In this paper, we present a general approach for phasing cohorts that contain any level of relatedness between the study individuals. We demonstrate high levels of accuracy in all demographic scenarios, as well as the ability to detect (Mendelian consistent) genotyping errors and recombination events in duos and trios; this is the first method with such a capability.
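As a toy illustration of why phasing is a statistical inference problem (this is not the method presented in the paper), the sketch below codes alleles as 0/1, forms a genotype as the per-site sum of two haplotypes, and then enumerates every haplotype pair consistent with that genotype. The function names and the 0/1 allele coding are assumptions made for this example only.

```python
from itertools import product


def genotype(hap_a, hap_b):
    """Per-site genotype: the sum of the two haplotype alleles (0, 1 or 2)."""
    return [a + b for a, b in zip(hap_a, hap_b)]


def consistent_haplotype_pairs(geno):
    """Enumerate every unordered haplotype pair whose allele sums equal geno."""
    # Only heterozygous sites (genotype 1) are ambiguous.
    het_sites = [i for i, g in enumerate(geno) if g == 1]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites are fixed: genotype 0 -> allele 0, genotype 2 -> allele 1.
        hap_a = [g // 2 for g in geno]
        hap_b = list(hap_a)
        # At each heterozygous site, one haplotype gets 1 and the other gets 0.
        for site, bit in zip(het_sites, bits):
            hap_a[site], hap_b[site] = bit, 1 - bit
        # frozenset makes the pair unordered, so (a, b) and (b, a) count once.
        pairs.add(frozenset((tuple(hap_a), tuple(hap_b))))
    return pairs


if __name__ == "__main__":
    true_a, true_b = [0, 1, 1, 0], [1, 1, 0, 0]
    geno = genotype(true_a, true_b)
    for pair in consistent_haplotype_pairs(geno):
        print(geno, "<-", sorted(pair))
```

In this toy setting a genotype with k heterozygous sites is compatible with 2^(k-1) distinct haplotype pairs (for k ≥ 1), so the observed genotype alone cannot identify the true pair; additional information, such as haplotypes shared with other individuals in the cohort, is needed to choose among them.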


Published in: A General Approach for Haplotype Phasing across the Full Spectrum of Relatedness. PLoS Genet 10(4): e1004234. doi:10.1371/journal.pgen.1004234
Category: Research Article
DOI: https://doi.org/10.1371/journal.pgen.1004234



The article was published in

PLOS Genetics


2014, Issue 4