Population Structure of Hispanics in the United States: The Multi-Ethnic Study of Atherosclerosis
Using ∼60,000 SNPs selected for minimal linkage disequilibrium, we perform population structure analysis of 1,374 unrelated Hispanic individuals from the Multi-Ethnic Study of Atherosclerosis (MESA), with self-identification corresponding to Central America (n = 93), Cuba (n = 50), the Dominican Republic (n = 203), Mexico (n = 708), Puerto Rico (n = 192), and South America (n = 111). By projection of principal components (PCs) of ancestry to samples from the HapMap phase III and the Human Genome Diversity Panel (HGDP), we show the first two PCs quantify the Caucasian, African, and Native American origins, while the third and fourth PCs bring out an axis that aligns with known South-to-North geographic location of HGDP Native American samples and further separates MESA Mexican versus Central/South American samples along the same axis. Using k-means clustering computed from the first four PCs, we define four subgroups of the MESA Hispanic cohort that show close agreement with self-identification, labeling the clusters as primarily Dominican/Cuban, Mexican, Central/South American, and Puerto Rican. To demonstrate our recommendations for genetic analysis in the MESA Hispanic cohort, we present pooled and stratified association analysis of triglycerides for selected SNPs in the LPL and TRIB1 gene regions, previously reported in GWAS of triglycerides in Caucasians but as yet unconfirmed in Hispanic populations. We report statistically significant evidence for genetic association in both genes, and we further demonstrate the importance of considering population substructure and genetic heterogeneity in genetic association studies performed in the United States Hispanic population.
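The clustering step described above (k-means on the first four PCs, yielding four subgroups) can be illustrated with a minimal sketch. It assumes per-individual PC scores are already available and uses synthetic scores rather than MESA data. The function names (`kmeans`, `simulate_pc_scores`, `farthest_first_centroids`), the cluster centers, and the farthest-first initialization are illustrative assumptions, not the authors' implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centroids(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start at the first point, then repeatedly add the point that is
    farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid recomputation until the labels stop changing."""
    centroids = farthest_first_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids


def simulate_pc_scores(n_per_group=25, sd=0.005, seed=0):
    """Synthetic stand-in for PC1..PC4 scores: four well-separated
    hypothetical groups. The centers are arbitrary, not estimates."""
    centers = [
        (0.10, 0.00, 0.00, 0.00),
        (-0.10, 0.00, 0.00, 0.00),
        (0.00, 0.10, 0.05, 0.00),
        (0.00, 0.10, -0.05, 0.05),
    ]
    rng = random.Random(seed)
    points, truth = [], []
    for group, center in enumerate(centers):
        for _ in range(n_per_group):
            points.append(tuple(rng.gauss(m, sd) for m in center))
            truth.append(group)
    return points, truth


points, truth = simulate_pc_scores()
labels, centroids = kmeans(points, k=4)

# Cross-tabulate simulated group against assigned cluster, analogous to
# comparing self-identified origin with the genetically defined cluster.
table = {}
for t, lab in zip(truth, labels):
    table[(t, lab)] = table.get((t, lab), 0) + 1
for key in sorted(table):
    print(key, table[key])
```

In a real analysis the input would be the PC scores computed from the genotype data, and the cross-tabulation would compare cluster labels with self-identified origin. Cluster numbering is arbitrary, so agreement should be judged from the table rather than from label equality.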
Published in:
Population Structure of Hispanics in the United States: The Multi-Ethnic Study of Atherosclerosis. PLoS Genet 8(4): e1002640. doi:10.1371/journal.pgen.1002640
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1002640