Identification, Replication, and Functional Fine-Mapping of
Expression Quantitative Trait Loci in Primary Human Liver Tissue
The discovery of expression quantitative trait loci (“eQTLs”) can
help to unravel genetic contributions to complex traits. We identified genetic
determinants of human liver gene expression variation using two independent
collections of primary tissue profiled with Agilent
(n = 206) and Illumina (n = 60)
expression arrays and Illumina SNP genotyping (550K), and we also incorporated
data from a published study (n = 266). We found that
∼30% of SNP-expression correlations in one study failed to replicate
in either of the others, even at thresholds yielding high reproducibility in
simulations, and we quantified numerous factors affecting reproducibility. Our
data suggest that drug exposure, clinical descriptors, and unknown factors
associated with tissue ascertainment and analysis have substantial effects on
gene expression and that controlling for hidden confounding variables
significantly increases replication rate. Furthermore, we found that
reproducible eQTL SNPs were heavily enriched near gene starts and ends, and
subsequently resequenced the promoters and 3′UTRs for 14 genes and tested
the identified haplotypes using luciferase assays. For three genes, significant
haplotype-specific in vitro functional differences correlated
directly with expression levels, suggesting that many bona fide
eQTLs result from functional variants that can be mechanistically isolated in a
high-throughput fashion. Finally, given our study design, we were able to
discover and validate hundreds of liver eQTLs. Many of these relate directly to
complex traits for which liver-specific analyses are likely to be relevant, and
we identified dozens of potential connections with disease-associated loci.
These included previously characterized eQTL contributors to diabetes, drug
response, and lipid levels, and they suggest novel candidates such as a role for
NOD2 expression in leprosy risk and
C2orf43 in prostate cancer. In general, the work presented
here will be valuable for future efforts to precisely identify and functionally
characterize genetic contributions to a variety of complex traits.
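As a rough illustration of the abstract's claim that controlling for hidden confounders improves replication, the following minimal sketch tests one SNP-expression association in two simulated cohorts, with and without adjusting for a confounder. Everything here is an assumption made for illustration and not the authors' pipeline: the function names `eqtl_fit`, `simulate_study`, and `replicates`, the effect size, the allele frequency, the |t| >= 3 replication rule, and the fact that the confounder is observed directly. In real data the confounder would have to be estimated, for example by surrogate variable analysis. Only the cohort sizes (206 and 60) are taken from the text.

```python
import numpy as np


def eqtl_fit(genotype, expression, covariates=None):
    """Ordinary least-squares fit of expression on genotype dosage.

    Returns (slope, t_statistic) for the genotype term. Optional covariates
    (one column per covariate) are included in the model as adjustments.
    """
    y = np.asarray(expression, dtype=float)
    n = y.shape[0]
    cols = [np.ones(n), np.asarray(genotype, dtype=float)]
    if covariates is not None:
        cov = np.asarray(covariates, dtype=float).reshape(n, -1)
        cols.extend(cov.T)
    X = np.column_stack(cols)
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = n - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1], beta[1] / se


def simulate_study(n, effect, rng, confounder_sd=2.0, maf=0.3):
    """Simulate one cohort: dosage 0/1/2, a confounder, and expression."""
    genotype = rng.binomial(2, maf, size=n)
    hidden = rng.normal(0.0, confounder_sd, size=n)  # e.g. drug exposure
    expression = effect * genotype + hidden + rng.normal(0.0, 1.0, size=n)
    return genotype, expression, hidden


def replicates(t_a, t_b, threshold=3.0):
    """Hypothetical rule: same direction and |t| >= threshold in both."""
    same_sign = np.sign(t_a) == np.sign(t_b)
    return bool(same_sign and min(abs(t_a), abs(t_b)) >= threshold)


rng = np.random.default_rng(0)
studies = {
    "cohort A (n = 206)": simulate_study(206, 0.5, rng),
    "cohort B (n = 60)": simulate_study(60, 0.5, rng),
}

t_raw, t_adj = [], []
for label, (g, e, h) in studies.items():
    _, t0 = eqtl_fit(g, e)                 # no adjustment
    _, t1 = eqtl_fit(g, e, covariates=h)   # adjusted for the confounder
    t_raw.append(t0)
    t_adj.append(t1)
    print(f"{label}: t unadjusted = {t0:.2f}, t adjusted = {t1:.2f}")

print("replicates without adjustment:", replicates(*t_raw))
print("replicates with adjustment:   ", replicates(*t_adj))
```

Adjustment removes confounder-driven variance from the residual, which shrinks the standard error of the genotype slope. In this toy setting, that is the mechanism by which a true association becomes more likely to pass the same threshold in both cohorts.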
Published in:
Identification, Replication, and Functional Fine-Mapping of
Expression Quantitative Trait Loci in Primary Human Liver Tissue. PLoS Genet 7(5): e1002078. doi:10.1371/journal.pgen.1002078
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1002078
Article published in the journal PLOS Genetics, 2011, Issue 5