The Limits of Individual Identification from Sample Allele Frequencies: Theory and Statistical Analysis
Experimental data recently showed that, under certain conditions, it is possible to determine whether a person with known genotypes at a number of markers was part of a sample for which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is approximately proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression methods for analysing such data. We show that these two methods have similar statistical properties and that both perform better, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.
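The scaling stated in the abstract can be illustrated with a small simulation. The sketch below is not the likelihood or regression method proposed in the paper. It assumes independent SNPs in Hardy-Weinberg equilibrium, population allele frequencies known without error, and a simple standardized statistic chosen only for illustration. The names `membership_z` and `mean_z` are hypothetical. Under these assumptions the statistic has a mean near 0 for a non-member and near sqrt(m/n) for a member, where m is the number of SNPs and n the number of individuals in the sample, so its square grows roughly in proportion to m/n.

```python
import math
import random


def membership_z(m, n, member, rng):
    """Standardized membership statistic for one simulated individual.

    m: number of independent SNPs (assumption: no linkage disequilibrium)
    n: number of individuals in the sample that yields allele frequencies
    member: True if the tested individual is one of the n sampled people
    """
    total = 0.0
    for _ in range(m):
        # Population allele frequency, assumed known exactly (illustrative range).
        p = rng.uniform(0.1, 0.9)
        # Genotypes as allele counts 0, 1, 2 under Hardy-Weinberg equilibrium.
        sample = [(rng.random() < p) + (rng.random() < p) for _ in range(n)]
        p_hat = sum(sample) / (2 * n)  # sample allele frequency
        if member:
            y = sample[0] / 2  # tested person is in the sample
        else:
            y = ((rng.random() < p) + (rng.random() < p)) / 2  # independent person
        v = p * (1 - p) / 2  # variance of an individual's allele fraction
        # Product of deviations, scaled to unit variance for a non-member.
        total += (y - p) * (p_hat - p) * math.sqrt(n) / v
    return total / math.sqrt(m)


def mean_z(m, n, member, reps=5, seed=1):
    """Average the statistic over a few simulated replicates."""
    rng = random.Random(seed)
    return sum(membership_z(m, n, member, rng) for _ in range(reps)) / reps


if __name__ == "__main__":
    for m, n in [(1000, 25), (1000, 100), (4000, 100)]:
        print(m, n, "expected member mean ~", round(math.sqrt(m / n), 2),
              "simulated member:", round(mean_z(m, n, True), 2),
              "simulated non-member:", round(mean_z(m, n, False), 2))
```

Quadrupling the sample size at a fixed number of SNPs should roughly halve the member mean, and quadrupling the number of SNPs should restore it, which is the m/n trade-off described above.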
Published in:
The Limits of Individual Identification from Sample Allele Frequencies: Theory and Statistical Analysis. PLoS Genet 5(10): e1000628. doi:10.1371/journal.pgen.1000628
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pgen.1000628
Tags
Genetics, Reproductive medicine
Article published in
PLOS Genetics
2009, Issue 10