Sample size issues in multilevel logistic regression models
Authors:
Amjad Ali aff001; Sabz Ali aff001; Sajjad Ahmad Khan aff001; Dost Muhammad Khan aff002; Kamran Abbas aff003; Alamgir Khalil aff004; Sadaf Manzoor aff001; Umair Khalil aff002
Author affiliations:
aff001: Department of Statistics, Islamia College, Peshawar, Pakistan
aff002: Department of Statistics, Abdul Wali Khan University Mardan, Pakistan
aff003: Department of Statistics, University of Azad Jammu & Kashmir, Muzaffarabad, Pakistan
aff004: Department of Statistics, University of Peshawar, Pakistan
Published in:
PLoS ONE 14(11)
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pone.0225427
Abstract
Educational researchers, psychologists, and social, epidemiological and medical scientists often deal with multilevel data. Sometimes the response variable in multilevel data is categorical and needs to be analyzed with multilevel logistic regression models. The main aim of this paper is to provide guidelines for analysts on selecting an appropriate sample size when fitting multilevel logistic regression models under different threshold parameters and different estimation methods. Simulation studies were performed to obtain the optimum sample size for the Penalized Quasi-likelihood (PQL) and Maximum Likelihood (ML) methods of estimation. Our results suggest that the ML method performs better than the PQL method and requires a relatively smaller sample under the chosen conditions. To achieve sufficient accuracy of the fixed and random effects under the ML method, we established the "50/50" and "120/50" rules, respectively. On the basis of our findings, "50/60" and "120/70" rules are also recommended under the PQL method of estimation.
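The kind of data such a simulation study works with can be sketched as follows. This is a minimal illustration only, assuming a two-level random-intercept logistic model with a single standard-normal covariate; the function names, the parameter values (`beta0`, `beta1`, `sigma_u`) and the choice of 50 groups of 50 observations are assumptions made for the example and are not taken from the paper's simulation design or from its definition of the sample size rules.

```python
import math
import random


def simulate_two_level_logistic(n_groups, group_size, beta0=0.0, beta1=1.0,
                                sigma_u=1.0, seed=0):
    """Simulate binary outcomes from a random-intercept logistic model.

    logit P(y_ij = 1) = beta0 + beta1 * x_ij + u_j,  u_j ~ N(0, sigma_u^2)
    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        u_j = rng.gauss(0.0, sigma_u)  # group-level random intercept
        for _ in range(group_size):
            x = rng.gauss(0.0, 1.0)  # individual-level covariate
            eta = beta0 + beta1 * x + u_j  # linear predictor
            p = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
            y = 1 if rng.random() < p else 0  # Bernoulli draw
            rows.append((j, x, y))
    return rows


def latent_icc(sigma_u):
    """Intraclass correlation on the latent logistic scale.

    Uses the standard logistic residual variance pi^2 / 3.
    """
    return sigma_u ** 2 / (sigma_u ** 2 + math.pi ** 2 / 3.0)


# Hypothetical design: 50 groups with 50 observations each.
data = simulate_two_level_logistic(n_groups=50, group_size=50)
print(len(data), latent_icc(1.0))
```

In a full study, each simulated data set would then be fitted with the estimation method under comparison (for example PQL or ML in a mixed-model package), and the bias and coverage of the fixed and random effect estimates would be summarized over many replications for each combination of number of groups and group size.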
Keywords:
Simulation and modeling – Psychological and psychosocial issues – Analysis of variance – Normal distribution – Statistical models – Psychologists – Generalized linear model – Social epidemiology
Article published in the journal
PLOS One
2019, Issue 11