Adapting cognitive diagnosis computerized adaptive testing item selection rules to traditional item response theory
Authors:
Miguel A. Sorrel aff001; Juan R. Barrada aff002; Jimmy de la Torre aff003; Francisco José Abad aff001
Author affiliations:
aff001: Department of Social Psychology and Methodology, Universidad Autónoma de Madrid, Spain
aff002: Department of Psychology and Sociology, Universidad de Zaragoza, Spain
aff003: Faculty of Education, The University of Hong Kong, Hong Kong
Published in:
PLoS ONE 15(1)
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pone.0227196
Abstract
Currently, there are two predominant approaches in adaptive testing. One, referred to as cognitive diagnosis computerized adaptive testing (CD-CAT), is based on cognitive diagnosis models; the other, traditional CAT, is based on item response theory. The present study evaluates the performance of two item selection rules (ISRs) originally developed in the CD-CAT framework, the double Kullback-Leibler information (DKL) and the generalized deterministic inputs, noisy “and” gate model discrimination index (GDI), in the context of traditional CAT. Using a simulation study, the accuracy and test security associated with these two ISRs are compared to those of the point Fisher information and the weighted KL. The impact of the trait level estimation method is also investigated. The results show that the new ISRs, particularly DKL, could be used to improve the accuracy of CAT. The better accuracy of DKL is achieved at the expense of a higher item overlap rate. Differences among the item selection rules become smaller as the test gets longer. The two CD-CAT ISRs select different types of items: DKL selects items with the highest possible discrimination (a) parameter, and GDI selects items with the lowest possible pseudo-guessing (c) parameter. Regarding the trait level estimator, the expected a posteriori method is generally better in the first stages of the CAT, and it converges with the maximum likelihood method when a medium to large number of items is administered. The use of DKL can be recommended in low-stakes settings where test security is less of a concern.
Keywords:
Simulation and modeling – Algorithms – Probability distribution – Personality traits – Psychometrics – Bayesian method – Personality tests – Monte Carlo method
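To make the baseline in the abstract concrete, the following is a minimal sketch of point Fisher information item selection combined with expected a posteriori (EAP) trait estimation. It is an illustration under stated assumptions, not the authors' implementation: it assumes a three-parameter logistic model with scaling constant D = 1, a standard normal prior evaluated on an evenly spaced grid, and a hypothetical three-item bank. All function names (`p3pl`, `fisher_info`, `select_item`, `eap`) are invented for this example.

```python
import math


def p3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model (assumes D = 1)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def fisher_info(theta, a, b, c):
    """Point Fisher information of a single 3PL item at trait level theta."""
    p = p3pl(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_item(theta_hat, bank, administered):
    """Index of the unused item with maximum information at theta_hat."""
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: fisher_info(theta_hat, *bank[i]))


def eap(responses, items, grid_min=-4.0, grid_max=4.0, n_points=81):
    """EAP estimate: posterior mean of theta on a grid (assumed N(0, 1) prior)."""
    step = (grid_max - grid_min) / (n_points - 1)
    num = den = 0.0
    for k in range(n_points):
        theta = grid_min + k * step
        weight = math.exp(-0.5 * theta ** 2)  # unnormalized prior density
        for u, (a, b, c) in zip(responses, items):
            p = p3pl(theta, a, b, c)
            weight *= p if u == 1 else 1.0 - p  # likelihood of response u
        num += theta * weight
        den += weight
    return num / den


# Hypothetical item bank: one (a, b, c) tuple per item.
bank = [(1.0, 0.0, 0.2), (2.0, 0.0, 0.2), (2.0, 3.0, 0.2)]

# Start at theta = 0, pick the most informative item, then update the estimate.
first = select_item(0.0, bank, administered=set())
theta_after_correct = eap([1], [bank[first]])
theta_after_wrong = eap([0], [bank[first]])
```

In this sketch the rule picks the item whose difficulty is closest to the current estimate among the highly discriminating items, and the EAP estimate moves up after a correct response and down after an incorrect one. Because the prior keeps the posterior mean finite even after a single response, EAP is usable from the first item, which is one plausible reading of why it performs well early in the test.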
This article was published in
PLOS One
2020, Issue 1