Research Article

Exploring trends in psychometrics literature through a structural topic model

Year 2025, Volume: 12 Issue: 4, 942 - 962
https://doi.org/10.21449/ijate.1653549

Abstract

The digitalization of knowledge has made it increasingly challenging to find and discover relevant information, driving the development of computational tools that help organize, search, and comprehend vast bodies of text. In fields such as psychometrics, which produce large volumes of literature, a comprehensive examination of research trends, together with an understanding of how prominent various themes are and how they evolve over time, is essential for assessing the dynamic structure of the field. This study aims to explore the themes addressed in publications from eleven leading psychometrics journals and to determine the overall distribution of topics. To this end, structural topic modeling was employed. An analysis of 8,523 article abstracts retrieved from the Web of Science database revealed fourteen distinct topics within the publications. “Scale Development and Validation” emerged as the most prominent topic, whereas “Differential Item Functioning” was the least prominent. The distribution of topics across academic journals underscores the key role journals play in shaping the development and evolution of psychometric research. Further exploration of topic correlations revealed potential future research directions and interdisciplinary research areas. This study serves as a valuable resource for researchers aiming to keep up with the latest advancements in psychometrics, and its findings offer insights that can guide and shape future research in the field.
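For readers who want to experiment with the approach, the study's method has a reference implementation in the R package stm (Roberts et al., 2019). The sketch below is only a rough, hypothetical approximation in Python: it substitutes plain latent Dirichlet allocation (Blei et al., 2003) for the structural topic model, and the input file name, column name, and preprocessing settings are illustrative assumptions rather than the authors' actual configuration.

```python
# Hypothetical sketch: fit a 14-topic model to a corpus of abstracts.
# Plain LDA stands in here for structural topic modeling (STM); the
# reference STM implementation is the R package "stm", not this code.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Assumed input: a CSV of abstracts with a column named "abstract".
docs = pd.read_csv("abstracts.csv")["abstract"].dropna().tolist()

# Bag-of-words preprocessing: lowercase, drop English stop words,
# keep only terms appearing in at least 5 documents.
vectorizer = CountVectorizer(lowercase=True, stop_words="english", min_df=5)
dtm = vectorizer.fit_transform(docs)

# K = 14 mirrors the number of topics the study reports finding.
lda = LatentDirichletAllocation(n_components=14, random_state=42)
doc_topics = lda.fit_transform(dtm)  # rows: per-document topic proportions

# Label each topic by its ten highest-weight terms.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = weights.argsort()[::-1][:10]
    print(f"Topic {k + 1}: {', '.join(terms[i] for i in top)}")
```

Unlike plain LDA, a structural topic model additionally lets document-level metadata (for example, journal or publication year) covary with topic prevalence, which is what enables comparisons of topic distributions across journals and over time.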

References

  • Ackerman, T.A., Bandalos, D.L., Briggs, D.C., Everson, H.T., Ho, A.D., Lottridge, S.M., Madison, M.J., Sinharay, S., Rodriguez, M.C., Russell, M., Von Davier, A.A., & Wind, S.A. (2023). Foundational competencies in educational measurement. Educational Measurement: Issues and Practice, 43(3), 7–17. https://doi.org/10.1111/emip.12581
  • Anderson, D., Rowley, B., Stegenga, S., Irvin, P.S., & Rosenberg, J.M. (2020). Evaluating content‐related validity evidence using a text‐based machine learning procedure. Educational Measurement: Issues and Practice, 39(4), 53–64. https://doi.org/10.1111/emip.12314
  • Bai, X., Zhang, X., Li, K.X., Zhou, Y., & Yuen, K.F. (2021). Research topics and trends in the maritime transport: A structural topic model. Transport Policy, 102, 11–24. https://doi.org/10.1016/j.tranpol.2020.12.013
  • Banks, G.C., Woznyj, H.M., Wesslen, R.S., & Ross, R.L. (2018). A review of best practice recommendations for text analysis in R (and a user-friendly app). Journal of Business and Psychology, 33(4), 445–459. https://doi.org/10.1007/s10869-017-9528-3
  • Bastola, M. N., & Hu, G. (2021). Chasing my supervisor all day long like a hungry child seeking her mother!: Students’ perceptions of supervisory feedback. Studies in Educational Evaluation, 70, 101055. https://doi.org/10.1016/j.stueduc.2021.101055
  • Blanca, M.J., Alarcón, R., Arnau, J., Bono, R., & Bendayan, R. (2018). Non-normal data: Is ANOVA still a valid option? Psicothema, 30(4), 552–557. https://doi.org/10.7334/psicothema2018.245
  • Blei, D.M., Ng, A.Y., & Jordan, M.I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022. https://doi.org/10.5555/944919.944937
  • Blei, D.M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84. https://doi.org/10.1145/2133806.2133826
  • Boon-Itt, S., & Skunkan, Y. (2020). Public perception of the COVID-19 pandemic on Twitter: Sentiment analysis and topic modeling study. JMIR Public Health and Surveillance, 6(4), e21978.
  • Briggs, D.C. (2024). Strive for measurement, set new standards, and try not to be evil. Journal of Educational and Behavioral Statistics, 49(5), 694–701. https://doi.org/10.3102/10769986241238479
  • Borsboom, D. (2005). Measuring the mind: Conceptual issues in contemporary psychometrics. Cambridge University Press. https://doi.org/10.1017/CBO9780511613980
  • Brooks, C., Burton, R., Van Der Kleij, F., Carroll, A., Olave, K., & Hattie, J. (2020). From fixing the work to improving the learner: An initial evaluation of a professional learning intervention using a new student-centred feedback model. Studies in Educational Evaluation, 68, 100943. https://doi.org/10.1016/j.stueduc.2020.100943
  • Buckhalt, J.A. (1999). Defending the science of mental ability and its central dogma. Review of Jensen on Intelligence g Factor. Psycoloquy, 10(23). http://www.cogsci.ecs.soton.ac.uk/cgi/psyc/newpsy?10.47
  • Buzick, H.M., Casabianca, J.M., & Gholson, M.L. (2023). Personalizing large-scale assessment in practice. Educational Measurement: Issues and Practice, 42(2), 5–11. https://doi.org/10.1111/emip.12551
  • Chen, S., & Lei, P. (2005). Controlling item exposure and test overlap in computerized adaptive testing. Applied Psychological Measurement, 29(3), 204–217. https://doi.org/10.1177/0146621604271495
  • Chen, J., Chen, C., & Shih, C. (2013). Improving the control of type I error rate in assessing differential item functioning for hierarchical generalized linear model when impact is presented. Applied Psychological Measurement, 38(1), 18–36. https://doi.org/10.1177/0146621613488643
  • Choi, J.Y., Hwang, H., Yamamoto, M., Jung, K., & Woodward, T.S. (2016). A unified approach to functional principal component analysis and functional Multiple-Set canonical correlation. Psychometrika, 82(2), 427–441. https://doi.org/10.1007/s11336-015-9478-5
  • Cizek, G.J., Bowen, D., & Church, K. (2010). Sources of validity evidence for educational and psychological tests: A follow-up study. Educational and Psychological Measurement, 70(5), 732–743. https://doi.org/10.1177/0013164410379323
  • Cohn, S., & Huggins-Manley, A.C. (2019). Applying unidimensional models for semiordered data to scale data with neutral responses. Educational and Psychological Measurement, 80(2), 242–261. https://doi.org/10.1177/0013164419861143
  • Gao, X., & Sazara, C. (2023). Discovering mental health research topics with topic modeling. arXiv (Cornell University). https://doi.org/10.48550/arxiv.2308.13569
  • Göral, S., Özkan, S., Sercekus, P., & Alataş, E. (2021). The validity and reliability of the Turkish version of the Attitudes to Fertility and Childbearing Scale (AFCS). International Journal of Assessment Tools in Education, 8(4), 764–774. https://doi.org/10.21449/ijate.773132
  • Gregson, T. (1991). The separate constructs of communication satisfaction and job satisfaction. Educational and Psychological Measurement, 51(1), 39–48. https://doi.org/10.1177/0013164491511003
  • Groenen, P.J.F., & van der Ark, L.A. (2006). Visions of 70 years of psychometrics: the past, present, and future. Statistica Neerlandica, 60(2), 135–144. https://doi.org/10.1111/j.1467-9574.2006.00318.x
  • Guo, J., & Luh, W. (2008). Approximate sample size formulas for testing group mean differences when variances are unequal in One-Way ANOVA. Educational and Psychological Measurement, 68(6), 959–971. https://doi.org/10.1177/0013164408318759
  • Hidalgo, M.D., & López-Pina, J.A. (2004). Differential item functioning detection and effect size: A comparison between logistic regression and Mantel-Haenszel procedures. Educational and Psychological Measurement, 64(6), 903–915. https://doi.org/10.1177/0013164403261769
  • Huynh, H. (1996). Decomposition of a Rasch partial credit item into independent binary and indecomposable trinary items. Psychometrika, 61(1), 31–39. https://doi.org/10.1007/bf02296957
  • Hwang, S., Flavin, E., & Lee, J.E. (2023). Exploring research trends of technology use in mathematics education: A scoping review using topic modeling. Education and Information Technologies, 28, 10753–10780. https://doi.org/10.1007/s10639-023-11603-0
  • Jiang, Y., Von Davier, A.A., & Chen, H. (2012). Evaluating equating results: Percent relative error for chained kernel equating. Journal of Educational Measurement, 49(1), 39–58. https://doi.org/10.1111/j.1745-3984.2011.00159.x
  • Jiang, X., & Ironsi, S.S. (2024). Do learners learn from corrective peer feedback? Insights from students. Studies in Educational Evaluation, 83, 101385. https://doi.org/10.1016/j.stueduc.2024.101385
  • Jones, L.V., & Thissen, D.M. (2006). A history and overview of psychometrics. In Handbook of statistics (pp. 1–27). https://doi.org/10.1016/s0169-7161(06)26001-2
  • Kiers, H.A.L. (1997). Three-mode orthomax rotation. Psychometrika, 62(4), 579–598. https://doi.org/10.1007/bf02294644
  • Kim, S. (2001). An evaluation of a Markov chain Monte Carlo method for the Rasch model. Applied Psychological Measurement, 25(2), 163–176. https://doi.org/10.1177/01466210122031984
  • Kim, S. (2006). A comparative study of IRT fixed parameter calibration methods. Journal of Educational Measurement, 43(4), 355–381. https://doi.org/10.1111/j.1745-3984.2006.00021.x
  • Lederman, J. (2023). Validity and racial justice in educational assessment. Applied Measurement in Education, 36(3), 242–254. https://doi.org/10.1080/08957347.2023.2214654
  • Liu, J., & Low, A.C. (2008). A comparison of the kernel equating method with traditional equating methods using SAT® data. Journal of Educational Measurement, 45(4), 309–323. https://doi.org/10.1111/j.1745-3984.2008.00067.x
  • MacDonald, P.L., & Gardner, R.C. (2000). Type I error rate comparisons of post hoc procedures for I × J chi-square tables. Educational and Psychological Measurement, 60(5), 735–754. https://doi.org/10.1177/00131640021970871
  • Martin, C.R., & Savage-McGlynn, E. (2013). A ‘good practice’ guide for the reporting of design and analysis for psychometric evaluation. Journal of Reproductive and Infant Psychology, 31(5), 449–455. https://doi.org/10.1080/02646838.2013.835036
  • Meeter, M. (2022). Predicting retention in higher education from high-stakes exams or school GPA. Educational Assessment, 28(1), 1–10. https://doi.org/10.1080/10627197.2022.2130748
  • Michell, J. (2022). The art of imposing measurement upon the mind: Sir Francis Galton and the genesis of the psychometric paradigm. Theory & Psychology, 32(3), 375–400. https://doi.org/10.1177/09593543211017671
  • Pan, Y., Livne, O., Wollack, J.A., & Sinharay, S. (2023). Item selection algorithm based on collaborative filtering for item exposure control. Educational Measurement: Issues and Practice, 42(4), 6–18. https://doi.org/10.1111/emip.12578
  • Park, S., Steiner, P.M., & Kaplan, D. (2018). Identification and sensitivity analysis for average causal mediation effects with time-varying treatments and mediators: Investigating the underlying mechanisms of kindergarten retention policy. Psychometrika, 83(2), 298–320. https://doi.org/10.1007/s11336-018-9606-0
  • Plake, B.S., & Wise, L.L. (2014). What is the role and importance of the revised AERA, APA, NCME standards for educational and psychological testing? Educational Measurement: Issues and Practice, 33(4), 4–12. https://doi.org/10.1111/emip.12045
  • Polatgil, M. (2023). Analyzing comments made to the Duolingo mobile application with topic modeling. International Journal of Computing and Digital Systems, 13(1), 223–230.
  • Randall, J., Slomp, D., Poe, M., & Oliveri, M.E. (2022). Disrupting white supremacy in assessment: Toward a justice-oriented, antiracist validity framework. Educational Assessment, 27(2), 170–178. https://doi.org/10.1080/10627197.2022.2042682
  • Richardson, G.M., Bowers, J., Woodill, A.J., Barr, J.R., Gawron, J.M., & Levine, R.A. (2014). Topic models: A tutorial with R. International Journal of Semantic Computing, 8(1), 85–98.
  • Roberts, M.E., Stewart, B.M., Tingley, D., Lucas, C., Leder‐Luis, J., Gadarian, S.K., Albertson, B., & Rand, D.G. (2014). Structural topic models for open-ended survey responses. American Journal of Political Science, 58(4), 1064–1082. https://doi.org/10.1111/ajps.12103
  • Roberts, M.E., Stewart, B.M., & Tingley, D. (2019). stm: An R package for structural topic models. Journal of Statistical Software, 91(2). https://doi.org/10.18637/jss.v091.i02
  • Rupp, A.A. (2018). Designing, evaluating, and deploying automated scoring systems with validity in mind: Methodological design decisions. Applied Measurement in Education, 31(3), 191–214. https://doi.org/10.1080/08957347.2018.1464448
  • Schuster, C. (2004). A note on the interpretation of weighted kappa and its relations to other rater agreement statistics for metric scales. Educational and Psychological Measurement, 64(2), 243–253. https://doi.org/10.1177/0013164403260197
  • Shih, C., & Wang, W. (2009). Differential item functioning detection using the multiple indicators, multiple causes method with a pure short anchor. Applied Psychological Measurement, 33(3), 184–199. https://doi.org/10.1177/0146621608321758
  • Silge, J., & Robinson, D. (2016). tidytext: Text mining and analysis using tidy data principles in R. The Journal of Open Source Software, 1(3), 37. https://doi.org/10.21105/joss.00037
  • Singh, J., & Gupta, V. (2017). A systematic review of text stemming techniques. Artificial Intelligence Review, 48(2), 157–217. https://doi.org/10.1007/s10462-016-9498-2
  • Sijtsma, K., & Pfadt, J.M. (2021). Part II: On the use, the misuse, and the very limited usefulness of Cronbach’s alpha: Discussing lower bounds and correlated errors. Psychometrika, 86(4), 843–860.
  • Sireci, S.G. (2013). Agreeing on validity arguments. Journal of Educational Measurement, 50(1), 99–104. https://doi.org/10.1111/jedm.12005
  • Talloen, W., Moerkerke, B., Loeys, T., De Naeghel, J., Van Keer, H., & Vansteelandt, S. (2016). Estimation of indirect effects in the presence of unmeasured confounding for the mediator–outcome relationship in a multilevel 2-1-1 mediation model. Journal of Educational and Behavioral Statistics, 41(4), 359–391. https://doi.org/10.3102/1076998616636855
  • Tharenou, P., & Terry, D.J. (1998). Reliability and validity of scores on scales to measure managerial aspirations. Educational and Psychological Measurement, 58(3), 475–492. https://doi.org/10.1177/0013164498058003008
  • Tonidandel, S., Summerville, K.M., Gentry, W.A., & Young, S.F. (2021). Using structural topic modeling to gain insight into challenges faced by leaders. The Leadership Quarterly, 33(5), 101576. https://doi.org/10.1016/j.leaqua.2021.101576
  • Tunç, E.B., Parlak, S., Uluman, M., & Eryiğit, D. (2021). Development of the Hostility in Pandemic Scale (HPS): A validity and reliability study. International Journal of Assessment Tools in Education, 8(3), 475–486. https://doi.org/10.21449/ijate.837616
  • Van Der Ark, L.A. (2005). Stochastic ordering of the latent trait by the sum score under various polytomous IRT models. Psychometrika, 70(2), 283–304. https://doi.org/10.1007/s11336-000-0862-3
  • Van der Linden, W.J., & Glas, C.A.W. (2000). Computerized adaptive testing: Theory and practice. Springer. https://doi.org/10.1007/978-1-4757-3224-0
  • Vitoratou, S., & Pickles, A. (2017). Psychometric analysis of the Mental Health Continuum-Short Form. Journal of Clinical Psychology, 73(10), 1307–1322. https://doi.org/10.1002/jclp.22422
  • Wheeler, J.M., Cohen, A.S., & Wang, S. (2024). A comparison of latent semantic analysis and latent Dirichlet allocation in educational measurement. Journal of Educational and Behavioral Statistics, 49(5), 848–874. https://doi.org/10.3102/10769986231209446
  • Xiong, J., & Li, F. (2023). Bilevel topic model-based multitask learning for constructed-response multidimensional automated scoring and interpretation. Educational Measurement: Issues and Practice, 42(2), 42–61. https://doi.org/10.1111/emip.12550
  • Yavuz, S., Odabaş, M., & Özdemir, A. (2016). Öğrencilerin sosyoekonomik düzeylerinin TEOG matematik başarısına etkisi [Effect of socio-economic status on student’s TEOG mathematics achievement]. Journal of Measurement and Evaluation in Education and Psychology, 7(1), 85–95. https://doi.org/10.21031/epod.86531
  • Zagaria, A., & Lombardi, L. (2024). Bayesian versus frequentist approaches in psychometrics: a bibliometric analysis. Discover Psychology, 4, 61. https://doi.org/10.1007/s44202-024-00164-z
  • Zhan, P., Man, K., Wind, S.A., & Malone, J. (2022). Cognitive diagnosis modeling incorporating response times and fixation counts providing comprehensive feedback and accurate diagnosis. Journal of Educational and Behavioral Statistics, 47(6), 736–776. https://doi.org/10.3102/10769986221111085
There are 66 references in total.

Details

Primary Language English
Subjects Measurement Theories and Applications in Education and Psychology
Journal Section Articles
Authors

Kübra Atalay Kabasakal (ORCID: 0000-0002-3580-5568)

Rabia Akcan (ORCID: 0000-0003-3025-774X)

Duygu Koçak (ORCID: 0000-0003-3211-0426)

Early Pub Date October 1, 2025
Publication Date October 10, 2025
Submission Date March 7, 2025
Acceptance Date July 8, 2025
Published in Issue Year 2025 Volume: 12 Issue: 4

Cite

APA: Atalay Kabasakal, K., Akcan, R., & Koçak, D. (2025). Exploring trends in psychometrics literature through a structural topic model. International Journal of Assessment Tools in Education, 12(4), 942–962. https://doi.org/10.21449/ijate.1653549
