Classification of Scale Items with Exploratory Graph Analysis and Machine Learning Methods

İlhan Koyuncu; Abdullah Faruk Kılıç

doi:10.21449/ijate.880914

Research Article

Year 2021, Volume: 8, Issue: 4, 928 - 947, 04.12.2021

Abstract

In exploratory factor analysis, researchers decide which items belong to which factors on the basis of statistical results, but these decisions can become subjective when items have similar factor loadings or when the factor structure is complex. The aim of this study was to examine the validity of classifying items into dimensions with two approaches: exploratory graph analysis (EGA), which has been used in recent years to determine the number of dimensions, and machine learning methods. A Monte Carlo simulation was performed with 96 simulation conditions that varied average factor loading, sample size, number of items per dimension, number of dimensions, and data distribution. The methods were evaluated with percent correct and Kappa concordance values. Across the conditions, the machine learning methods gave results comparable to those of EGA. Machine learning methods showed high performance in terms of percent correct values, especially in small and medium-sized samples. In all conditions where the average factor loading was .70, the BayesNet, Naive Bayes, RandomForest, and RseslibKnn methods achieved classification accuracy above 80%, as did EGA. The BayesNet, Simple Logistic, and RBFNetwork methods also demonstrated acceptable or high performance under many conditions. In general, Kappa concordance values supported these results. The results suggest that, under similar conditions, machine learning methods can be used to check whether items have been assigned to factors accurately.
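The two evaluation criteria named in the abstract, percent correct and Kappa concordance, can be illustrated with a minimal Python sketch. It assumes that Kappa here means Cohen's kappa computed between the true item-to-dimension assignment and the assignment produced by a method. The function names `percent_correct` and `cohen_kappa` and the six-item example are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter


def percent_correct(true_dims, predicted_dims):
    """Percentage of items assigned to their true dimension."""
    if len(true_dims) != len(predicted_dims) or not true_dims:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_dims, predicted_dims))
    return 100.0 * hits / len(true_dims)


def cohen_kappa(true_dims, predicted_dims):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e)."""
    if len(true_dims) != len(predicted_dims) or not true_dims:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(true_dims)
    # Observed agreement: share of items with matching labels.
    p_o = sum(t == p for t, p in zip(true_dims, predicted_dims)) / n
    # Chance agreement: product of marginal proportions, summed over labels.
    true_counts = Counter(true_dims)
    pred_counts = Counter(predicted_dims)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Both lists contain one identical label; kappa is undefined here.
        # Returning 1.0 is a convention chosen for this sketch only.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical example: six items, two dimensions, one item misclassified.
true_dims = [1, 1, 1, 2, 2, 2]
predicted_dims = [1, 1, 2, 2, 2, 2]

print(round(percent_correct(true_dims, predicted_dims), 2))  # 83.33
print(round(cohen_kappa(true_dims, predicted_dims), 3))      # 0.667
```

In this example five of six items are classified correctly (83.33%), while kappa is lower (about .67) because half of that agreement would be expected by chance given the marginal label frequencies.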

Keywords

PISA, Machine learning, Exploratory factor analysis, Exploratory graph analysis, Monte Carlo simulation, Scale development

Kaynakça

Aha, D. W., Kibler, D., & Albert, M. K. (1991). Instance-based learning algorithms. Machine Learning 6, 37-66.
Akpınar, H. (2014). Veri madenciliği veri analizi [Data mining data analysis]. Papatya Yayınları.
Alpaydin, E. (2010). Introduction to machine learning: Adaptive computation and machine learning series. MIT Press.
Auerswald, M., & Moshagen, M. (2019). How to determine the number of factors to retain in exploratory factor analysis: A comparison of extraction methods under realistic conditions. Psychological Methods, 24(4), 468–491. https://doi:10.1037/met0000200
Azqueta-Gavaldón, A. (2017). Developing news-based economic policy uncertainty index with unsupervised machine learning. Economics Letters, 158, 47-50.
Baker, R. S. J. (2010). Machine learning for education. International Encyclopedia of Education, 7(3), 112-118.
Baldi, P., & Hornik, K. (1989). Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2, 53-58.
Bandalos, D. L., & Leite, W. (2013). Use of Monte Carlo studies in structural equation modeling research. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed.). Information Age.
Barker, K., Trafalis, T., & Rhoads, T. R. (2004). Learning from student data. In Proceedings of the 2004 Systems and Information Engineering Design Symposium (pp. 79-86). IEEE.
Beauducel, A., & Herzberg, P. Y. (2006). On the performance of maximum likelihood versus means and variance adjusted weighted least squares estimation in CFA. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 186 203. https://doi.org/10.1207/s15328007sem1302_2
Beleites, C., Neugebauer, U., Bocklitz, T., Krafft, C., & Popp, J. (2013). Sample size planning for classification models. Analytica Chimica Acta, 760, 25-33.
Belvederi Murri, M., Caruso, R., Ounalli, H., Zerbinati, L., Berretti, E., Costa, S., … Grassi, L. (2020). The relationship between demoralization and depressive symptoms among patients from the general hospital: network and exploratory graph analysis: Demoralization and depression symptom network. Journal of Affective Disorders, 276(June), 137–146. https://doi.org/10.1016/j.jad.2020.06.074
Berens, J., Schneider, K., Gortz, S., Oster, S., & Burghoff, J. (2019). Early detection of students at risk - predicting student dropouts using administrative student data from German universities and machine learning methods. Journal of Educational Machine learning, 11(3), 1-41. https://doi.org/10.5281/zenodo.3594771
Bouckaert, R. R. (2008). Bayesian network classifiers in Weka for Version 3-5-7. Artificial Intelligence Tools, 11(3), 369-387.
Bouckaert, R. R., Frank, E., Hall, M., Kirkby, R., Reutemann, P., Seewald, A., & Scuse, D. (2020). WEKA manual for version 3-9-5. University of Waikato.
Brain, D., & Webb, G. (1999). On the effect of data set size on bias and variance in classification learning. In Proceedings of the Fourth Australian Knowledge Acquisition Workshop, University of New South Wales (pp. 117-128), December 5-6, Sydney, Australia.
Branco, P., Torgo, L., & Ribeiro, R. (2015). A survey of predictive modelling under imbalanced distributions. arXiv preprint arXiv:1505.01658.
Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). The Guilford.
Bulut, O., & Yavuz, H. C. (2019). Educational machine learning: A tutorial for the" Rattle" package in R. International Journal of Assessment Tools in Education, 6(5), 20-36.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245–276. https://doi.org/10.1207/s15327906mbr0102_10
Chattopadhyay, M., Dan, P. K., & Mazumdar, S. (2011). Principal component analysis and self-organizing map for visual clustering of machine-part cell formation in cellular manufacturing system. In Systems Research Forum (Vol. 5, No. 01, pp. 25-51). World Scientific Publishing Company.
Chou, C. P., & Bentler, P. M. (1995). Estimates and tests in structural equation modeling. In Rich H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications. Sage.
Chu, C., Hsu, A. L., Chou, K. H., Bandettini, P., Lin, C., & Alzheimer's Disease Neuroimaging Initiative (2012). Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images. Neuroimage, 60(1), 59-70.
Cleary, J. G., & Trigg, L. E. (1995). K*: An instance-based learner using an entropic distance measure. In Machine Learning Proceedings 1995 (pp. 108-114). Morgan Kaufmann.
Cohen, J. (1960). A coefficient of agreement for nominal scales, Educational and Psychological Measurement, 20(1), 37-46.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 27–29. https://doi.org/10.1.1.110.9154
Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1(1), 16-29. https://doi.org/10.1037/1082-989X.1.1.16
Efron, B. (1983). Estimating the error rate of a prediction rule: Improvements on crossvalidation. J. Amer. Stat. Ass., 78, 316–331.
Egan, J. P. (1975). Signal detection theory and ROC analysis. Academic Press.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272–299. https://doi.org/10.1037/1082-989X.4.3.272
Figueroa, R. L., Zeng-Treitler, Q., Kandula, S., & Ngo, L. H. (2012). Predicting sample size required for classification performance. BMC Medical Informatics and Decision Making, 12(1), 8.
Finney, S. J., & DiStefano, C. (2013). Nonnormal and categorical data in structural equation modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed., pp. 439–492). Charlotte, NC: IAP.
Fischer, R., & Alfons Karl, J. (2020). The network architecture of individual differences: Personality, reward-sensitivity, and values. Personality and Individual Differences, 160(February), 109922. https://doi.org/10.1016/j.paid.2020.109922
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378-382.
Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7(3), 286 299. https://doi.org/10.1037/1040-3590.7.3.286
Golino, H. F., & Christensen, A. P. (2020). EGAnet: Exploratory Graph Analysis -- A framework for estimating the number of dimensions in multivariate data using network psychometrics. Retrieved from https://CRAN.R-project.org/package=EGAnet
Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLOS ONE, 12(6), 1 26. https://doi.org/10.1371/journal.pone.0174035
Golino, H. F., Moulder, R., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., … Boker, S. M. (2020). Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables. Multivariate Behavioral Research, 1–29. https://doi.org/10.1080/00273171.2020.1779642
Golino, H. F., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., … Martinez-Molina, A. (2020). Investigating the performance of exploratory graph analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25(3), 292–320. https://doi.org/10.1037/met0000255
Goretzko, D., & Bühner, M. (2020). One model to rule them all? Using machine learning algorithms to determine the number of factors in exploratory factor analysis. Psychological Methods, 25(6), 776–786. https://doi.org/10.1037/met0000262
Gorsuch, R. L. (1974). Factor analysis. W. B. Saunders.
Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103(2), 265–275.
Guess, A., Munger, K., Nagler, J., & Tucker, J. (2019). How accurate are survey responses on social media and politics?. Political Communication, 36(2), 241-258.
Güre, Ö. B., Kayri, M., & Erdoğan, F. (2020). Analysis of factors effecting PISA 2015 mathematics literacy via educational machine learning. Education and Science, 45(202), 393-415.
Grimmer, J. (2015). We are all social scientists now: How big data, machine learning, and causal inference work together. PS, Political Science & Politics, 48(1), 80.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Peter, R., & Witten, I. H. (2009). The WEKA machine learning software: An update. SIGKDD Explorations, 11(1), 10-18.
Hamalainen, W., & Vinni, M. (2006). Comparison of machine learning methods for intelligent tutoring systems. In Proceedings of International Conference on Intelligent Tutoring Systems (pp. 525-534). Springer Berlin/Heidelberg.
Han, J., J. Pei, & Kamber, M. (2011). Machine learning: Concepts and techniques. Elsevier.
Hartmann, D. P. (1977). Considerations in the choice of interobserver reliability estimates. Journal of Applied Behavior Analysis, 10(1), 1311156. https://doi.org/10.1901/jaba.1977.10-103
Hegde, J., & Rokseth, B. (2020). Applications of machine learning methods for engineering risk assessment–A review. Safety Science, 122, 104492.
Heydari, S. S., & Mountrakis, G. (2018). Effect of classifier selection, reference sample size, reference class distribution and scene heterogeneity in per-pixel classification accuracy using 26 Landsat sites. Remote Sensing of Environment, 204, 648-658.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2), 179–185. https://doi.org/10.1007/BF02289447
Howard, M. C. (2016). A review of exploratory factor analysis decisions and overview of current practices: What we are doing and how can we improve? International Journal of Human Computer Interaction, 32(1), 51 62. https://doi.org/10.1080/10447318.2015.1087664
Iantovics, L. B., Rotar, C., & Morar, F. (2019). Survey on establishing the optimal number of factors in exploratory factor analysis applied to machine learning. Wiley Interdisciplinary Reviews: Machine learning and Knowledge Discovery, 9(2), 1 20. https://doi.org/10.1002/widm.1294
Ibarguren, I., Pérez, J. M., Muguerza, J., Gurrutxaga, I., & Arbelaitz, O. (2015). Coverage-based resampling: Building robust consolidated decision trees. Knowledge-Based Systems, 79, 51-67. https://doi.org/10.1016/j.knosys.2014.12.023
John, G. H., & Langley P. (1995). Estimating continuous distributions in Bayesian classifiers. In P. Besnard & S. Hanks (Eds.), Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (pp. 338–345). San Francisco, Morgan Kaufmann.
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: trends, perspectives, and prospects. Science, 349(6245), 255-260, https://doi.org/10.1126/science.aaa8415
Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20(1), 141 151. https://doi.org/10.1177/001316446002000116
Kılıç, A. F., & Koyuncu, İ. (2017). Ölçek uyarlama çalışmalarının yapı geçerliği açısından incelenmesi [Examination of scale adaptation studies in terms of construct validity]. In Ö. Demirel & S. Dinçer (Eds.), Küreselleşen dünyada eğitim [Education in a globalizing world] (pp. 1202–1205). Pegem Akademi.
Kjellström, S., & Golino, H. (2019). Mining concepts of health responsibility using text mining and exploratory graph analysis. Scandinavian Journal of Occupational Therapy, 26(6), 395–410. https://doi.org/10.1080/11038128.2018.1455896
Kline, P. (1994). An easy guide to factor analysis. Routledge.
Koyuncu, İ., & Gelbal, S. (2020). Comparison of machine learning classification algorithms on educational data under different conditions. Journal of Measurement and Evaluation in Education and Psychology, 11(4), 325-345.
Koyuncu, İ., & Kılıç, A. F. (2019). The use of exploratory and confirmatory factor analyses: A document analysis. Education and Science, 44(198), 361 388. https://doi.org/10.15390/EB.2019.7665
Kuhn, M. (2020). caret: Classification and Regression Training. Retrieved from https://cran.r-project.org/package=caret
Lachenbruch, P. A., & Mickey, M. R. (1968). Estimation of error rates in discriminant analysis. Technometrics, 10(1), 1-11.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
Landwehr, N., Hall, M., & Frank, E. (2006). Logistic model trees. Kluwer Academic Publishers.
Larose, D. T., & Larose, C.D. (2014). Discovering knowledge in data: An introduction to machine learning. John Wiley and Sons.
Li, C.-H. (2016a). Confirmatory factor analysis with ordinal data: Comparing robust maximum likelihood and diagonally weighted least squares. Behavior Research Methods, 48(3), 936–949. https://doi.org/10.3758/s13428-015-0619-7
Li, C.-H. (2016b). The performance of ML, DWLS, and ULS estimation with robust corrections in structural equation models with ordinal variables. Psychological Methods, 21(3), 369 387. https://doi.org/10.1037/met0000093
Li, N., Qi, J., Wang, P., Zhang, X., Zhang, T., & Li, H. (2019). Quantitative structure–activity relationship (QSAR) study of carcinogenicity of polycyclic aromatic hydrocarbons (PAHs) in atmospheric particulate matter by random forest (RF). Analytical Methods, 11(13), 1816-1821.
Mele, M., & Magazzino, C. (2020). A machine learning analysis of the relationship among iron and steel industries, air pollution, and economic growth in China. Journal of Cleaner Production, 277, 123293.
Minaei-Bidgoli, B., D.A. Kashy, G. Kortemeyer, & W. Punch (2003). Predicting student performance: An application of machine learning methods with an educational web-based system. In Proceedings of 33rd Frontiers in Education Conference, (pp. 13-18). Westminster, CO.
Mullainathan, S., & Spiess, J. (2017). Machine learning: an applied econometric approach. Journal of Economic Perspectives, 31(2), 87-106.
Nghe, N. T., Janecek, P., & Haddawy, P. (2007). A comparative analysis of techniques for predicting academic performance. In Frontiers in Education Conference-Global Engineering: Knowledge Without Borders, Opportunities Without Passports, (pp. T2G-7). IEEE.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd. ed.). McGraw-Hill.
Osborne, J. W. (2015). What is rotating in exploratory factor analysis? Practical Assessment Research & Evaluation, 20(2), 1–7.
Panayiotou, M., Santos, J., Black, L., & Humphrey, N. (2020). Exploring the dimensionality of the social skills improvement system using exploratory graph analysis and bifactor-(S−1) modeling. Assessment, 1-15. https://doi.org/10.1177/1073191120971351
Pérez, J. M., Muguerza, J., Arbelaitz, O., Gurrutxaga, I., & Martín, J. I. (2007). Combining multiple class distribution modified subsamples in a single tree. Pattern Recognition Letters, 28(4), 414-422. https://doi.org/10.1016/j.patrec.2006.08.013
Pu, Y., Apel, D. B., & Hall, R. (2020). Using machine learning approach for microseismic events recognition in underground excavations: Comparison of ten frequently-used models. Engineering Geology, 268, 105519.
Quinlan, J. R. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers, Inc.
R Core Team. (2020). R: A Language and Environment for Statistical Computing. Vienna, Austria. Retrieved from https://www.r-project.org/
Reich, Y., & Barai, S. V. (1999). Evaluating machine learning models for engineering problems. Artificial Intelligence in Engineering, 13(3), 257-272.
Rijsbergen CV. (1979). Information retrieval (2nd ed.). Butterworth.
Romero, C., Espejo, P. G., Zafra, A., Romero, J. R., & Ventura, S. (2013). Web usage mining for predicting final marks of students that use Moodle courses. Computer Applications in Engineering Education, 21(1), 135- 146.
Romero, C., & Ventura, S. (2013). Machine learning in education. WIREs Machine learning Knowledge Discovery 3(1), 12- 27.
Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1–36.
Shao, L., Fan, X., Cheng, N., Wu, L., & Cheng, Y. (2013). Determination of minimum training sample size for microarray-based cancer outcome prediction–an empirical assessment. PloS one, 8(7), e68579. https://doi.org/10.1371/journal.pone.0068579
Sumner, M., Frank, E., & Hall, M. (2005, October). Speeding up logistic model tree induction. In European conference on principles of machine learning and knowledge discovery (pp. 675-683). Springer, Berlin, Heidelberg.
Sun, Y., Kamel, M. S., & Wang, Y. (2006). Boosting for learning multiple classes with imbalanced class distribution. In Sixth international conference on data mining (ICDM'06) (pp. 592-602). IEEE.
Tabachnik, B. G., & Fidell, L. S. (2012). Using multivariate statistics (6th ed.). Pearson.
Tezbaşaran, E., & Gelbal, S. (2018). Temel bileşenler analizi ve yapay sinir ağı modellerinin ölçek geliştirme sürecinde kullanılabilirliğinin incelenmesi [An investigation on usability of principal component analysis and artificial neural network models in the process of scale development]. Mersin University Journal of the Faculty of Education, 14(1), 225-252.
Timmerman, M. E., & Lorenzo-Seva, U. (2011). Dimensionality assessment of ordered polytomous items with parallel analysis. Psychological Methods, 16(2), 209 220. https://doi.org/10.1037/a0023353
West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues and applications (pp. 56-75). Sage.
Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2017). Machine learning: Practical machine learning tools and techniques (4th Edition). Morgan Kaufmann.
Wojna, A., Latkowski, R. (2018): Rseslib 3: Open source library of rough set and machine learning methods. In: Proceedings of the International Joint Conference on Rough Set (LNCS, vol. 11103, pp. 162-176). Springer.
Wojna, A., Latkowski, R., Kowalski, (2019). RSESLIB: User guide. Retrieved from http://rseslib.mimuw.edu.pl/rseslib.pdf
Zhang, F., & Yang, X. (2020). Improving land cover classification in an urbanized coastal area by random forests: The role of variable selection. Remote Sensing of Environment, 251, 112105. https://doi.org/10.1016/j.rse.2020.112105

Classification of Scale Items with Exploratory Graph Analysis and Machine Learning Methods

Yıl 2021, Cilt: 8 Sayı: 4, 928 - 947, 04.12.2021

İlhan Koyuncu , Abdullah Faruk Kılıç

https://doi.org/10.21449/ijate.880914

Cited By: 3

Öz

Anahtar Kelimeler

PISA, Machine learning, Exploratory factor analysis, Exploratory graph analysis, Monte Carlo simulation, Scale development

Kaynakça

Aha, D. W., Kibler, D., & Albert, M. K. (1991). Instance-based learning algorithms. Machine Learning 6, 37-66.
Akpınar, H. (2014). Veri madenciliği veri analizi [Data mining data analysis]. Papatya Yayınları.
Alpaydin, E. (2010). Introduction to machine learning: Adaptive computation and machine learning series. MIT Press.
Auerswald, M., & Moshagen, M. (2019). How to determine the number of factors to retain in exploratory factor analysis: A comparison of extraction methods under realistic conditions. Psychological Methods, 24(4), 468–491. https://doi:10.1037/met0000200
Azqueta-Gavaldón, A. (2017). Developing news-based economic policy uncertainty index with unsupervised machine learning. Economics Letters, 158, 47-50.
Baker, R. S. J. (2010). Machine learning for education. International Encyclopedia of Education, 7(3), 112-118.
Baldi, P., & Hornik, K. (1989). Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2, 53-58.
Bandalos, D. L., & Leite, W. (2013). Use of Monte Carlo studies in structural equation modeling research. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed.). Information Age.
Barker, K., Trafalis, T., & Rhoads, T. R. (2004). Learning from student data. In Proceedings of the 2004 Systems and Information Engineering Design Symposium (pp. 79-86). IEEE.
Beauducel, A., & Herzberg, P. Y. (2006). On the performance of maximum likelihood versus means and variance adjusted weighted least squares estimation in CFA. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 186 203. https://doi.org/10.1207/s15328007sem1302_2
Beleites, C., Neugebauer, U., Bocklitz, T., Krafft, C., & Popp, J. (2013). Sample size planning for classification models. Analytica Chimica Acta, 760, 25-33.
Belvederi Murri, M., Caruso, R., Ounalli, H., Zerbinati, L., Berretti, E., Costa, S., … Grassi, L. (2020). The relationship between demoralization and depressive symptoms among patients from the general hospital: network and exploratory graph analysis: Demoralization and depression symptom network. Journal of Affective Disorders, 276(June), 137–146. https://doi.org/10.1016/j.jad.2020.06.074
Berens, J., Schneider, K., Gortz, S., Oster, S., & Burghoff, J. (2019). Early detection of students at risk - predicting student dropouts using administrative student data from German universities and machine learning methods. Journal of Educational Machine learning, 11(3), 1-41. https://doi.org/10.5281/zenodo.3594771
Bouckaert, R. R. (2008). Bayesian network classifiers in Weka for Version 3-5-7. Artificial Intelligence Tools, 11(3), 369-387.
Bouckaert, R. R., Frank, E., Hall, M., Kirkby, R., Reutemann, P., Seewald, A., & Scuse, D. (2020). WEKA manual for version 3-9-5. University of Waikato.
Brain, D., & Webb, G. (1999). On the effect of data set size on bias and variance in classification learning. In Proceedings of the Fourth Australian Knowledge Acquisition Workshop, University of New South Wales (pp. 117-128), December 5-6, Sydney, Australia.
Branco, P., Torgo, L., & Ribeiro, R. (2015). A survey of predictive modelling under imbalanced distributions. arXiv preprint arXiv:1505.01658.
Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). The Guilford.
Bulut, O., & Yavuz, H. C. (2019). Educational machine learning: A tutorial for the" Rattle" package in R. International Journal of Assessment Tools in Education, 6(5), 20-36.
Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245–276. https://doi.org/10.1207/s15327906mbr0102_10
Chattopadhyay, M., Dan, P. K., & Mazumdar, S. (2011). Principal component analysis and self-organizing map for visual clustering of machine-part cell formation in cellular manufacturing system. In Systems Research Forum (Vol. 5, No. 01, pp. 25-51). World Scientific Publishing Company.
Chou, C. P., & Bentler, P. M. (1995). Estimates and tests in structural equation modeling. In Rich H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications. Sage.
Chu, C., Hsu, A. L., Chou, K. H., Bandettini, P., Lin, C., & Alzheimer's Disease Neuroimaging Initiative (2012). Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images. Neuroimage, 60(1), 59-70.
Cleary, J. G., & Trigg, L. E. (1995). K*: An instance-based learner using an entropic distance measure. In Machine Learning Proceedings 1995 (pp. 108-114). Morgan Kaufmann.
Cohen, J. (1960). A coefficient of agreement for nominal scales, Educational and Psychological Measurement, 20(1), 37-46.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 27–29. https://doi.org/10.1.1.110.9154
Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1(1), 16-29. https://doi.org/10.1037/1082-989X.1.1.16
Efron, B. (1983). Estimating the error rate of a prediction rule: Improvements on crossvalidation. J. Amer. Stat. Ass., 78, 316–331.
Egan, J. P. (1975). Signal detection theory and ROC analysis. Academic Press.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272–299. https://doi.org/10.1037/1082-989X.4.3.272
Figueroa, R. L., Zeng-Treitler, Q., Kandula, S., & Ngo, L. H. (2012). Predicting sample size required for classification performance. BMC Medical Informatics and Decision Making, 12(1), 8.
Finney, S. J., & DiStefano, C. (2013). Nonnormal and categorical data in structural equation modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed., pp. 439–492). Charlotte, NC: IAP.
Fischer, R., & Alfons Karl, J. (2020). The network architecture of individual differences: Personality, reward-sensitivity, and values. Personality and Individual Differences, 160(February), 109922. https://doi.org/10.1016/j.paid.2020.109922
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378-382.
Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7(3), 286 299. https://doi.org/10.1037/1040-3590.7.3.286
Golino, H. F., & Christensen, A. P. (2020). EGAnet: Exploratory Graph Analysis -- A framework for estimating the number of dimensions in multivariate data using network psychometrics. Retrieved from https://CRAN.R-project.org/package=EGAnet
Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLOS ONE, 12(6), 1 26. https://doi.org/10.1371/journal.pone.0174035
Golino, H. F., Moulder, R., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., … Boker, S. M. (2020). Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables. Multivariate Behavioral Research, 1–29. https://doi.org/10.1080/00273171.2020.1779642
Golino, H. F., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., … Martinez-Molina, A. (2020). Investigating the performance of exploratory graph analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25(3), 292–320. https://doi.org/10.1037/met0000255
Goretzko, D., & Bühner, M. (2020). One model to rule them all? Using machine learning algorithms to determine the number of factors in exploratory factor analysis. Psychological Methods, 25(6), 776–786. https://doi.org/10.1037/met0000262
Gorsuch, R. L. (1974). Factor analysis. W. B. Saunders.
Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103(2), 265–275.
Guess, A., Munger, K., Nagler, J., & Tucker, J. (2019). How accurate are survey responses on social media and politics?. Political Communication, 36(2), 241-258.
Güre, Ö. B., Kayri, M., & Erdoğan, F. (2020). Analysis of factors effecting PISA 2015 mathematics literacy via educational machine learning. Education and Science, 45(202), 393-415.
Grimmer, J. (2015). We are all social scientists now: How big data, machine learning, and causal inference work together. PS, Political Science & Politics, 48(1), 80.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Peter, R., & Witten, I. H. (2009). The WEKA machine learning software: An update. SIGKDD Explorations, 11(1), 10-18.
Hamalainen, W., & Vinni, M. (2006). Comparison of machine learning methods for intelligent tutoring systems. In Proceedings of International Conference on Intelligent Tutoring Systems (pp. 525-534). Springer Berlin/Heidelberg.
Han, J., J. Pei, & Kamber, M. (2011). Machine learning: Concepts and techniques. Elsevier.
Hartmann, D. P. (1977). Considerations in the choice of interobserver reliability estimates. Journal of Applied Behavior Analysis, 10(1), 1311156. https://doi.org/10.1901/jaba.1977.10-103
Hegde, J., & Rokseth, B. (2020). Applications of machine learning methods for engineering risk assessment–A review. Safety Science, 122, 104492.
Heydari, S. S., & Mountrakis, G. (2018). Effect of classifier selection, reference sample size, reference class distribution and scene heterogeneity in per-pixel classification accuracy using 26 Landsat sites. Remote Sensing of Environment, 204, 648-658.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2), 179–185. https://doi.org/10.1007/BF02289447
Howard, M. C. (2016). A review of exploratory factor analysis decisions and overview of current practices: What we are doing and how can we improve? International Journal of Human Computer Interaction, 32(1), 51 62. https://doi.org/10.1080/10447318.2015.1087664
Iantovics, L. B., Rotar, C., & Morar, F. (2019). Survey on establishing the optimal number of factors in exploratory factor analysis applied to machine learning. Wiley Interdisciplinary Reviews: Machine learning and Knowledge Discovery, 9(2), 1 20. https://doi.org/10.1002/widm.1294
Ibarguren, I., Pérez, J. M., Muguerza, J., Gurrutxaga, I., & Arbelaitz, O. (2015). Coverage-based resampling: Building robust consolidated decision trees. Knowledge-Based Systems, 79, 51-67. https://doi.org/10.1016/j.knosys.2014.12.023
John, G. H., & Langley, P. (1995). Estimating continuous distributions in Bayesian classifiers. In P. Besnard & S. Hanks (Eds.), Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (pp. 338–345). San Francisco, Morgan Kaufmann.
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255-260. https://doi.org/10.1126/science.aaa8415
Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20(1), 141–151. https://doi.org/10.1177/001316446002000116
Kılıç, A. F., & Koyuncu, İ. (2017). Ölçek uyarlama çalışmalarının yapı geçerliği açısından incelenmesi [Examination of scale adaptation studies in terms of construct validity]. In Ö. Demirel & S. Dinçer (Eds.), Küreselleşen dünyada eğitim [Education in a globalizing world] (pp. 1202–1205). Pegem Akademi.
Kjellström, S., & Golino, H. (2019). Mining concepts of health responsibility using text mining and exploratory graph analysis. Scandinavian Journal of Occupational Therapy, 26(6), 395–410. https://doi.org/10.1080/11038128.2018.1455896
Kline, P. (1994). An easy guide to factor analysis. Routledge.
Koyuncu, İ., & Gelbal, S. (2020). Comparison of machine learning classification algorithms on educational data under different conditions. Journal of Measurement and Evaluation in Education and Psychology, 11(4), 325-345.
Koyuncu, İ., & Kılıç, A. F. (2019). The use of exploratory and confirmatory factor analyses: A document analysis. Education and Science, 44(198), 361–388. https://doi.org/10.15390/EB.2019.7665
Kuhn, M. (2020). caret: Classification and Regression Training. Retrieved from https://cran.r-project.org/package=caret
Lachenbruch, P. A., & Mickey, M. R. (1968). Estimation of error rates in discriminant analysis. Technometrics, 10(1), 1-11.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
Landwehr, N., Hall, M., & Frank, E. (2006). Logistic model trees. Kluwer Academic Publishers.
Larose, D. T., & Larose, C. D. (2014). Discovering knowledge in data: An introduction to data mining. John Wiley and Sons.
Li, C.-H. (2016a). Confirmatory factor analysis with ordinal data: Comparing robust maximum likelihood and diagonally weighted least squares. Behavior Research Methods, 48(3), 936–949. https://doi.org/10.3758/s13428-015-0619-7
Li, C.-H. (2016b). The performance of ML, DWLS, and ULS estimation with robust corrections in structural equation models with ordinal variables. Psychological Methods, 21(3), 369–387. https://doi.org/10.1037/met0000093
Li, N., Qi, J., Wang, P., Zhang, X., Zhang, T., & Li, H. (2019). Quantitative structure–activity relationship (QSAR) study of carcinogenicity of polycyclic aromatic hydrocarbons (PAHs) in atmospheric particulate matter by random forest (RF). Analytical Methods, 11(13), 1816-1821.
Mele, M., & Magazzino, C. (2020). A machine learning analysis of the relationship among iron and steel industries, air pollution, and economic growth in China. Journal of Cleaner Production, 277, 123293.
Minaei-Bidgoli, B., Kashy, D. A., Kortemeyer, G., & Punch, W. (2003). Predicting student performance: An application of data mining methods with an educational web-based system. In Proceedings of 33rd Frontiers in Education Conference (pp. 13-18). Westminster, CO.
Mullainathan, S., & Spiess, J. (2017). Machine learning: an applied econometric approach. Journal of Economic Perspectives, 31(2), 87-106.
Nghe, N. T., Janecek, P., & Haddawy, P. (2007). A comparative analysis of techniques for predicting academic performance. In Frontiers in Education Conference-Global Engineering: Knowledge Without Borders, Opportunities Without Passports, (pp. T2G-7). IEEE.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). McGraw-Hill.
Osborne, J. W. (2015). What is rotating in exploratory factor analysis? Practical Assessment Research & Evaluation, 20(2), 1–7.
Panayiotou, M., Santos, J., Black, L., & Humphrey, N. (2020). Exploring the dimensionality of the social skills improvement system using exploratory graph analysis and bifactor-(S−1) modeling. Assessment, 1-15. https://doi.org/10.1177/1073191120971351
Pérez, J. M., Muguerza, J., Arbelaitz, O., Gurrutxaga, I., & Martín, J. I. (2007). Combining multiple class distribution modified subsamples in a single tree. Pattern Recognition Letters, 28(4), 414-422. https://doi.org/10.1016/j.patrec.2006.08.013
Pu, Y., Apel, D. B., & Hall, R. (2020). Using machine learning approach for microseismic events recognition in underground excavations: Comparison of ten frequently-used models. Engineering Geology, 268, 105519.
Quinlan, J. R. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers, Inc.
R Core Team. (2020). R: A Language and Environment for Statistical Computing. Vienna, Austria. Retrieved from https://www.r-project.org/
Reich, Y., & Barai, S. V. (1999). Evaluating machine learning models for engineering problems. Artificial Intelligence in Engineering, 13(3), 257-272.
Rijsbergen, C. J. van. (1979). Information retrieval (2nd ed.). Butterworth.
Romero, C., Espejo, P. G., Zafra, A., Romero, J. R., & Ventura, S. (2013). Web usage mining for predicting final marks of students that use Moodle courses. Computer Applications in Engineering Education, 21(1), 135-146.
Romero, C., & Ventura, S. (2013). Data mining in education. WIREs Data Mining and Knowledge Discovery, 3(1), 12-27.
Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1–36.
Shao, L., Fan, X., Cheng, N., Wu, L., & Cheng, Y. (2013). Determination of minimum training sample size for microarray-based cancer outcome prediction–an empirical assessment. PloS one, 8(7), e68579. https://doi.org/10.1371/journal.pone.0068579
Sumner, M., Frank, E., & Hall, M. (2005, October). Speeding up logistic model tree induction. In European conference on principles of data mining and knowledge discovery (pp. 675-683). Springer, Berlin, Heidelberg.
Sun, Y., Kamel, M. S., & Wang, Y. (2006). Boosting for learning multiple classes with imbalanced class distribution. In Sixth international conference on data mining (ICDM'06) (pp. 592-602). IEEE.
Tabachnick, B. G., & Fidell, L. S. (2012). Using multivariate statistics (6th ed.). Pearson.
Tezbaşaran, E., & Gelbal, S. (2018). Temel bileşenler analizi ve yapay sinir ağı modellerinin ölçek geliştirme sürecinde kullanılabilirliğinin incelenmesi [An investigation on usability of principal component analysis and artificial neural network models in the process of scale development]. Mersin University Journal of the Faculty of Education, 14(1), 225-252.
Timmerman, M. E., & Lorenzo-Seva, U. (2011). Dimensionality assessment of ordered polytomous items with parallel analysis. Psychological Methods, 16(2), 209–220. https://doi.org/10.1037/a0023353
West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues and applications (pp. 56-75). Sage.
Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2017). Data mining: Practical machine learning tools and techniques (4th ed.). Morgan Kaufmann.
Wojna, A., & Latkowski, R. (2018). Rseslib 3: Open source library of rough set and machine learning methods. In Proceedings of the International Joint Conference on Rough Sets (LNCS, vol. 11103, pp. 162-176). Springer.
Wojna, A., Latkowski, R., Kowalski, (2019). RSESLIB: User guide. Retrieved from http://rseslib.mimuw.edu.pl/rseslib.pdf
Zhang, F., & Yang, X. (2020). Improving land cover classification in an urbanized coastal area by random forests: The role of variable selection. Remote Sensing of Environment, 251, 112105. https://doi.org/10.1016/j.rse.2020.112105

There are 99 references in total.

Details

Primary Language	English
Subjects	Studies on Education
Section	Articles
Authors	İlhan Koyuncu 0000-0002-0009-5279; Abdullah Faruk Kılıç 0000-0003-3129-1763
Publication Date	December 4, 2021
Submission Date	February 15, 2021
Published in Issue	Year 2021, Volume: 8, Issue: 4

Cite

APA	Koyuncu, İ., & Kılıç, A. F. (2021). Classification of Scale Items with Exploratory Graph Analysis and Machine Learning Methods. International Journal of Assessment Tools in Education, 8(4), 928-947. https://doi.org/10.21449/ijate.880914

Cited By

Açıklayıcı Grafik Analizi: EGAnet R paketiyle Bir Uygulama [Exploratory Graph Analysis: An Application with the EGAnet R Package]

Çankırı Karatekin Üniversitesi Sosyal Bilimler Enstitüsü Dergisi

https://doi.org/10.54558/jiss.1449101

Using exploratory graph analysis (EGA) in validating the structure of the Perth alexithymia questionnaire in Iranians with chronic pain

Frontiers in Psychology

https://doi.org/10.3389/fpsyg.2024.1400340

Deciding The Number Of Dimensions In Explanatory Factor Analysis: A Brief Overview Of The Methods

Pamukkale University Journal of Social Sciences Institute

https://doi.org/10.30794/pausbed.1095936
