Research Article

Classification of Scale Items with Exploratory Graph Analysis and Machine Learning Methods

Year 2021, Volume: 8 Issue: 4, 928 - 947, 04.12.2021
https://doi.org/10.21449/ijate.880914

Abstract

In exploratory factor analysis, researchers decide which items belong to which factors by considering statistical results, but these decisions can be subjective when items have similar factor loadings or the factor structure is complex. The aim of this study was to examine the validity of classifying items into dimensions with exploratory graph analysis (EGA), which has been used in recent years to determine the number of dimensions, and with machine learning methods. A Monte Carlo simulation was performed with a total of 96 simulation conditions defined by average factor loading, sample size, number of items per dimension, number of dimensions, and data distribution. Percent correct and Kappa concordance values were used to evaluate the methods. When the findings for the different conditions were evaluated together, the machine learning methods gave results comparable to those of EGA. Machine learning methods showed high performance in terms of percent correct values, especially in small and medium-sized samples. In all conditions where the average factor loading was .70, the BayesNet, Naive Bayes, RandomForest, and RseslibKnn methods classified items with accuracy above 80%, as did EGA. The BayesNet, Simple Logistic, and RBFNetwork methods also demonstrated acceptable or high performance under many conditions. In general, Kappa concordance values supported these results. The results indicate that machine learning methods can be used under similar conditions to examine whether items have been assigned to factors accurately.
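
The two evaluation criteria named in the abstract, percent correct and Kappa concordance, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the toy item assignments are hypothetical, Cohen's kappa is assumed as the concordance measure, and the predicted dimension labels are assumed to be already aligned with the true labels (EGA community labels are arbitrary and would need to be matched first).

```python
import math
from collections import Counter


def percent_correct(true_dims, assigned_dims):
    """Percentage of items assigned to their true dimension."""
    if len(true_dims) != len(assigned_dims) or not true_dims:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == a for t, a in zip(true_dims, assigned_dims))
    return 100.0 * hits / len(true_dims)


def cohen_kappa(true_dims, assigned_dims):
    """Cohen's kappa: agreement corrected for chance agreement."""
    if len(true_dims) != len(assigned_dims) or not true_dims:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_dims)
    # Observed agreement: proportion of items with matching labels.
    p_o = sum(t == a for t, a in zip(true_dims, assigned_dims)) / n
    # Expected agreement: product of marginal proportions, summed over labels.
    true_counts = Counter(true_dims)
    assigned_counts = Counter(assigned_dims)
    p_e = sum(true_counts[c] * assigned_counts[c] for c in true_counts) / n**2
    if p_e == 1.0:
        # Both label vectors are constant and identical; kappa is undefined.
        return math.nan
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy example: six items, two dimensions, one item misassigned.
true_dims = [1, 1, 1, 2, 2, 2]
assigned_dims = [1, 1, 2, 2, 2, 2]

pc = percent_correct(true_dims, assigned_dims)
kappa = cohen_kappa(true_dims, assigned_dims)
print(f"percent correct = {pc:.2f}, kappa = {kappa:.3f}")
```

In this toy case five of six items are placed correctly, yet kappa is noticeably lower than the raw agreement because half of the agreement would be expected by chance with two equally sized dimensions.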

Kaynakça

  • Aha, D. W., Kibler, D., & Albert, M. K. (1991). Instance-based learning algorithms. Machine Learning 6, 37-66.
  • Akpınar, H. (2014). Veri madenciliği veri analizi [Data mining data analysis]. Papatya Yayınları.
  • Alpaydin, E. (2010). Introduction to machine learning: Adaptive computation and machine learning series. MIT Press.
  • Auerswald, M., & Moshagen, M. (2019). How to determine the number of factors to retain in exploratory factor analysis: A comparison of extraction methods under realistic conditions. Psychological Methods, 24(4), 468–491. https://doi:10.1037/met0000200
  • Azqueta-Gavaldón, A. (2017). Developing news-based economic policy uncertainty index with unsupervised machine learning. Economics Letters, 158, 47-50.
  • Baker, R. S. J. (2010). Machine learning for education. International Encyclopedia of Education, 7(3), 112-118.
  • Baldi, P., & Hornik, K. (1989). Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2, 53-58.
  • Bandalos, D. L., & Leite, W. (2013). Use of Monte Carlo studies in structural equation modeling research. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed.). Information Age.
  • Barker, K., Trafalis, T., & Rhoads, T. R. (2004). Learning from student data. In Proceedings of the 2004 Systems and Information Engineering Design Symposium (pp. 79-86). IEEE.
  • Beauducel, A., & Herzberg, P. Y. (2006). On the performance of maximum likelihood versus means and variance adjusted weighted least squares estimation in CFA. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 186 203. https://doi.org/10.1207/s15328007sem1302_2
  • Beleites, C., Neugebauer, U., Bocklitz, T., Krafft, C., & Popp, J. (2013). Sample size planning for classification models. Analytica Chimica Acta, 760, 25-33.
  • Belvederi Murri, M., Caruso, R., Ounalli, H., Zerbinati, L., Berretti, E., Costa, S., … Grassi, L. (2020). The relationship between demoralization and depressive symptoms among patients from the general hospital: network and exploratory graph analysis: Demoralization and depression symptom network. Journal of Affective Disorders, 276(June), 137–146. https://doi.org/10.1016/j.jad.2020.06.074
  • Berens, J., Schneider, K., Gortz, S., Oster, S., & Burghoff, J. (2019). Early detection of students at risk - predicting student dropouts using administrative student data from German universities and machine learning methods. Journal of Educational Machine learning, 11(3), 1-41. https://doi.org/10.5281/zenodo.3594771
  • Bouckaert, R. R. (2008). Bayesian network classifiers in Weka for Version 3-5-7. Artificial Intelligence Tools, 11(3), 369-387.
  • Bouckaert, R. R., Frank, E., Hall, M., Kirkby, R., Reutemann, P., Seewald, A., & Scuse, D. (2020). WEKA manual for version 3-9-5. University of Waikato.
  • Brain, D., & Webb, G. (1999). On the effect of data set size on bias and variance in classification learning. In Proceedings of the Fourth Australian Knowledge Acquisition Workshop, University of New South Wales (pp. 117-128), December 5-6, Sydney, Australia.
  • Branco, P., Torgo, L., & Ribeiro, R. (2015). A survey of predictive modelling under imbalanced distributions. arXiv preprint arXiv:1505.01658.
  • Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
  • Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). The Guilford.
  • Bulut, O., & Yavuz, H. C. (2019). Educational machine learning: A tutorial for the" Rattle" package in R. International Journal of Assessment Tools in Education, 6(5), 20-36.
  • Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245–276. https://doi.org/10.1207/s15327906mbr0102_10
  • Chattopadhyay, M., Dan, P. K., & Mazumdar, S. (2011). Principal component analysis and self-organizing map for visual clustering of machine-part cell formation in cellular manufacturing system. In Systems Research Forum (Vol. 5, No. 01, pp. 25-51). World Scientific Publishing Company.
  • Chou, C. P., & Bentler, P. M. (1995). Estimates and tests in structural equation modeling. In Rich H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications. Sage.
  • Chu, C., Hsu, A. L., Chou, K. H., Bandettini, P., Lin, C., & Alzheimer's Disease Neuroimaging Initiative (2012). Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images. Neuroimage, 60(1), 59-70.
  • Cleary, J. G., & Trigg, L. E. (1995). K*: An instance-based learner using an entropic distance measure. In Machine Learning Proceedings 1995 (pp. 108-114). Morgan Kaufmann.
  • Cohen, J. (1960). A coefficient of agreement for nominal scales, Educational and Psychological Measurement, 20(1), 37-46.
  • Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 27–29. https://doi.org/10.1.1.110.9154
  • Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1(1), 16-29. https://doi.org/10.1037/1082-989X.1.1.16
  • Efron, B. (1983). Estimating the error rate of a prediction rule: Improvements on crossvalidation. J. Amer. Stat. Ass., 78, 316–331.
  • Egan, J. P. (1975). Signal detection theory and ROC analysis. Academic Press.
  • Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272–299. https://doi.org/10.1037/1082-989X.4.3.272
  • Figueroa, R. L., Zeng-Treitler, Q., Kandula, S., & Ngo, L. H. (2012). Predicting sample size required for classification performance. BMC Medical Informatics and Decision Making, 12(1), 8.
  • Finney, S. J., & DiStefano, C. (2013). Nonnormal and categorical data in structural equation modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed., pp. 439–492). Charlotte, NC: IAP.
  • Fischer, R., & Alfons Karl, J. (2020). The network architecture of individual differences: Personality, reward-sensitivity, and values. Personality and Individual Differences, 160(February), 109922. https://doi.org/10.1016/j.paid.2020.109922
  • Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378-382.
  • Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7(3), 286 299. https://doi.org/10.1037/1040-3590.7.3.286
  • Golino, H. F., & Christensen, A. P. (2020). EGAnet: Exploratory Graph Analysis -- A framework for estimating the number of dimensions in multivariate data using network psychometrics. Retrieved from https://CRAN.R-project.org/package=EGAnet
  • Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLOS ONE, 12(6), 1 26. https://doi.org/10.1371/journal.pone.0174035
  • Golino, H. F., Moulder, R., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., … Boker, S. M. (2020). Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables. Multivariate Behavioral Research, 1–29. https://doi.org/10.1080/00273171.2020.1779642
  • Golino, H. F., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., … Martinez-Molina, A. (2020). Investigating the performance of exploratory graph analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25(3), 292–320. https://doi.org/10.1037/met0000255
  • Goretzko, D., & Bühner, M. (2020). One model to rule them all? Using machine learning algorithms to determine the number of factors in exploratory factor analysis. Psychological Methods, 25(6), 776–786. https://doi.org/10.1037/met0000262
  • Gorsuch, R. L. (1974). Factor analysis. W. B. Saunders.
  • Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103(2), 265–275.
  • Guess, A., Munger, K., Nagler, J., & Tucker, J. (2019). How accurate are survey responses on social media and politics?. Political Communication, 36(2), 241-258.
  • Güre, Ö. B., Kayri, M., & Erdoğan, F. (2020). Analysis of factors effecting PISA 2015 mathematics literacy via educational machine learning. Education and Science, 45(202), 393-415.
  • Grimmer, J. (2015). We are all social scientists now: How big data, machine learning, and causal inference work together. PS, Political Science & Politics, 48(1), 80.
  • Hall, M., Frank, E., Holmes, G., Pfahringer, B., Peter, R., & Witten, I. H. (2009). The WEKA machine learning software: An update. SIGKDD Explorations, 11(1), 10-18.
  • Hamalainen, W., & Vinni, M. (2006). Comparison of machine learning methods for intelligent tutoring systems. In Proceedings of International Conference on Intelligent Tutoring Systems (pp. 525-534). Springer Berlin/Heidelberg.
  • Han, J., J. Pei, & Kamber, M. (2011). Machine learning: Concepts and techniques. Elsevier.
  • Hartmann, D. P. (1977). Considerations in the choice of interobserver reliability estimates. Journal of Applied Behavior Analysis, 10(1), 1311156. https://doi.org/10.1901/jaba.1977.10-103
  • Hegde, J., & Rokseth, B. (2020). Applications of machine learning methods for engineering risk assessment–A review. Safety Science, 122, 104492.
  • Heydari, S. S., & Mountrakis, G. (2018). Effect of classifier selection, reference sample size, reference class distribution and scene heterogeneity in per-pixel classification accuracy using 26 Landsat sites. Remote Sensing of Environment, 204, 648-658.
  • Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2), 179–185. https://doi.org/10.1007/BF02289447
  • Howard, M. C. (2016). A review of exploratory factor analysis decisions and overview of current practices: What we are doing and how can we improve? International Journal of Human Computer Interaction, 32(1), 51 62. https://doi.org/10.1080/10447318.2015.1087664
  • Iantovics, L. B., Rotar, C., & Morar, F. (2019). Survey on establishing the optimal number of factors in exploratory factor analysis applied to machine learning. Wiley Interdisciplinary Reviews: Machine learning and Knowledge Discovery, 9(2), 1 20. https://doi.org/10.1002/widm.1294
  • Ibarguren, I., Pérez, J. M., Muguerza, J., Gurrutxaga, I., & Arbelaitz, O. (2015). Coverage-based resampling: Building robust consolidated decision trees. Knowledge-Based Systems, 79, 51-67. https://doi.org/10.1016/j.knosys.2014.12.023
  • John, G. H., & Langley P. (1995). Estimating continuous distributions in Bayesian classifiers. In P. Besnard & S. Hanks (Eds.), Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (pp. 338–345). San Francisco, Morgan Kaufmann.
  • Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: trends, perspectives, and prospects. Science, 349(6245), 255-260, https://doi.org/10.1126/science.aaa8415
  • Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20(1), 141 151. https://doi.org/10.1177/001316446002000116
  • Kılıç, A. F., & Koyuncu, İ. (2017). Ölçek uyarlama çalışmalarının yapı geçerliği açısından incelenmesi [Examination of scale adaptation studies in terms of construct validity]. In Ö. Demirel & S. Dinçer (Eds.), Küreselleşen dünyada eğitim [Education in a globalizing world] (pp. 1202–1205). Pegem Akademi.
  • Kjellström, S., & Golino, H. (2019). Mining concepts of health responsibility using text mining and exploratory graph analysis. Scandinavian Journal of Occupational Therapy, 26(6), 395–410. https://doi.org/10.1080/11038128.2018.1455896
  • Kline, P. (1994). An easy guide to factor analysis. Routledge.
  • Koyuncu, İ., & Gelbal, S. (2020). Comparison of machine learning classification algorithms on educational data under different conditions. Journal of Measurement and Evaluation in Education and Psychology, 11(4), 325-345.
  • Koyuncu, İ., & Kılıç, A. F. (2019). The use of exploratory and confirmatory factor analyses: A document analysis. Education and Science, 44(198), 361 388. https://doi.org/10.15390/EB.2019.7665
  • Kuhn, M. (2020). caret: Classification and Regression Training. Retrieved from https://cran.r-project.org/package=caret
  • Lachenbruch, P. A., & Mickey, M. R. (1968). Estimation of error rates in discriminant analysis. Technometrics, 10(1), 1-11.
  • Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
  • Landwehr, N., Hall, M., & Frank, E. (2006). Logistic model trees. Kluwer Academic Publishers.
  • Larose, D. T., & Larose, C.D. (2014). Discovering knowledge in data: An introduction to machine learning. John Wiley and Sons.
  • Li, C.-H. (2016a). Confirmatory factor analysis with ordinal data: Comparing robust maximum likelihood and diagonally weighted least squares. Behavior Research Methods, 48(3), 936–949. https://doi.org/10.3758/s13428-015-0619-7
  • Li, C.-H. (2016b). The performance of ML, DWLS, and ULS estimation with robust corrections in structural equation models with ordinal variables. Psychological Methods, 21(3), 369 387. https://doi.org/10.1037/met0000093
  • Li, N., Qi, J., Wang, P., Zhang, X., Zhang, T., & Li, H. (2019). Quantitative structure–activity relationship (QSAR) study of carcinogenicity of polycyclic aromatic hydrocarbons (PAHs) in atmospheric particulate matter by random forest (RF). Analytical Methods, 11(13), 1816-1821.
  • Mele, M., & Magazzino, C. (2020). A machine learning analysis of the relationship among iron and steel industries, air pollution, and economic growth in China. Journal of Cleaner Production, 277, 123293.
  • Minaei-Bidgoli, B., D.A. Kashy, G. Kortemeyer, & W. Punch (2003). Predicting student performance: An application of machine learning methods with an educational web-based system. In Proceedings of 33rd Frontiers in Education Conference, (pp. 13-18). Westminster, CO.
  • Mullainathan, S., & Spiess, J. (2017). Machine learning: an applied econometric approach. Journal of Economic Perspectives, 31(2), 87-106.
  • Nghe, N. T., Janecek, P., & Haddawy, P. (2007). A comparative analysis of techniques for predicting academic performance. In Frontiers in Education Conference-Global Engineering: Knowledge Without Borders, Opportunities Without Passports, (pp. T2G-7). IEEE.
  • Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd. ed.). McGraw-Hill.
  • Osborne, J. W. (2015). What is rotating in exploratory factor analysis? Practical Assessment Research & Evaluation, 20(2), 1–7.
  • Panayiotou, M., Santos, J., Black, L., & Humphrey, N. (2020). Exploring the dimensionality of the social skills improvement system using exploratory graph analysis and bifactor-(S−1) modeling. Assessment, 1-15. https://doi.org/10.1177/1073191120971351
  • Pérez, J. M., Muguerza, J., Arbelaitz, O., Gurrutxaga, I., & Martín, J. I. (2007). Combining multiple class distribution modified subsamples in a single tree. Pattern Recognition Letters, 28(4), 414-422. https://doi.org/10.1016/j.patrec.2006.08.013
  • Pu, Y., Apel, D. B., & Hall, R. (2020). Using machine learning approach for microseismic events recognition in underground excavations: Comparison of ten frequently-used models. Engineering Geology, 268, 105519.
  • Quinlan, J. R. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers, Inc.
  • R Core Team. (2020). R: A Language and Environment for Statistical Computing. Vienna, Austria. Retrieved from https://www.r-project.org/
  • Reich, Y., & Barai, S. V. (1999). Evaluating machine learning models for engineering problems. Artificial Intelligence in Engineering, 13(3), 257-272.
  • Rijsbergen CV. (1979). Information retrieval (2nd ed.). Butterworth.
  • Romero, C., Espejo, P. G., Zafra, A., Romero, J. R., & Ventura, S. (2013). Web usage mining for predicting final marks of students that use Moodle courses. Computer Applications in Engineering Education, 21(1), 135- 146.
  • Romero, C., & Ventura, S. (2013). Machine learning in education. WIREs Machine learning Knowledge Discovery 3(1), 12- 27.
  • Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1–36.
  • Shao, L., Fan, X., Cheng, N., Wu, L., & Cheng, Y. (2013). Determination of minimum training sample size for microarray-based cancer outcome prediction–an empirical assessment. PloS one, 8(7), e68579. https://doi.org/10.1371/journal.pone.0068579
  • Sumner, M., Frank, E., & Hall, M. (2005, October). Speeding up logistic model tree induction. In European conference on principles of machine learning and knowledge discovery (pp. 675-683). Springer, Berlin, Heidelberg.
  • Sun, Y., Kamel, M. S., & Wang, Y. (2006). Boosting for learning multiple classes with imbalanced class distribution. In Sixth international conference on data mining (ICDM'06) (pp. 592-602). IEEE.
  • Tabachnik, B. G., & Fidell, L. S. (2012). Using multivariate statistics (6th ed.). Pearson.
  • Tezbaşaran, E., & Gelbal, S. (2018). Temel bileşenler analizi ve yapay sinir ağı modellerinin ölçek geliştirme sürecinde kullanılabilirliğinin incelenmesi [An investigation on usability of principal component analysis and artificial neural network models in the process of scale development]. Mersin University Journal of the Faculty of Education, 14(1), 225-252.
  • Timmerman, M. E., & Lorenzo-Seva, U. (2011). Dimensionality assessment of ordered polytomous items with parallel analysis. Psychological Methods, 16(2), 209 220. https://doi.org/10.1037/a0023353
  • West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues and applications (pp. 56-75). Sage.
  • Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2017). Machine learning: Practical machine learning tools and techniques (4th Edition). Morgan Kaufmann.
  • Wojna, A., Latkowski, R. (2018): Rseslib 3: Open source library of rough set and machine learning methods. In: Proceedings of the International Joint Conference on Rough Set (LNCS, vol. 11103, pp. 162-176). Springer.
  • Wojna, A., Latkowski, R., Kowalski, (2019). RSESLIB: User guide. Retrieved from http://rseslib.mimuw.edu.pl/rseslib.pdf
  • Zhang, F., & Yang, X. (2020). Improving land cover classification in an urbanized coastal area by random forests: The role of variable selection. Remote Sensing of Environment, 251, 112105. https://doi.org/10.1016/j.rse.2020.112105

Classification of Scale Items with Exploratory Graph Analysis and Machine Learning Methods

Yıl 2021, Cilt: 8 Sayı: 4, 928 - 947, 04.12.2021
https://doi.org/10.21449/ijate.880914

Öz

In exploratory factor analysis, although the researchers decide which items belong to which factors by considering statistical results, the decisions taken sometimes can be subjective in case of having items with similar factor loadings and complex factor structures. The aim of this study was to examine the validity of classifying items into dimensions with exploratory graph analysis (EGA), which has been used in determining the number of dimensions in recent years and machine learning methods. A Monte Carlo simulation was performed with a total number of 96 simulation conditions including average factor loadings, sample size, number of items per dimension, number of dimensions, and distribution of data. Percent correct and Kappa concordance values were used in the evaluation of the methods. When the findings obtained for different conditions were evaluated together, it was seen that the machine learning methods gave results comparable to those of EGA. Machine learning methods showed high performance in terms of percent correct values, especially in small and medium-sized samples. In all conditions where the average factor loading was .70, BayesNet, Naive Bayes, RandomForest, and RseslibKnn methods showed accurate classification performances above 80% like EGA method. BayesNet, Simple Logistic and RBFNetwork methods also demonstrated acceptable or high performance under many conditions. In general, Kappa concordance values also supported these results. The results revealed that machine learning methods can be used for similar conditions to examine whether the distribution of items across factors is done accurately or not.

Kaynakça

  • Aha, D. W., Kibler, D., & Albert, M. K. (1991). Instance-based learning algorithms. Machine Learning 6, 37-66.
  • Akpınar, H. (2014). Veri madenciliği veri analizi [Data mining data analysis]. Papatya Yayınları.
  • Alpaydin, E. (2010). Introduction to machine learning: Adaptive computation and machine learning series. MIT Press.
  • Auerswald, M., & Moshagen, M. (2019). How to determine the number of factors to retain in exploratory factor analysis: A comparison of extraction methods under realistic conditions. Psychological Methods, 24(4), 468–491. https://doi:10.1037/met0000200
  • Azqueta-Gavaldón, A. (2017). Developing news-based economic policy uncertainty index with unsupervised machine learning. Economics Letters, 158, 47-50.
  • Baker, R. S. J. (2010). Machine learning for education. International Encyclopedia of Education, 7(3), 112-118.
  • Baldi, P., & Hornik, K. (1989). Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2, 53-58.
  • Bandalos, D. L., & Leite, W. (2013). Use of Monte Carlo studies in structural equation modeling research. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed.). Information Age.
  • Barker, K., Trafalis, T., & Rhoads, T. R. (2004). Learning from student data. In Proceedings of the 2004 Systems and Information Engineering Design Symposium (pp. 79-86). IEEE.
  • Beauducel, A., & Herzberg, P. Y. (2006). On the performance of maximum likelihood versus means and variance adjusted weighted least squares estimation in CFA. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 186 203. https://doi.org/10.1207/s15328007sem1302_2
  • Beleites, C., Neugebauer, U., Bocklitz, T., Krafft, C., & Popp, J. (2013). Sample size planning for classification models. Analytica Chimica Acta, 760, 25-33.
  • Belvederi Murri, M., Caruso, R., Ounalli, H., Zerbinati, L., Berretti, E., Costa, S., … Grassi, L. (2020). The relationship between demoralization and depressive symptoms among patients from the general hospital: network and exploratory graph analysis: Demoralization and depression symptom network. Journal of Affective Disorders, 276(June), 137–146. https://doi.org/10.1016/j.jad.2020.06.074
  • Berens, J., Schneider, K., Gortz, S., Oster, S., & Burghoff, J. (2019). Early detection of students at risk - predicting student dropouts using administrative student data from German universities and machine learning methods. Journal of Educational Machine learning, 11(3), 1-41. https://doi.org/10.5281/zenodo.3594771
  • Bouckaert, R. R. (2008). Bayesian network classifiers in Weka for Version 3-5-7. Artificial Intelligence Tools, 11(3), 369-387.
  • Bouckaert, R. R., Frank, E., Hall, M., Kirkby, R., Reutemann, P., Seewald, A., & Scuse, D. (2020). WEKA manual for version 3-9-5. University of Waikato.
  • Brain, D., & Webb, G. (1999). On the effect of data set size on bias and variance in classification learning. In Proceedings of the Fourth Australian Knowledge Acquisition Workshop, University of New South Wales (pp. 117-128), December 5-6, Sydney, Australia.
  • Branco, P., Torgo, L., & Ribeiro, R. (2015). A survey of predictive modelling under imbalanced distributions. arXiv preprint arXiv:1505.01658.
  • Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
  • Brown, T. A. (2015). Confirmatory factor analysis for applied research (2nd ed.). The Guilford.
  • Bulut, O., & Yavuz, H. C. (2019). Educational machine learning: A tutorial for the" Rattle" package in R. International Journal of Assessment Tools in Education, 6(5), 20-36.
  • Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1(2), 245–276. https://doi.org/10.1207/s15327906mbr0102_10
  • Chattopadhyay, M., Dan, P. K., & Mazumdar, S. (2011). Principal component analysis and self-organizing map for visual clustering of machine-part cell formation in cellular manufacturing system. In Systems Research Forum (Vol. 5, No. 01, pp. 25-51). World Scientific Publishing Company.
  • Chou, C. P., & Bentler, P. M. (1995). Estimates and tests in structural equation modeling. In Rich H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications. Sage.
  • Chu, C., Hsu, A. L., Chou, K. H., Bandettini, P., Lin, C., & Alzheimer's Disease Neuroimaging Initiative (2012). Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images. Neuroimage, 60(1), 59-70.
  • Cleary, J. G., & Trigg, L. E. (1995). K*: An instance-based learner using an entropic distance measure. In Machine Learning Proceedings 1995 (pp. 108-114). Morgan Kaufmann.
  • Cohen, J. (1960). A coefficient of agreement for nominal scales, Educational and Psychological Measurement, 20(1), 37-46.
  • Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 27–29. https://doi.org/10.1.1.110.9154
  • Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1(1), 16-29. https://doi.org/10.1037/1082-989X.1.1.16
  • Efron, B. (1983). Estimating the error rate of a prediction rule: Improvements on crossvalidation. J. Amer. Stat. Ass., 78, 316–331.
  • Egan, J. P. (1975). Signal detection theory and ROC analysis. Academic Press.
  • Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272–299. https://doi.org/10.1037/1082-989X.4.3.272
  • Figueroa, R. L., Zeng-Treitler, Q., Kandula, S., & Ngo, L. H. (2012). Predicting sample size required for classification performance. BMC Medical Informatics and Decision Making, 12(1), 8.
  • Finney, S. J., & DiStefano, C. (2013). Nonnormal and categorical data in structural equation modeling. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (2nd ed., pp. 439–492). Charlotte, NC: IAP.
  • Fischer, R., & Alfons Karl, J. (2020). The network architecture of individual differences: Personality, reward-sensitivity, and values. Personality and Individual Differences, 160(February), 109922. https://doi.org/10.1016/j.paid.2020.109922
  • Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378-382.
  • Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7(3), 286 299. https://doi.org/10.1037/1040-3590.7.3.286
  • Golino, H. F., & Christensen, A. P. (2020). EGAnet: Exploratory Graph Analysis -- A framework for estimating the number of dimensions in multivariate data using network psychometrics. Retrieved from https://CRAN.R-project.org/package=EGAnet
  • Golino, H. F., & Epskamp, S. (2017). Exploratory graph analysis: A new approach for estimating the number of dimensions in psychological research. PLOS ONE, 12(6), 1 26. https://doi.org/10.1371/journal.pone.0174035
  • Golino, H. F., Moulder, R., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., … Boker, S. M. (2020). Entropy fit indices: New fit measures for assessing the structure and dimensionality of multiple latent variables. Multivariate Behavioral Research, 1–29. https://doi.org/10.1080/00273171.2020.1779642
  • Golino, H. F., Shi, D., Christensen, A. P., Garrido, L. E., Nieto, M. D., Sadana, R., … Martinez-Molina, A. (2020). Investigating the performance of exploratory graph analysis and traditional techniques to identify the number of latent factors: A simulation and tutorial. Psychological Methods, 25(3), 292–320. https://doi.org/10.1037/met0000255
  • Goretzko, D., & Bühner, M. (2020). One model to rule them all? Using machine learning algorithms to determine the number of factors in exploratory factor analysis. Psychological Methods, 25(6), 776–786. https://doi.org/10.1037/met0000262
  • Gorsuch, R. L. (1974). Factor analysis. W. B. Saunders.
  • Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103(2), 265–275.
  • Guess, A., Munger, K., Nagler, J., & Tucker, J. (2019). How accurate are survey responses on social media and politics?. Political Communication, 36(2), 241-258.
  • Güre, Ö. B., Kayri, M., & Erdoğan, F. (2020). Analysis of factors effecting PISA 2015 mathematics literacy via educational machine learning. Education and Science, 45(202), 393-415.
  • Grimmer, J. (2015). We are all social scientists now: How big data, machine learning, and causal inference work together. PS, Political Science & Politics, 48(1), 80.
  • Hall, M., Frank, E., Holmes, G., Pfahringer, B., Peter, R., & Witten, I. H. (2009). The WEKA machine learning software: An update. SIGKDD Explorations, 11(1), 10-18.
  • Hamalainen, W., & Vinni, M. (2006). Comparison of machine learning methods for intelligent tutoring systems. In Proceedings of International Conference on Intelligent Tutoring Systems (pp. 525-534). Springer Berlin/Heidelberg.
  • Han, J., Pei, J., & Kamber, M. (2011). Data mining: Concepts and techniques. Elsevier.
  • Hartmann, D. P. (1977). Considerations in the choice of interobserver reliability estimates. Journal of Applied Behavior Analysis, 10(1), 103–116. https://doi.org/10.1901/jaba.1977.10-103
  • Hegde, J., & Rokseth, B. (2020). Applications of machine learning methods for engineering risk assessment–A review. Safety Science, 122, 104492.
  • Heydari, S. S., & Mountrakis, G. (2018). Effect of classifier selection, reference sample size, reference class distribution and scene heterogeneity in per-pixel classification accuracy using 26 Landsat sites. Remote Sensing of Environment, 204, 648-658.
  • Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30(2), 179–185. https://doi.org/10.1007/BF02289447
  • Howard, M. C. (2016). A review of exploratory factor analysis decisions and overview of current practices: What we are doing and how can we improve? International Journal of Human-Computer Interaction, 32(1), 51–62. https://doi.org/10.1080/10447318.2015.1087664
  • Iantovics, L. B., Rotar, C., & Morar, F. (2019). Survey on establishing the optimal number of factors in exploratory factor analysis applied to data mining. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(2), 1–20. https://doi.org/10.1002/widm.1294
  • Ibarguren, I., Pérez, J. M., Muguerza, J., Gurrutxaga, I., & Arbelaitz, O. (2015). Coverage-based resampling: Building robust consolidated decision trees. Knowledge-Based Systems, 79, 51-67. https://doi.org/10.1016/j.knosys.2014.12.023
  • John, G. H., & Langley, P. (1995). Estimating continuous distributions in Bayesian classifiers. In P. Besnard & S. Hanks (Eds.), Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (pp. 338–345). Morgan Kaufmann.
  • Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260. https://doi.org/10.1126/science.aaa8415
  • Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20(1), 141–151. https://doi.org/10.1177/001316446002000116
  • Kılıç, A. F., & Koyuncu, İ. (2017). Ölçek uyarlama çalışmalarının yapı geçerliği açısından incelenmesi [Examination of scale adaptation studies in terms of construct validity]. In Ö. Demirel & S. Dinçer (Eds.), Küreselleşen dünyada eğitim [Education in a globalizing world] (pp. 1202–1205). Pegem Akademi.
  • Kjellström, S., & Golino, H. (2019). Mining concepts of health responsibility using text mining and exploratory graph analysis. Scandinavian Journal of Occupational Therapy, 26(6), 395–410. https://doi.org/10.1080/11038128.2018.1455896
  • Kline, P. (1994). An easy guide to factor analysis. Routledge.
  • Koyuncu, İ., & Gelbal, S. (2020). Comparison of machine learning classification algorithms on educational data under different conditions. Journal of Measurement and Evaluation in Education and Psychology, 11(4), 325-345.
  • Koyuncu, İ., & Kılıç, A. F. (2019). The use of exploratory and confirmatory factor analyses: A document analysis. Education and Science, 44(198), 361–388. https://doi.org/10.15390/EB.2019.7665
  • Kuhn, M. (2020). caret: Classification and Regression Training. Retrieved from https://cran.r-project.org/package=caret
  • Lachenbruch, P. A., & Mickey, M. R. (1968). Estimation of error rates in discriminant analysis. Technometrics, 10(1), 1-11.
  • Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
  • Landwehr, N., Hall, M., & Frank, E. (2006). Logistic model trees. Kluwer Academic Publishers.
  • Larose, D. T., & Larose, C. D. (2014). Discovering knowledge in data: An introduction to data mining. John Wiley and Sons.
  • Li, C.-H. (2016a). Confirmatory factor analysis with ordinal data: Comparing robust maximum likelihood and diagonally weighted least squares. Behavior Research Methods, 48(3), 936–949. https://doi.org/10.3758/s13428-015-0619-7
  • Li, C.-H. (2016b). The performance of ML, DWLS, and ULS estimation with robust corrections in structural equation models with ordinal variables. Psychological Methods, 21(3), 369–387. https://doi.org/10.1037/met0000093
  • Li, N., Qi, J., Wang, P., Zhang, X., Zhang, T., & Li, H. (2019). Quantitative structure–activity relationship (QSAR) study of carcinogenicity of polycyclic aromatic hydrocarbons (PAHs) in atmospheric particulate matter by random forest (RF). Analytical Methods, 11(13), 1816-1821.
  • Mele, M., & Magazzino, C. (2020). A machine learning analysis of the relationship among iron and steel industries, air pollution, and economic growth in China. Journal of Cleaner Production, 277, 123293.
  • Minaei-Bidgoli, B., Kashy, D. A., Kortemeyer, G., & Punch, W. (2003). Predicting student performance: An application of data mining methods with an educational web-based system. In Proceedings of 33rd Frontiers in Education Conference (pp. 13–18). Westminster, CO.
  • Mullainathan, S., & Spiess, J. (2017). Machine learning: an applied econometric approach. Journal of Economic Perspectives, 31(2), 87-106.
  • Nghe, N. T., Janecek, P., & Haddawy, P. (2007). A comparative analysis of techniques for predicting academic performance. In Frontiers in Education Conference-Global Engineering: Knowledge Without Borders, Opportunities Without Passports (pp. T2G-7). IEEE.
  • Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd. ed.). McGraw-Hill.
  • Osborne, J. W. (2015). What is rotating in exploratory factor analysis? Practical Assessment, Research & Evaluation, 20(2), 1–7.
  • Panayiotou, M., Santos, J., Black, L., & Humphrey, N. (2020). Exploring the dimensionality of the social skills improvement system using exploratory graph analysis and bifactor-(S−1) modeling. Assessment, 1-15. https://doi.org/10.1177/1073191120971351
  • Pérez, J. M., Muguerza, J., Arbelaitz, O., Gurrutxaga, I., & Martín, J. I. (2007). Combining multiple class distribution modified subsamples in a single tree. Pattern Recognition Letters, 28(4), 414-422. https://doi.org/10.1016/j.patrec.2006.08.013
  • Pu, Y., Apel, D. B., & Hall, R. (2020). Using machine learning approach for microseismic events recognition in underground excavations: Comparison of ten frequently-used models. Engineering Geology, 268, 105519.
  • Quinlan, J. R. (1993). C4.5: programs for machine learning. Morgan Kaufmann Publishers, Inc.
  • R Core Team. (2020). R: A Language and Environment for Statistical Computing. Vienna, Austria. Retrieved from https://www.r-project.org/
  • Reich, Y., & Barai, S. V. (1999). Evaluating machine learning models for engineering problems. Artificial Intelligence in Engineering, 13(3), 257-272.
  • Rijsbergen, C. J. van. (1979). Information retrieval (2nd ed.). Butterworth.
  • Romero, C., Espejo, P. G., Zafra, A., Romero, J. R., & Ventura, S. (2013). Web usage mining for predicting final marks of students that use Moodle courses. Computer Applications in Engineering Education, 21(1), 135–146.
  • Romero, C., & Ventura, S. (2013). Data mining in education. WIREs Data Mining and Knowledge Discovery, 3(1), 12–27.
  • Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1–36.
  • Shao, L., Fan, X., Cheng, N., Wu, L., & Cheng, Y. (2013). Determination of minimum training sample size for microarray-based cancer outcome prediction–an empirical assessment. PloS one, 8(7), e68579. https://doi.org/10.1371/journal.pone.0068579
  • Sumner, M., Frank, E., & Hall, M. (2005, October). Speeding up logistic model tree induction. In European conference on principles of machine learning and knowledge discovery (pp. 675-683). Springer, Berlin, Heidelberg.
  • Sun, Y., Kamel, M. S., & Wang, Y. (2006). Boosting for learning multiple classes with imbalanced class distribution. In Sixth international conference on data mining (ICDM'06) (pp. 592-602). IEEE.
  • Tabachnick, B. G., & Fidell, L. S. (2012). Using multivariate statistics (6th ed.). Pearson.
  • Tezbaşaran, E., & Gelbal, S. (2018). Temel bileşenler analizi ve yapay sinir ağı modellerinin ölçek geliştirme sürecinde kullanılabilirliğinin incelenmesi [An investigation on usability of principal component analysis and artificial neural network models in the process of scale development]. Mersin University Journal of the Faculty of Education, 14(1), 225-252.
  • Timmerman, M. E., & Lorenzo-Seva, U. (2011). Dimensionality assessment of ordered polytomous items with parallel analysis. Psychological Methods, 16(2), 209–220. https://doi.org/10.1037/a0023353
  • West, S. G., Finch, J. F., & Curran, P. J. (1995). Structural equation models with nonnormal variables: Problems and remedies. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues and applications (pp. 56-75). Sage.
  • Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2017). Data mining: Practical machine learning tools and techniques (4th ed.). Morgan Kaufmann.
  • Wojna, A., & Latkowski, R. (2018). Rseslib 3: Open source library of rough set and machine learning methods. In Proceedings of the International Joint Conference on Rough Sets (LNCS, vol. 11103, pp. 162–176). Springer.
  • Wojna, A., Latkowski, R., & Kowalski (2019). RSESLIB: User guide. Retrieved from http://rseslib.mimuw.edu.pl/rseslib.pdf
  • Zhang, F., & Yang, X. (2020). Improving land cover classification in an urbanized coastal area by random forests: The role of variable selection. Remote Sensing of Environment, 251, 112105. https://doi.org/10.1016/j.rse.2020.112105
There are 99 references in total.

Details

Primary Language English
Subjects Studies on Education
Section Articles
Authors

İlhan Koyuncu 0000-0002-0009-5279

Abdullah Faruk Kılıç 0000-0003-3129-1763

Publication Date December 4, 2021
Submission Date February 15, 2021
Published in Issue Year 2021 Volume: 8 Issue: 4

Cite

APA Koyuncu, İ., & Kılıç, A. F. (2021). Classification of Scale Items with Exploratory Graph Analysis and Machine Learning Methods. International Journal of Assessment Tools in Education, 8(4), 928-947. https://doi.org/10.21449/ijate.880914
