A Comparison of Classification Performances between the Methods of Logistics Regression and CHAID Analysis in accordance with Sample Size
Abstract
This study examines how the classification performance of logistic regression and CHAID analysis changes with sample size. The dataset was collected with the “Attentional Control Scale,” which was administered to 1824 students; the analyses were run on samples drawn at random from this dataset. Nine classification criteria were defined to evaluate the classification performance of the two methods, and the results were interpreted against these criteria. The classification performance of logistic regression did not change as sample size increased, and logistic regression classified better than CHAID analysis in small samples (N between 25 and 900). The classification performance of CHAID analysis, by contrast, improved as sample size increased and yielded stronger findings in large samples (N = 1000 and above). Moreover, logistic regression yielded more reliable results in classification studies, whereas CHAID analysis provided stronger classifications. These results suggest that researchers should select the method for a classification study based on sample size.
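The resampling design described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not name the nine criteria, so the four computed here (accuracy, sensitivity, specificity, precision) are assumed examples, the data are synthetic stand-ins for real model predictions, and the helper names `draw_subsample` and `classification_criteria` are hypothetical.

```python
import random


def draw_subsample(records, n, seed=0):
    """Draw n records without replacement from the full dataset."""
    rng = random.Random(seed)
    return rng.sample(records, n)


def classification_criteria(y_true, y_pred):
    """Compute a few common criteria from a 2x2 confusion table.

    These four are assumed examples, not the study's nine criteria.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against empty cells in very small samples.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Synthetic stand-in data: 1824 (true label, predicted label) pairs.
# In a real analysis the predictions would come from a fitted model.
rng = random.Random(42)
full = []
for _ in range(1824):
    truth = rng.randint(0, 1)
    pred = truth if rng.random() < 0.8 else 1 - truth
    full.append((truth, pred))

# Illustrative sample sizes; 25, 900 and 1000 appear in the abstract.
for n in (25, 100, 900, 1000, 1824):
    sample = draw_subsample(full, n, seed=n)
    y_true = [t for t, _ in sample]
    y_pred = [p for _, p in sample]
    print(n, classification_criteria(y_true, y_pred))
```

Repeating the loop once per method, with each method's own predictions, would give one set of criteria per method and sample size, which is the kind of comparison the abstract summarizes.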
Details
Primary Language
English
Journal Section
Research Article
Authors
Mehmet Şata
*
0000-0003-2683-4997
Türkiye
Fuat Elkonca
0000-0002-2733-8891
Türkiye
Publication Date
December 30, 2020
Submission Date
May 7, 2020
Acceptance Date
December 8, 2020
Published in Issue
Year 2020 Volume: 7 Number: 2
Cited By
Attitude-Based Segmentation of Residential Self-Selection and Travel Behavior Changes Affected by COVID-19
Future Transportation
https://doi.org/10.3390/futuretransp2020030
The Ultrashort Mental Health Screening Tool Is a Valid and Reliable Measure With Added Value to Support Decision-making
Clinical Orthopaedics & Related Research
https://doi.org/10.1097/CORR.0000000000002718