Research Article
Year 2020, Volume: 41 Issue: 1, 93 - 105, 22.03.2020
https://doi.org/10.17776/csj.544639

Classification of the placement success in the undergraduate placement examination according to decision trees with bagging and boosting methods

Abstract

The purpose of this study is to classify, using the Bagging and Boosting ensemble algorithms, a data set of students from the 81 provinces of Turkey who were placed in universities through the Undergraduate Placement Examination between 2010 and 2013. The data were taken from the archives of TurkStat (Turkish Statistical Institute) and OSYM (Assessment, Selection and Placement Center), and the analyses were carried out in MATLAB. To better evaluate the classification performance of Bagging and Boosting, the students' success rates were split into two groups: provinces above the average were coded as 1 and provinces below the average were coded as 0, which yielded the dependent variable. The Bagging and Boosting ensemble algorithms were then run with this coding. To assess the predictive ability of the two algorithms, the data set was divided into training and testing sets: the data for 2010-2012 were used for training and the data for 2013 were used for testing. Accuracy, precision, recall, and F-measure were used to quantify the performance of the methods. Finally, the results of the Bagging and Boosting methods were compared, and Boosting was found to produce marginally better results than Bagging on all performance measures.
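
The evaluation protocol in the abstract (binary coding around the average, a split by year, and the four performance measures) can be sketched in Python. This is only an illustrative sketch, not the authors' code: the study used MATLAB, and the function names, the `year` field, and the rule that a rate exactly equal to the average is coded 0 are all assumptions made here. The abstract also does not say whether the average is taken per year or over all years, so the helper simply uses whatever list of rates it is given.

```python
from statistics import mean


def code_by_average(rates):
    """Code each rate as 1 if it is above the mean of `rates`, else 0.

    Assumption: a rate exactly equal to the mean is coded 0.
    """
    avg = mean(rates)
    return [1 if r > avg else 0 for r in rates]


def split_by_year(rows, test_year=2013):
    """Split rows into (train, test) using the hypothetical `year` field.

    Rows from years before `test_year` go to training,
    rows from `test_year` itself go to testing.
    """
    train = [r for r in rows if r["year"] < test_year]
    test = [r for r in rows if r["year"] == test_year]
    return train, test


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-measure for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Made-up toy values, purely to show the call pattern.
    rows = [
        {"year": 2011, "rate": 40},
        {"year": 2012, "rate": 70},
        {"year": 2013, "rate": 55},
    ]
    train, test = split_by_year(rows)
    print(len(train), len(test))
    print(code_by_average([row["rate"] for row in rows]))
    print(binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]))
```

The predictions passed to `binary_metrics` would come from whichever Bagging or Boosting model is fitted on the training years; comparing the two resulting dictionaries mirrors the comparison reported in the abstract.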

Details

Primary Language English
Section Natural Sciences
Authors

Tuğba Tuğ Karoğlu 0000-0002-7197-0747

Hayrettin Okut 0000-0003-4084-8404

Publication Date March 22, 2020
Submission Date March 26, 2019
Acceptance Date January 21, 2020
Published in Issue Year 2020, Volume: 41 Issue: 1

Cite

APA Tuğ Karoğlu, T., & Okut, H. (2020). Classification of the placement success in the undergraduate placement examination according to decision trees with bagging and boosting methods. Cumhuriyet Science Journal, 41(1), 93-105. https://doi.org/10.17776/csj.544639