TY  - JOUR
T1  - The Efficiency of Regularization Method on Model Success in Issue Type Prediction Problem
TT  - Sorun Türü Tahmini Probleminde Düzenlileştirme Yönteminin Model Başarısı Üzerindeki Etkisi
AU  - Alsaç, Ali
AU  - Yenisey, Mehmet Mutlu
AU  - Ganiz, Murat Can
AU  - Dağtekin, Mustafa
AU  - Ulusinan, Taner
PY  - 2023
DA  - December
Y2  - 2023
DO  - 10.26650/acin.1394019
JF  - Acta Infologica
JO  - ACIN
PB  - Istanbul University
WT  - DergiPark
SN  - 2602-3563
SP  - 360
EP  - 383
VL  - 7
IS  - 2
LA  - en
AB  - Designing prediction methods with machine learning algorithms and improving their prediction success is among the most important research areas and aims today. Models built with classification algorithms are used frequently, especially for problems that require prediction. In this study, real-life data is used to answer the question of which issue type a customer's solution request should be assigned to in the Information Technology Service Management (ITSM) system. An important step in the search for a solution is to examine the dataset with regularization methods. Experiments with L1 and L2 regularization were carried out to balance overfitting against underfitting on the dataset. While the Root-Mean-Square Error (RMSE) was approximately 0.13 for the regression model without regularization, it fell to approximately 0.083 after L1 regularization. With the regularized dataset, new results were obtained using the Artificial Neural Network (ANN), Logistic Regression (LR), and Support Vector Machine (SVM) classifiers. SVM was the most successful model, with a performance of approximately 0.73, followed by LR and ANN, respectively. Accuracy, precision, recall, and F1-score were used as evaluation metrics. The results indicate that regularization methods, especially when preparing real-life data for machine learning or other artificial intelligence research, contribute to increasing the success level of the model.
KW  - IT service management
KW  - regularization
KW  - prediction
KW  - classification
N2  - Matematik düzleminde bir tahmin yöntemi tasarlamak ve başarılı sonuçlarından faydalanmak günümüzün önemli araştırma alanlarından ve amaçlarından biri olarak öne çıkmaktadır. Sınıflandırma algoritmaları kullanılarak tasarlanan modeller özellikle tahmin gerektiren problem türlerinde sıklıkla kullanılmaktadır. Çalışmada gerçek hayat verileri kullanılarak bir gerçek hayat problemi olan müşteriden gelen çözüm talebinin Bilgi Teknolojisi Hizmet Yönetimi (BTHY) sistemi içinde hangi sorun tipine dahil edilmesi gerektiği sorusuna cevap aranmaktadır. Çözüm arayışının önemli bir aşamasında veri kümesinin Regülarizasyon yöntemleri ile incelenmesi yer almaktadır. L1 ve L2 regülarizasyon yöntemleri ile veri kümesinin overfitting ya da underfitting dengesinin kurulması için deneysel sonuçlar alınmıştır. Regülarizasyon uygulanmamış regresyon modelinde Kök Ortalama Kare Hatası (RMSE) değeri yaklaşık olarak 0,13 iken L1 regülarizasyonu sonucunda bu değer yaklaşık 0,083 olarak bulunmuştur. Düzenlileştirilmiş veri kümesi ile Yapay Sinir Ağları (YSA), Lojistik Regresyon (LR), Destek Vektör Makinaları (DVM) sınıflandırıcı algoritmaları kullanılarak yeni sonuçlar elde edilmiştir. DVM algoritması yaklaşık 0,73 başarım sonucu ile en başarılı model olmuştur. Sırasıyla LR ve YSA takip etmektedir. Değerlendirme metrikleri olarak Accuracy, Precision, Recall ve F1Score kullanılmıştır. Özellikle gerçek hayat verilerinin makina öğrenmesi ya da diğer yapay zeka araştırmalarında kullanımı için hazırlanması aşamasında Regülarizasyon yöntemlerinden faydalanmanın modelin başarı düzeyinin artmasında katkısı olacağı görülmektedir.
CR  - ALAN, A., & KARABATAK, M. (2020). Veri Seti - Sınıflandırma İlişkisinde Performansa Etki Eden Faktörlerin Değerlendirilmesi. Fırat Üniversitesi Mühendislik Bilimleri Dergisi, 32(2). https://doi.org/10.35234/fumbd.738007
CR  - Anderson D, M. G. (1992). Artificial Neural Networks Technology. Kaman Sciences Corporation, 258(6).
CR  - Aran, O., Yildiz, O. T., & Alpaydin, E. (2009). An incremental framework based on cross-validation for estimating the architecture of a multilayer perceptron. International Journal of Pattern Recognition and Artificial Intelligence, 23(2). https://doi.org/10.1142/S0218001409007132
CR  - ARSLAN, H., ÜNEŞ, F., DEMİRCİ, M., TAŞAR, B., & YILMAZ, A. (2020). Keban Baraj Gölü Seviye Değişiminin ANFIS ve Destek Vektör Makineleri ile Tahmini. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 3(2). https://doi.org/10.47495/okufbed.748018
CR  - Bharambe, Prof. P., Bagul, B., Dandekar, S., & Ingle, P. (2022). Used Car Price Prediction using Different Machine Learning Algorithms. International Journal for Research in Applied Science and Engineering Technology, 10(4). https://doi.org/10.22214/yraset.2022.41300
CR  - Bhattacharya, P., Neamtiu, I., & Shelton, C. R. (2012). Automated, highly-accurate, bug assignment using machine learning and tossing graphs. Journal of Systems and Software, 85(10). https://doi.org/10.1016/j.jss.2012.04.053
CR  - ÇELİK, E., DAL, D., & AYDİN, T. (2021). Duygu Analizi İçin Veri Madenciliği Sınıflandırma Algoritmalarının Karşılaştırılması. European Journal of Science and Technology. https://doi.org/10.31590/ejosat.905259
CR  - Cook, D., Dixon, P., Duckworth, W. M., Kaiser, M. S., Koehler, K., Meeker, W. Q., & Stephenson, W. R. (2001). Binary Response and Logistic Regression Analysis. Project Beyond Traditional Statistical Methods, Ml.
CR  - Cortes, C., & Vapnik, V. (1995). Support-Vector Networks. Machine Learning, 20(3). https://doi.org/10.1023/A:1022627411411
CR  - Dangeti, P. (2017). Statistics for Machine Learning: Techniques for exploring supervised, unsupervised, and reinforcement learning models with Python and R. In Packt Publishing.
CR  - Deloitte, & TUBISAD. (2022). Bilgi ve İletişim Teknolojileri Sektörü 2021 Pazar Verileri.
CR  - Doğan, C. (2021). İstatistiksel ve Makine Öğrenme ile Derin Sinir Ağlarında Hiper-Parametre Seçimi İçin Melez Yaklaşım [Yüksek Lisans]. Hacettepe Üniversitesi.
CR  - Domingos, P. (2000). A Unified Bias-Variance Decomposition. Aaai/Iaai.
CR  - Emmert-Streib, F., & Dehmer, M. (2019). High-Dimensional LASSO-Based Computational Regression Models: Regularization, Shrinkage, and Selection. In Machine Learning and Knowledge Extraction (Vol. 1, Issue 1). https://doi.org/10.3390/make1010021
CR  - Friedrich, S., Groll, A., Ickstadt, K., Kneib, T., Pauly, M., Rahnenführer, J., & Friede, T. (2023). Regularization approaches in clinical biostatistics: A review of methods and their applications. In Statistical Methods in Medical Research (Vol. 32, Issue 2). https://doi.org/10.1177/09622802221133557
CR  - Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural Networks and the Bias/Variance Dilemma. Neural Computation, 4(1). https://doi.org/10.1162/neco.1992.4.1.1
CR  - Golam Kibria, B. M., & Banik, S. (2016). Some ridge regression estimators and their performances. Journal of Modern Applied Statistical Methods, 15(1). https://doi.org/10.22237/jmasm/1462075860
CR  - Goldberg, N., & Eckstein, J. (2012). Sparse weighted voting classifier selection and its linear programming relaxations. Information Processing Letters, 112(12). https://doi.org/10.1016/j.ipl.2012.03.004
CR  - Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques. In Data Mining: Concepts and Techniques. https://doi.org/10.1016/C2009-0-61819-5
CR  - Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2010). Multivariate Data Analysis. In Vectors. https://doi.org/10.1016/j.ypharm.2011.02.019
CR  - Hautamaki, V., Kinnunen, T., Sedlak, F., Lee, K. A., Ma, B., & Li, H. (2013). Sparse classifier fusion for speaker verification. IEEE Transactions on Audio, Speech and Language Processing, 21(8). https://doi.org/10.1109/TASL.2013.2256895
CR  - Helming, J., Arndt, H., Hodaie, Z., Koegel, M., & Narayan, N. (2011). Automatic Assignment of Work Items. Communications in Computer and Information Science, 230. https://doi.org/10.1007/978-3-642-23391-3_17
CR  - Jonsson, L., Borg, M., Broman, D., Sandahl, K., Eldh, S., & Runeson, P. (2016). Automated bug assignment: Ensemble-based machine learning in large scale industrial contexts. Empirical Software Engineering, 21(4). https://doi.org/10.1007/s10664-015-9401-9
CR  - Koçoğlu, F. Ö., & Esnaf, Ş. (2022). Machine Learning Approach and Model Performance Evaluation for Tele-Marketing Success Classification. International Journal of Business Analytics, 9(5). https://doi.org/10.4018/yban.298014
CR  - Koçoğlu, F. Ö., & Özcan, T. (2022). A grid search optimized extreme learning machine approach for customer churn prediction. Journal of Engineering Research.
CR  - Kotsilieris, T., Anagnostopoulos, I., & Livieris, I. E. (2022). Special Issue: Regularization Techniques for Machine Learning and Their Applications. In Electronics (Switzerland) (Vol. 11, Issue 4). https://doi.org/10.3390/electronics11040521
CR  - Li, N., & Zhou, Z. H. (2009). Selective ensemble under regularization framework. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5519 LNCS. https://doi.org/10.1007/978-3-642-02326-2_30
CR  - Mantovani, R. G., Horvath, T., Cerri, R., Vanschoren, J., & De Carvalho, A. C. P. L. F. (2017). Hyper-Parameter Tuning of a Decision Tree Induction Algorithm. Proceedings - 2016 5th Brazilian Conference on Intelligent Systems, BRACIS 2016. https://doi.org/10.1109/BRACIS.2016.018
CR  - Mao, S., Xiong, L., Jiao, L. C., Zhang, S., & Chen, B. (2013). Weighted ensemble based on 0-1 matrix decomposition. Electronics Letters, 49(2). https://doi.org/10.1049/el.2012.3528
CR  - Muller, A. C., & Guido, S. (2017). Introduction to Machine Learning with Python: a guide for data scientist. In O’Reilly Media, Inc.
CR  - Orynbassar, A., Sapazhanov, Y., Kadyrov, S., & Lyublinskaya, I. (2022). Application of ROC Curve Analysis for Predicting Students’ Passing Grade in a Course Based on Prerequisite Grades. Mathematics, 10(12). https://doi.org/10.3390/math10122084
CR  - ÖZBİLGİN, F., & KURNAZ, Ç. (2023). Koroner Arter Hastalığının İris Görüntülerinden Yerel İkili Örüntüler ve Yapay Sinir Ağı Kullanılarak Tahmini. Karadeniz Fen Bilimleri Dergisi, 13(2). https://doi.org/10.31466/kfbd.1266996
CR  - Özgür, A., Nar, F., & Erdem, H. (2018). Sparsity-driven weighted ensemble classifier. International Journal of Computational Intelligence Systems, 11(1). https://doi.org/10.2991/ijcis.11.1.73
CR  - Paper, D. (2019). Hands-on Scikit-Learn for Machine Learning Applications: Data Science Fundamentals with Python. In Hands-on Scikit-Learn for Machine Learning Applications: Data Science Fundamentals with Python. https://doi.org/10.1007/978-1-4842-5373-1
CR  - Sahoo, K., Samal, A. K., Pramanik, J., & Pani, S. K. (2019). Exploratory data analysis using python. International Journal of Innovative Technology and Exploring Engineering, 8(12), 4727-4735. https://doi.org/10.35940/jitee.L3591.1081219
CR  - Şen, M. U., & Erdogan, H. (2013). Linear classifier combination and selection using group sparse regularization and hinge loss. Pattern Recognition Letters, 34(3). https://doi.org/10.1016/j.patrec.2012.10.008
CR  - ŞENEL, S., & ALATLI, B. (2014). Lojistik Regresyon Analizinin Kullanıldığı Makaleler Üzerine Bir İnceleme. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 5(1). https://doi.org/10.21031/epod.67169
CR  - Sinha, K., Uddin, Z., Kawsar, H. I., Islam, S., Deen, M. J., & Howlader, M.M.R. (2023). Analyzing chronic disease biomarkers using electrochemical sensors and artificial neural networks. In TrAC - Trends in Analytical Chemistry (Vol. 158). https://doi.org/10.1016/j.trac.2022.116861
CR  - Şipal, B., Ormancı, B. B., & Altınel, A. B. (2022). KELİME ANLAM BULANIKLIĞINI GİDERMEK İÇİN DİFÜZYON REGÜLARİZASYON VE NORMALİZASYON TEKNİKLERİNİN KULLANILMASI. In MÜHENDİSLİK ALANINDA ULUSLARARASI ARAŞTIRMALAR VI (pp. 75-85).
CR  - Tanyildizi, E., & Demirtas, F. (2019). Hiper Parametre Optimizasyonu Hyper Parameter Optimization. 1st International Informatics and Software Engineering Conference: Innovative Technologies for Digital Transformation, IISEC 2019 - Proceedings. https://doi.org/10.1109/UBMYK48245.2019.8965609
CR  - TAZEGÜL, A., YAZARKAN, H., & YERDELEN, C. (2016). İşletmelerin Finansal Başarılı ve Başarısız Olma Durumlarının Veri Madenciliği ve Lojistik Regresyon Analizi İle Tahmin Edilebilirliği. Ege Akademik Bakis (Ege Academic Review), 16(1). https://doi.org/10.21121/eab.2016119960
CR  - Tian, Y., & Zhang, Y. (2022). A comprehensive survey on regularization strategies in machine learning. In Information Fusion (Vol. 80). https://doi.org/10.1016/j.inffus.2021.11.005
CR  - Tinoco, S. L. J. L., Santos, H. G., Menotti, D., Santos, A. B., & Dos Santos, J. A. (2013). Ensemble of classifiers for remote sensed hyperspectral land cover analysis: An approach based on Linear Programming and Weighted Linear Combination. International Geoscience and Remote Sensing Symposium (IGARSS). https://doi.org/10.1109/IGARSS.2013.6723730
CR  - Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2016). Data Mining: Practical Machine Learning Tools and Techniques. In Data Mining: Practical Machine Learning Tools and Techniques.
CR  - YENİSU, E. (2021). Ekonomiyi Harekete Geçiren Kilit Sektörler Nelerdir? Türkiye Üzerine Bir Girdi-Çıktı Analizi. İzmir İktisat Dergisi, 36(4). https://doi.org/10.24988/je.721302
CR  - YETGINLER, B., & ATACAK, İ. (2020). Sentiment Analyses on Movie Reviews using Machine Learning-Based Methods. Artificial Intelligence Studies, 3(2). https://doi.org/10.30855/ais.2020.03.02.01
CR  - Yildiz, M., Alsac, A., Ulusinan, T., Ganiz, M. C., & Yenisey, M. M. (2022). IT Support Ticket Completion Time Prediction. Proceedings - 7th International Conference on Computer Science and Engineering, UBMK 2022. https://doi.org/10.1109/UBMK55850.2022.9919591
CR  - Yin, X. C., Huang, K., Hao, H. W., Iqbal, K., & Wang, Z. Bin. (2012). Classifier ensemble using a heuristic learning with sparsity and diversity. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7664 LNCS(PART 2). https://doi.org/10.1007/978-3-642-34481-7_13
CR  - Yoon, B.L. (1989). Artificial neural network technology. ACM SIGSMALL/PC Notes, 15(3), 3-16. https://doi.org/10.1145/74657.74658
CR  - Zhang, L., & Zhou, W. Da. (2011). Sparse ensembles using weighted combination methods based on linear programming. Pattern Recognition, 44(1). https://doi.org/10.1016/j.patcog.2010.07.021
CR  - Zibran, M.F. (2016). On the effectiveness of labeled latent dirichlet allocation in automatic bug-report categorization. Proceedings - International Conference on Software Engineering. https://doi.org/10.1145/2889160.2892646
UR  - https://doi.org/10.26650/acin.1394019
L1  - https://dergipark.org.tr/en/download/article-file/3551155
ER  -