TY - JOUR
T1 - PROJE EFOR TAHMİNİ İÇİN MAKİNE ÖĞRENMESİ MODELLERİNİN GELİŞTİRİLMESİ VE SHAP YÖNTEMİ KULLANILARAK AÇIKLANMASI
TT - DEVELOPMENT OF MACHINE LEARNING MODELS FOR PROJECT EFFORT PREDICTION AND EXPLANATION USING SHAP METHOD
AU - Görmez, Yasin
AU - Kaya, Esma Nur
PY - 2025
DA - June
Y2 - 2025
DO - 10.21923/jesd.1604190
JF - Mühendislik Bilimleri ve Tasarım Dergisi
JO - MBTD
PB - Süleyman Demirel University
WT - DergiPark
SN - 1308-6693
SP - 528
EP - 544
VL - 13
IS - 2
LA - tr
AB - Today, businesses need successful project management to adapt to a digitalizing world. With the growth of software projects in particular, accurate effort estimation has become a critical process. Effort estimation helps optimize costs by predicting the amount of time and labor required to complete a project. In this study, random forest, decision tree, linear regression, artificial neural network, GradientBoost, and AdaBoost methods were developed for project effort estimation. Analyses were carried out on six datasets (china_original, cocomonasa_v1, humans2, nasa93, usp05, and usp05-ft) using 50 repeated holdout runs, and the models were compared using the mean absolute error, mean squared logarithmic error, coefficient of determination, and mean magnitude of relative error metrics. According to the results, the artificial neural network, random forest, decision tree, and GradientBoost models were the most successful models on different datasets. For project effort estimation, the decision tree was concluded to be the most successful model. In a further analysis, the developed models were explained using SHAP (SHapley Additive exPlanations), an explainable artificial intelligence method. These explanations showed that, for each dataset, some features were more influential than others in the models' decision-making.
KW - Project Management
KW - Project Effort Estimation
KW - Machine Learning
KW - Explainable Artificial Intelligence
KW - SHAP
N2 - In today’s digitalized world, successful project management has become essential for businesses, with accurate effort estimation emerging as a critical component due to the increasing prevalence of software projects. Effort estimation facilitates cost optimization by predicting the time and labor required for project completion. This study developed and evaluated six regression models (random forest, decision tree, linear regression, neural network, GradientBoost, and AdaBoost) for project effort estimation. Analyses were conducted on six datasets (china_original, cocomonasa_v1, humans2, nasa93, usp05, and usp05-ft) using 50 repeated holdout tests, and model performance was compared using mean absolute error, mean squared logarithmic error, coefficient of determination, and mean magnitude of relative error. The results demonstrated that artificial neural networks, random forest, decision trees, and GradientBoost models performed most effectively across the datasets, with the decision tree identified as the best-performing model for effort estimation. Furthermore, the study utilized the SHAP (SHapley Additive exPlanations) method to interpret the models, revealing that specific attributes were more influential than others in the decision-making process across different datasets.
CR - Amruthnath, N., & Gupta, T. (2018). A research study on unsupervised machine learning algorithms for early fault detection in predictive maintenance. 2018 5th International Conference on Industrial Engineering and Applications (ICIEA), 355-361. https://doi.org/10.1109/IEA.2018.8387124
CR - Azzeh, M., & Nassif, A. B. (2016). A hybrid model for estimating software project effort from Use Case Points. Applied Soft Computing, 49, 981-989. https://doi.org/10.1016/j.asoc.2016.05.008
CR - BaniMustafa, A. (2018). Predicting Software Effort Estimation Using Machine Learning Techniques. 2018 8th International Conference on Computer Science and Information Technology (CSIT), 249-256. https://doi.org/10.1109/CSIT.2018.8486222
CR - Braga, P. L., Oliveira, A. L. I., Ribeiro, G. H. T., & Meira, S. R. L. (2007). Bagging Predictors for Estimation of Software Project Effort. 2007 International Joint Conference on Neural Networks, 1595-1600. https://doi.org/10.1109/IJCNN.2007.4371196
CR - Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324
CR - Dragicevic, S., Celar, S., & Turic, M. (2017). Bayesian network model for task effort estimation in agile software development. Journal of Systems and Software, 127, 109-119. https://doi.org/10.1016/j.jss.2017.01.027
CR - Draper, N. R., & Smith, H. (1998). Applied Regression Analysis. John Wiley & Sons.
CR - Effort Estimation Datasets. (2024). GitHub. https://github.com/danrodgar/DASE/tree/master/datasets/effortEstimation
CR - Elish, M. O. (2009). Improved estimation of software project effort using multiple additive regression trees. Expert Systems with Applications, 36(7), 10774-10778. https://doi.org/10.1016/j.eswa.2009.02.013
CR - Erasmus, I. P., & Daneva, M. (2013). ERP Effort Estimation Based on Expert Judgments. 2013 Joint Conference of the 23rd International Workshop on Software Measurement and the 8th International Conference on Software Process and Product Measurement, 104-109. https://doi.org/10.1109/IWSM-Mensura.2013.25
CR - Freund, Y., & Schapire, R. E. (1997). A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 55(1), 119-139. https://doi.org/10.1006/jcss.1997.1504
CR - Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29(5), 1189-1232.
CR - Gilpin, L. H., Bau, D., Yuan, B. Z., Bajwa, A., Specter, M., & Kagal, L. (2019). Explaining Explanations: An Overview of Interpretability of Machine Learning (arXiv:1806.00069). arXiv. https://doi.org/10.48550/arXiv.1806.00069
CR - Hameed, S., Elsheikh, Y., & Azzeh, M. (2023). An optimized case-based software project effort estimation using genetic algorithm. Information and Software Technology, 153, 107088. https://doi.org/10.1016/j.infsof.2022.107088
CR - Haris, M., Chua, F.-F., & Lim, A. H.-L. (2023). An Ensemble-Based Framework to Estimate Software Project Effort. 2023 IEEE 8th International Conference On Software Engineering and Computer Systems (ICSECS), 47-52. https://doi.org/10.1109/ICSECS58457.2023.10256337
CR - Hosni, M. (2024). Comparative Analysis of Single and Ensemble Support Vector Regression Methods for Software Development Effort Estimation. Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, 509-516. https://doi.org/10.5220/0013072300003838
CR - Jorgensen, M. (2005). Practical guidelines for expert-judgment-based software effort estimation. IEEE Software, 22(3), 57-63. https://doi.org/10.1109/MS.2005.73
CR - Kassaymeh, S., Alweshah, M., Al-Betar, M. A., Hammouri, A. I., & Al-Ma’aitah, M. A. (2024). Software effort estimation modeling and fully connected artificial neural network optimization using soft computing techniques. Cluster Computing, 27(1), 737-760. https://doi.org/10.1007/s10586-023-03979-y
CR - Kim, J.-H. (2009). Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap. Computational Statistics & Data Analysis, 53(11), 3735-3745. https://doi.org/10.1016/j.csda.2009.04.009
CR - Kitchenham, B., & Mendes, E. (2004). Software productivity measurement using multiple size measures. IEEE Transactions on Software Engineering, 30(12), 1023-1035. https://doi.org/10.1109/TSE.2004.104
CR - Kök, İ. (2024). Açıklanabilir Yapay Zekaya Dayalı Müşteri Kaybı Analizi ve Elde Tutma Önerisi. Mühendislik Bilimleri ve Araştırmaları Dergisi, 6(1), Article 1. https://doi.org/10.46387/bjesr.1344414
CR - LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444. https://doi.org/10.1038/nature14539
CR - Lipton, Z. C. (2017). The Mythos of Model Interpretability (arXiv:1606.03490). arXiv. https://doi.org/10.48550/arXiv.1606.03490
CR - Lundberg, S. M., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems, 30. https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html
CR - Molnar, C. (2020). Interpretable Machine Learning. Lulu.com.
CR - Mukherjee, S., & Malu, R. K. (2014). Optimization of project effort estimate using neural network. 2014 IEEE International Conference on Advanced Communications, Control and Computing Technologies, 406-410. https://doi.org/10.1109/ICACCCT.2014.7019474
CR - Mustafa, E. I., & Osman, R. (2024). A random forest model for early-stage software effort estimation for the SEERA dataset. Information and Software Technology, 169, 107413. https://doi.org/10.1016/j.infsof.2024.107413
CR - Özgür, A. S., Tarhan, Ç., Komesli, M., & Tecim, V. (2023). Yapay Zeka Teknikleri Kullanılarak Proje Üretim Sistemlerinin Tasarımı ve Geliştirilmesi. Journal of Information Systems and Management Research, 5(1), Article 1. https://doi.org/10.59940/jismar.1214440
CR - Plumb, G., Molitor, D., & Talwalkar, A. S. (2018). Model Agnostic Supervised Local Explanations. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/b495ce63ede0f4efc9eec62cb947c162-Abstract.html
CR - Pospieszny, P., Czarnacka-Chrobot, B., & Kobylinski, A. (2018). An effective approach for software project effort and duration estimation with machine learning algorithms. Journal of Systems and Software, 137, 184-196. https://doi.org/10.1016/j.jss.2017.11.066
CR - Qi, F., Jing, X.-Y., Zhu, X., Xie, X., Xu, B., & Ying, S. (2017). Software effort estimation based on open source projects: Case study of Github. Information and Software Technology, 92, 145-157. https://doi.org/10.1016/j.infsof.2017.07.015
CR - Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81-106. https://doi.org/10.1007/BF00116251
CR - Ritu, & Bhambri, P. (2023). Software Effort Estimation with Machine Learning – A Systematic Literature Review. In Agile Software Development (pp. 291-308). John Wiley & Sons, Ltd. https://doi.org/10.1002/9781119896838.ch15
CR - scikit-learn: Machine learning in Python. (2024). https://scikit-learn.org/stable/
CR - Seber, G. A. F., & Lee, A. J. (2012). Linear Regression Analysis. John Wiley & Sons.
CR - SHAP. (2024). https://shap.readthedocs.io/en/latest/
CR - Sharma, S., & Vijayvargiya, S. (2021). Applying Soft Computing Techniques for Software Project Effort Estimation Modelling. In V. Nath & J. K. Mandal (Eds.), Nanoelectronics, Circuits and Communication Systems (pp. 211-227). Springer. https://doi.org/10.1007/978-981-15-7486-3_21
CR - Sharma, S., & Vijayvargiya, S. (2022). Modeling of software project effort estimation: A comparative performance evaluation of optimized soft computing-based methods. International Journal of Information Technology, 14(5), 2487-2496. https://doi.org/10.1007/s41870-022-00962-5
CR - Sharma, S., & Vijayvargiya, S. (2023). An Optimized Neuro-Fuzzy Network for Software Project Effort Estimation. IETE Journal of Research, 69(10), 6855-6866. https://doi.org/10.1080/03772063.2022.2027282
CR - Shepperd, M., Schofield, C., & Kitchenham, B. (1996). Effort estimation using analogy. Proceedings of IEEE 18th International Conference on Software Engineering, 170-178. https://doi.org/10.1109/ICSE.1996.493413
CR - Şengüneş, B., & Öztürk, N. (2023). An Artificial Neural Network Model for Project Effort Estimation. Systems, 11(2), Article 2. https://doi.org/10.3390/systems11020091
CR - Tsunoda, M., Monden, A., Keung, J., & Matsumoto, K. (2012). Incorporating Expert Judgment into Regression Models of Software Effort Estimation. 2012 19th Asia-Pacific Software Engineering Conference, 1, 374-379. https://doi.org/10.1109/APSEC.2012.58
CR - Tuncer, Y. (2024). Artificial Intelligence Based Risk Analysis in Project Management [M.Eng.]. https://www.proquest.com/docview/3143984193/abstract/4ABE168365614041PQ/1
CR - Walkerden, F., & Jeffery, R. (1999). An Empirical Study of Analogy-based Software Effort Estimation. Empirical Software Engineering, 4(2), 135-158. https://doi.org/10.1023/A:1009872202035
CR - Willmott, C. J., & Matsuura, K. (2005). Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Research, 30(1), 79-82. https://doi.org/10.3354/cr030079
UR - https://doi.org/10.21923/jesd.1604190
L1 - https://dergipark.org.tr/en/download/article-file/4452875
ER -