TY  - JOUR
T1  - PROJE EFOR TAHMİNİ İÇİN MAKİNE ÖĞRENMESİ MODELLERİNİN GELİŞTİRİLMESİ VE SHAP YÖNTEMİ KULLANILARAK AÇIKLANMASI
TT  - DEVELOPMENT OF MACHINE LEARNING MODELS FOR PROJECT EFFORT PREDICTION AND EXPLANATION USING SHAP METHOD
AU  - Görmez, Yasin
AU  - Kaya, Esma Nur
PY  - 2025
DA  - June
Y2  - 2025
DO  - 10.21923/jesd.1604190
JF  - Mühendislik Bilimleri ve Tasarım Dergisi
JO  - MBTD
PB  - Süleyman Demirel University
WT  - DergiPark
SN  - 1308-6693
SP  - 528
EP  - 544
VL  - 13
IS  - 2
LA  - tr
AB  - Businesses today need successful project management to adapt to a digitalizing world. With the growing number of software projects in particular, accurate effort estimation has become a critical process. Effort estimation enables cost optimization by predicting the amount of time and labor required to complete a project. In this study, random forest, decision tree, linear regression, artificial neural network, GradientBoost, and AdaBoost models were developed for project effort estimation. Analyses were performed on six datasets (china_original, cocomonasa_v1, humans2, nasa93, usp05, and usp05-ft) using 50 repeated holdout tests, and the models were compared using mean absolute error, mean squared logarithmic error, coefficient of determination, and mean relative magnitude error. The results showed that the artificial neural network, random forest, decision tree, and GradientBoost models were the most successful on different datasets. The decision tree was judged to be the most successful model for project effort estimation overall. In a further analysis, the developed models were explained using SHAP (SHapley Additive exPlanations), an explainable artificial intelligence method. The explanations showed that, for each dataset, certain features were more influential than others in the models' decision-making.
KW  - Project Management
KW  - Project Effort Estimation
KW  - Machine Learning
KW  - Explainable Artificial Intelligence
KW  - SHAP
N2  - In today’s digitalized world, successful project management has become essential for businesses, with accurate effort estimation emerging as a critical component due to the increasing prevalence of software projects. Effort estimation facilitates cost optimization by predicting the time and labor required for project completion. This study developed and evaluated six regression models—random forest, decision tree, linear regression, neural network, GradientBoost, and AdaBoost—for project effort estimation. Analyses were conducted on six datasets (china_original, cocomonasa_v1, humans2, nasa93, usp05, and usp05-ft) using 50 repeated holdout tests, and model performance was compared using metrics such as mean absolute error, mean squared logarithmic error, coefficient of determination, and mean relative magnitude error. The results demonstrated that artificial neural networks, random forest, decision trees, and GradientBoost models performed most effectively across the datasets, with the decision tree identified as the best-performing model for effort estimation. Furthermore, the study utilized the SHAP (Shapley Additive Explanations) method to interpret the models, revealing that specific attributes were more influential than others in the decision-making process across different datasets.
CR  - Amruthnath, N., & Gupta, T. (2018). A research study on unsupervised machine learning algorithms for early fault detection in predictive maintenance. 2018 5th International Conference on Industrial Engineering and Applications (ICIEA), 355-361. https://doi.org/10.1109/IEA.2018.8387124
CR  - Azzeh, M., & Nassif, A. B. (2016). A hybrid model for estimating software project effort from Use Case Points. Applied Soft Computing, 49, 981-989. https://doi.org/10.1016/j.asoc.2016.05.008
CR  - BaniMustafa, A. (2018). Predicting Software Effort Estimation Using Machine Learning Techniques. 2018 8th International Conference on Computer Science and Information Technology (CSIT), 249-256. https://doi.org/10.1109/CSIT.2018.8486222
CR  - Braga, P. L., Oliveira, A. L. I., Ribeiro, G. H. T., & Meira, S. R. L. (2007). Bagging Predictors for Estimation of Software Project Effort. 2007 International Joint Conference on Neural Networks, 1595-1600. https://doi.org/10.1109/IJCNN.2007.4371196
CR  - Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32. https://doi.org/10.1023/A:1010933404324
CR  - Dragicevic, S., Celar, S., & Turic, M. (2017). Bayesian network model for task effort estimation in agile software development. Journal of Systems and Software, 127, 109-119. https://doi.org/10.1016/j.jss.2017.01.027
CR  - Draper, N. R., & Smith, H. (1998). Applied Regression Analysis. John Wiley & Sons.
CR  - Effort Estimation Datasets. (2024). GitHub. https://github.com/danrodgar/DASE/tree/master/datasets/effortEstimation
CR  - Elish, M. O. (2009). Improved estimation of software project effort using multiple additive regression trees. Expert Systems with Applications, 36(7), 10774-10778. https://doi.org/10.1016/j.eswa.2009.02.013
CR  - Erasmus, I. P., & Daneva, M. (2013). ERP Effort Estimation Based on Expert Judgments. 2013 Joint Conference of the 23rd International Workshop on Software Measurement and the 8th International Conference on Software Process and Product Measurement, 104-109. https://doi.org/10.1109/IWSM-Mensura.2013.25
CR  - Freund, Y., & Schapire, R. E. (1997). A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 55(1), 119-139. https://doi.org/10.1006/jcss.1997.1504
CR  - Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29(5), 1189-1232.
CR  - Gilpin, L. H., Bau, D., Yuan, B. Z., Bajwa, A., Specter, M., & Kagal, L. (2019). Explaining Explanations: An Overview of Interpretability of Machine Learning (arXiv:1806.00069). arXiv. https://doi.org/10.48550/arXiv.1806.00069
CR  - Hameed, S., Elsheikh, Y., & Azzeh, M. (2023). An optimized case-based software project effort estimation using genetic algorithm. Information and Software Technology, 153, 107088. https://doi.org/10.1016/j.infsof.2022.107088
CR  - Haris, M., Chua, F.-F., & Lim, A. H.-L. (2023). An Ensemble-Based Framework to Estimate Software Project Effort. 2023 IEEE 8th International Conference On Software Engineering and Computer Systems (ICSECS), 47-52. https://doi.org/10.1109/ICSECS58457.2023.10256337
CR  - Hosni, M. (2024). Comparative Analysis of Single and Ensemble Support Vector Regression Methods for Software Development Effort Estimation. Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, 509-516. https://doi.org/10.5220/0013072300003838
CR  - Jorgensen, M. (2005). Practical guidelines for expert-judgment-based software effort estimation. IEEE Software, 22(3), 57-63. https://doi.org/10.1109/MS.2005.73
CR  - Kassaymeh, S., Alweshah, M., Al-Betar, M. A., Hammouri, A. I., & Al-Ma’aitah, M. A. (2024). Software effort estimation modeling and fully connected artificial neural network optimization using soft computing techniques. Cluster Computing, 27(1), 737-760. https://doi.org/10.1007/s10586-023-03979-y
CR  - Kim, J.-H. (2009). Estimating classification error rate: Repeated cross-validation, repeated hold-out and bootstrap. Computational Statistics & Data Analysis, 53(11), 3735-3745. https://doi.org/10.1016/j.csda.2009.04.009
CR  - Kitchenham, B., & Mendes, E. (2004). Software productivity measurement using multiple size measures. IEEE Transactions on Software Engineering, 30(12), 1023-1035. https://doi.org/10.1109/TSE.2004.104
CR  - Kök, İ. (2024). Açıklanabilir Yapay Zekaya Dayalı Müşteri Kaybı Analizi ve Elde Tutma Önerisi. Mühendislik Bilimleri ve Araştırmaları Dergisi, 6(1), Article 1. https://doi.org/10.46387/bjesr.1344414
CR  - LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444. https://doi.org/10.1038/nature14539
CR  - Lipton, Z. C. (2017). The Mythos of Model Interpretability (arXiv:1606.03490). arXiv. https://doi.org/10.48550/arXiv.1606.03490
CR  - Lundberg, S. M., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems, 30. https://proceedings.neurips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html
CR  - Molnar, C. (2020). Interpretable Machine Learning. Lulu.com.
CR  - Mukherjee, S., & Malu, R. K. (2014). Optimization of project effort estimate using neural network. 2014 IEEE International Conference on Advanced Communications, Control and Computing Technologies, 406-410. https://doi.org/10.1109/ICACCCT.2014.7019474
CR  - Mustafa, E. I., & Osman, R. (2024). A random forest model for early-stage software effort estimation for the SEERA dataset. Information and Software Technology, 169, 107413. https://doi.org/10.1016/j.infsof.2024.107413
CR  - Özgür, A. S., Tarhan, Ç., Komesli, M., & Tecim, V. (2023). Yapay Zeka Teknikleri Kullanılarak Proje Üretim Sistemlerinin Tasarımı ve Geliştirilmesi. Journal of Information Systems and Management Research, 5(1), Article 1. https://doi.org/10.59940/jismar.1214440
CR  - Plumb, G., Molitor, D., & Talwalkar, A. S. (2018). Model Agnostic Supervised Local Explanations. Advances in Neural Information Processing Systems, 31. https://proceedings.neurips.cc/paper_files/paper/2018/hash/b495ce63ede0f4efc9eec62cb947c162-Abstract.html
CR  - Pospieszny, P., Czarnacka-Chrobot, B., & Kobylinski, A. (2018). An effective approach for software project effort and duration estimation with machine learning algorithms. Journal of Systems and Software, 137, 184-196. https://doi.org/10.1016/j.jss.2017.11.066
CR  - Qi, F., Jing, X.-Y., Zhu, X., Xie, X., Xu, B., & Ying, S. (2017). Software effort estimation based on open source projects: Case study of Github. Information and Software Technology, 92, 145-157. https://doi.org/10.1016/j.infsof.2017.07.015
CR  - Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81-106. https://doi.org/10.1007/BF00116251
CR  - Ritu, & Bhambri, P. (2023). Software Effort Estimation with Machine Learning – A Systematic Literature Review. In Agile Software Development (pp. 291-308). John Wiley & Sons, Ltd. https://doi.org/10.1002/9781119896838.ch15
CR  - scikit-learn: Machine learning in Python. (2024). https://scikit-learn.org/stable/
CR  - Seber, G. A. F., & Lee, A. J. (2012). Linear Regression Analysis. John Wiley & Sons.
CR  - SHAP. (2024). https://shap.readthedocs.io/en/latest/
CR  - Sharma, S., & Vijayvargiya, S. (2021). Applying Soft Computing Techniques for Software Project Effort Estimation Modelling. In V. Nath & J. K. Mandal (Eds.), Nanoelectronics, Circuits and Communication Systems (pp. 211-227). Springer. https://doi.org/10.1007/978-981-15-7486-3_21
CR  - Sharma, S., & Vijayvargiya, S. (2022). Modeling of software project effort estimation: A comparative performance evaluation of optimized soft computing-based methods. International Journal of Information Technology, 14(5), 2487-2496. https://doi.org/10.1007/s41870-022-00962-5
CR  - Sharma, S., & Vijayvargiya, S. (2023). An Optimized Neuro-Fuzzy Network for Software Project Effort Estimation. IETE Journal of Research, 69(10), 6855-6866. https://doi.org/10.1080/03772063.2022.2027282
CR  - Shepperd, M., Schofield, C., & Kitchenham, B. (1996). Effort estimation using analogy. Proceedings of IEEE 18th International Conference on Software Engineering, 170-178. https://doi.org/10.1109/ICSE.1996.493413
CR  - Şengüneş, B., & Öztürk, N. (2023). An Artificial Neural Network Model for Project Effort Estimation. Systems, 11(2), Article 2. https://doi.org/10.3390/systems11020091
CR  - Tsunoda, M., Monden, A., Keung, J., & Matsumoto, K. (2012). Incorporating Expert Judgment into Regression Models of Software Effort Estimation. 2012 19th Asia-Pacific Software Engineering Conference, 1, 374-379. https://doi.org/10.1109/APSEC.2012.58
CR  - Tuncer, Y. (2024). Artificial Intelligence Based Risk Analysis in Project Management [M.Eng.]. https://www.proquest.com/docview/3143984193/abstract/4ABE168365614041PQ/1
CR  - Walkerden, F., & Jeffery, R. (1999). An Empirical Study of Analogy-based Software Effort Estimation. Empirical Software Engineering, 4(2), 135-158. https://doi.org/10.1023/A:1009872202035
CR  - Willmott, C. J., & Matsuura, K. (2005). Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Climate Research, 30(1), 79-82. https://doi.org/10.3354/cr030079
UR  - https://doi.org/10.21923/jesd.1604190
L1  - https://dergipark.org.tr/en/download/article-file/4452875
ER  -