Estimation of Occupational Accidents in the Turkish Metal Industry with the Random Forest Algorithm
                                    
                                 
                                
                                    
Year 2023, Volume: 13 Issue: 3, 1983 - 1997, 01.09.2023
                                
                                                                                                                        
                                                                                                                                                
Ekin Karakaya Özkan, Hasan Basri Ulaş
                                                                                                            
                                                
                                                
                                                                                                                                                                                            
                                                                    
                                
                                
                                                                    
                                        Abstract
The aim of this study is to develop a prediction algorithm with a machine learning (ML) method, using national data on fatal and limb-loss occupational accidents that occurred in the metal sector between 2013 and 2018 and were recorded by the Ministry of Labor and Social Security (ÇSGB). Classifying and predicting the causes of occupational accidents in detail is necessary to reduce accidents. In the literature, various ML algorithms have been used to investigate accident-related factors and to build effective prediction models with the aim of reducing occupational accidents. In this study, the Random Forest (RF) algorithm, one of the ML methods, was used to predict the causes and consequences of occupational accidents. 10-fold cross-validation was used to validate the model, and the accuracy of the model was increased by 4.7%. The accuracy of the RF algorithm was found to be 0.9172. Recursive Feature Elimination (RFE) was used to analyze the important factors affecting the causes of occupational accidents in the metal sector, and the most important features were found to be the secondary hazard source of the accident, lost working days, and the deviation code of the accident cause.
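To make the validation step concrete, the following is a minimal sketch of 10-fold cross-validation around a bagged tree ensemble. It is not the authors' implementation: the forest here is a deliberately simplified stand-in built from depth-1 trees (stumps), each trained on a bootstrap sample with a random feature subset, and all function names (`fit_stump`, `fit_forest`, `predict_forest`, `k_fold_indices`, `cross_val_accuracy`) as well as the toy data are hypothetical and unrelated to the ministry's accident records.

```python
import random
from collections import Counter


def fit_stump(X, y, features):
    """Fit a depth-1 tree: the single threshold split with the fewest errors."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(lab != l_lab for lab in left)
            errors += sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no valid split: always predict the majority class
        majority = Counter(y).most_common(1)[0][0]
        return (features[0], float("inf"), majority, majority)
    return best[1:]


def fit_forest(X, y, n_trees=15, seed=0):
    """Simplified forest: bootstrap rows, random feature subset, one stump each."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = max(1, int(d ** 0.5))  # number of features tried per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(d), m)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]


def k_fold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(X, y, k=10, seed=0):
    """Train on k-1 folds, score on the held-out fold, average the k accuracies."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        forest = fit_forest([X[i] for i in train], [y[i] for i in train], seed=seed)
        correct = sum(predict_forest(forest, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


# Toy, well-separated two-class data (illustrative only).
X = [[i * 0.01, i * 0.02] for i in range(50)]
X += [[10 + i * 0.01, 10 + i * 0.02] for i in range(50)]
y = [0] * 50 + [1] * 50

print("mean 10-fold accuracy:", cross_val_accuracy(X, y, k=10))
```

Every record is held out exactly once, so the averaged score reflects performance on data the model did not see during training. In practice a full random forest from an established library would replace the stump ensemble.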
                                     
                                
                                                                                                    
                                
                                                                
                                                                
                                
                                                                
                                                                    
                                        References
                                        
                                            
                                                                                                    - 
                                                        Aci, C., & Ozden, C. (2018). Predicting the Severity of Motor Vehicle Accident Injuries in Adana-Turkey Using Machine Learning Methods and Detailed Meteorological Data. International Journal of Intelligent Systems and Applications in Engineering, 6(1), 72-79. doi:10.18201/ijisae.2018637934
- 
                                                        Alizadeh, S. S., Mortazavi, S. B., & Mehdi Sepehri, M. (2015). Assessment of accident severity in the construction industry using the Bayesian theorem. International Journal of Occupational Safety and Ergonomics, 21(4), 551-557. doi:10.1080/10803548.2015.1095546
- 
                                                        Amiri, M., Ardeshir, A., Fazel Zarandi, M. H., ve Soltanaghaei, E. (2016). Pattern Extraction For High-Risk Accidents In The Construction Industry: A Data-Mining Approach. International Journal Of Injury Control And Safety Promotion, 23(3), 264-276. doi:10.1080/17457300.2015.1032979
- 
                                                        Andriyas, S., ve McKee, M. (2013). Recursive Partitioning Techniques For Modeling Irrigation Behavior. Environmental Modelling & Software, 47, 207-217. doi:https://doi.org/10.1016/j.envsoft.2013.05.011
- 
                                                        Anyfantis, I., Leka, S., Reniers, G., ve Boustras, G. (2021). Employers’ Perceived Importance And The Use (Or Non-Use) Of Workplace Risk Assessment In Micro-Sized And Small Enterprises In Europe With Focus On Cyprus. Safety Science, 139, 105256. doi:10.1016/j.ssci.2021.105256
- 
                                                        Ayhan, B. U., ve Tokdemir, O. B. (2019). Predicting The Outcome of Construction Incidents. Safety Science, 113, 91-104. doi:https://doi.org/10.1016/j.ssci.2018.11.001
- 
                                                        Azadi, S., ve Karimi-Jashni, A. (2016). Verifying The Performance of Artificial Neural Network And Multiple Linear Regression In Predicting The Mean Seasonal Municipal Solid Waste Generation Rate: A Case Study Of Fars Province, Iran. Waste Management, 48, 14-23. doi:https://doi.org/10.1016/j.wasman.2015.09.034
- 
                                                        Bazargan, M., ve Guzhva, V. S. (2011). Impact Of Gender, Age and Experience Of Pilots On General Aviation Accidents. Accident Analysis & Prevention, 43(3), 962-970. doi:https://doi.org/10.1016/j.aap.2010.11.023
- 
                                                        Bevilacqua, M., Ciarapica, F. E., ve Giacchetta, G. (2008). Industrial And Occupational Ergonomics in The Petrochemical Process Industry: A Regression Trees Approach. Accident Analysis & Prevention, 40(4), 1468-1479. doi:https://doi.org/10.1016/j.aap.2008.03.012
- 
                                                        Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32. doi:10.1023/A:1010933404324
- 
                                                        Brown, D. E. (2016). Text Mining the Contributors to Rail Accidents. IEEE Transactions on Intelligent Transportation Systems, 17(2), 346-355. doi:10.1109/TITS.2015.2472580
- 
                                                        Cheng, C.-W., Leu, S.-S., Cheng, Y.-M., Wu, T.-C., ve Lin, C.-C. (2012). Applying Data Mining Techniques To Explore Factors Contributing To Occupational Injuries In Taiwan's Construction Industry. Accident Analysis & Prevention, 48, 214-222. doi:https://doi.org/10.1016/j.aap.2011.04.014
- 
                                                        Chiang, Y.-H., Wong, F., ve Liang, S. (2018). Fatal Construction Accidents in Hong Kong. Journal of Construction Engineering and Management, 144. doi:10.1061/(ASCE)CO.1943-7862.0001433
- 
                                                        Commission, E. (2012). European Statistics on Accidents at Work (ESAW) — Summary methodology. In E. Commission (Ed.). Luxembourg Publications Office of the European Union.
- 
                                                        Freund, Y., ve Schapire, R. E. (1996). Experiments With A New Boosting Algorithm. Paper presented at the icml.
- 
                                                        Friedman, J. (2000). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29. doi:10.1214/aos/1013203451
- 
                                                        Fuentes-Bargues, J. L., Sánchez-Lite, A., González-Gaya, C., Victor Fco, R.-P., ve Reniers, G. (2022). A study of situational circumstances related to Spain’s occupational accident rates in the metal sector from 2009 to 2019. Safety Science, 150, 105700. doi:https://doi.org/10.1016/j.ssci.2022.105700
- 
                                                        Garre, A., Ruiz, M. C., ve Hontoria, E. (2020). Application Of Machine Learning To Support Production Planning Of A Food Industry In The Context Of Waste Generation Under Uncertainty. Operations Research Perspectives, 7, 100147. doi:https://doi.org/10.1016/j.orp.2020.100147
- 
                                                        Ghodrati, N., Yiu, T. W., Wilkinson, S., ve Shahbazpour, M. (2018). A New Approach To Predict Safety Outcomes In The Construction Industry. Safety Science, 109, 86-94. doi:https://doi.org/10.1016/j.ssci.2018.05.016
- 
                                                        Goh, Y. M., ve Ubeynarayana, C. (2017). Construction Accident Narrative Classification: An Evaluation Of Text Mining Techniques. Accident; Analysis and Prevention, 108, 122-130. doi:10.1016/j.aap.2017.08.026
Gregoriades, A., ve Mouskos, K. C. (2013). Black Spots Identification Through A Bayesian Networks Quantification Of Accident Risk Index. Transportation Research Part C: Emerging Technologies, 28, 28-43. doi:https://doi.org/10.1016/j.trc.2012.12.008
- 
                                                        Gu, Q., Zhu, L., ve Cai, Z. (2009, 2009//). Evaluation Measures of the Classification Performance of Imbalanced Data Sets. Paper presented at the Computational Intelligence and Intelligent Systems, Berlin, Heidelberg.
- 
                                                        Gulhan, B., Ilhan, M., ve Civil, E. (2012). Occupational Accidents And Affecting Factors Of Metal Industry In A Factory In Ankara. Turkish Journal of Public Health, 10.
- 
                                                        Güllüoğlu, E., ve Güllüoğlu, A. (2019). Türkiye’de Metal Sektöründe Meydana Gelen İş Kazalarının Analizi. International Journal of Advances in Engineering and Pure Sciences. doi:10.7240/jeps.486478
- 
                                                        Guyon, I., Weston, J., Barnhill, S., ve Vapnik, V. (2002). Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning, 46(1), 389-422. doi:10.1023/A:1012487302797
- 
                                                        He, X., Chen, W., Nie, B., ve Zhang, M. (2010). Classification Technique For Danger Classes Of Coal And Gas Outburst In Deep Coal Mines. Safety Science, 48(2), 173-178. doi:https://doi.org/10.1016/j.ssci.2009.07.007
- 
                                                        ILO. (2023). Safety and health at work.
- 
                                                        İş Sağlığı ve Güvenliği Bilgi Yönetim Sistemi. Retrieved from https://ibys.csgb.gov.tr/
- 
                                                        Jahangiri, M., Solukloei, H. R. J., ve Kamalinia, M. (2019). A Neuro-Fuzzy Risk Prediction Methodology For Falling From Scaffold. Safety Science, 117, 88-99. doi:https://doi.org/10.1016/j.ssci.2019.04.009
- 
                                                        Jana, D. K., Pramanik, S., Sahoo, P., ve Mukherjee, A. (2019). Interval Type-2 Fuzzy Logic And Its Application To Occupational Safety Risk Performance In Industries. Soft Computing, 23(2), 557-567. doi:10.1007/s00500-017-2860-8
- 
                                                        Jeong, H., Jang, Y., Bowman, P. J., ve Masoud, N. (2018). Classification Of Motor Vehicle Crash Injury Severity: A Hybrid Approach For Imbalanced Data. Accident Analysis & Prevention, 120, 250-261. doi:https://doi.org/10.1016/j.aap.2018.08.025
- 
                                                        Jiang, L., Xie, Y., ve Ren, T. (2019). Modelling Highly Unbalanced Crash Injury Severity Data By Ensemble Methods And Global Sensitivity Analysis. Paper presented at the Proceedings of the Transportation Research Board 98th Annual Meeting, Washington, DC, USA.
Kang, K., ve Ryu, H. (2019). Predicting Types Of Occupational Accidents At Construction Sites In Korea Using Random Forest Model. Safety Science, 120, 226-236. doi:https://doi.org/10.1016/j.ssci.2019.06.034
- 
                                                        Karacasu, M., Ergül, B., ve Altin Yavuz, A. (2014). Estimating The Causes of Traffic Accidents Using Logistic Regression And Discriminant Analysis. International Journal of Injury Control And Safety Promotion, 21(4), 305-313. doi:10.1080/17457300.2013.815632
- 
                                                        Karlaftis, M. G., ve Golias, I. (2002). Effects Of Road Geometry And Traffic Volumes On Rural Roadway Accident Rates. Accident Analysis & Prevention, 34(3), 357-365. doi:https://doi.org/10.1016/S0001-4575(01)00033-1
- 
                                                        Kifle, M., Engdaw, D., Alemu, K., Sharma, H. R., Amsalu, S., Feleke, A., ve Worku, W. (2014). Work Related Injuries And Associated Risk Factors Among Iron And Steel Industries Workers In Addis Ababa, Ethiopia. Safety Science, 63, 211-216. doi:https://doi.org/10.1016/j.ssci.2013.11.020
- 
                                                        Lantz, B., Machine Learning with R. 2013: Packt Publishing.
- 
                                                        Leu, S.-S., ve Chang, C.-M. (2013). Bayesian-Network-Based Safety Risk Assessment For Steel Construction Projects. Accident Analysis & Prevention, 54, 122-133. doi:https://doi.org/10.1016/j.aap.2013.02.019
- 
                                                        Li, G., Baker, S. P., Grabowski, J. G., Qiang, Y., McCarthy, M. L., ve Rebok, G. W. (2003). Age, Flight Experience, and Risk of Crash Involvement in a Cohort of Professional Pilots. American Journal of Epidemiology, 157(10), 874-880. doi:10.1093/aje/kwg071
- 
                                                        Li, J., Gao, F., Lin, S., Guo, M., Li, Y., Liu, H., Wen, Q. (2023). Quantum k-fold Cross-Validation for Nearest Neighbor Classification Algorithm. Physica A: Statistical Mechanics and its Applications, 611, 128435. doi:https://doi.org/10.1016/j.physa.2022.128435
- 
                                                        Li, L., Ching, W.-K., ve Liu, Z.-P. (2022). Robust Biomarker Screening From Gene Expression Data By Stable Machine Learning-Recursive Feature Elimination Methods. Computational Biology and Chemistry, 100, 107747. doi:10.1016/j.compbiolchem.2022.107747
- 
                                                        Lindberg, A.-K., Hansson, S. O., ve Rollenhagen, C. (2010). Learning from Accidents – What More Do We Need to Know?. Safety Science, 48, 714-721. doi:10.1016/j.ssci.2010.02.004
- 
                                                        Mafi, S., AbdelRazig, Y., ve Doczy, R. (2018). Machine Learning Methods to Analyze Injury Severity of Drivers from Different Age and Gender Groups. Transportation Research Record, 2672(38), 171-183. doi:10.1177/0361198118794292
- 
                                                        Matías, J. M., Rivas, T., Martín, J. E., ve Taboada, J. (2008). A Machine Learning Methodology For The Analysis Of Workplace Accidents. International Journal of Computer Mathematics, 85(3-4), 559-578. doi:10.1080/00207160701297346
- 
                                                        Meng, Q., & Weng, J. (2011). A Genetic Algorithm Approach To Assessing Work Zone Casualty Risk. Safety Science, 49(8), 1283-1288. doi:https://doi.org/10.1016/j.ssci.2011.05.001
- 
                                                        Mıstıkoğlu, G., Gerek, I. H., Erdis, E., Mumtaz Usmen, P. E., Cakan, H., ve Kazan, E. E. (2015). Decision tree analysis of construction fall accidents involving Roofers. Expert Systems with Applications, 42(4), 2256-2263. doi:https://doi.org/10.1016/j.eswa.2014.10.009
- 
                                                        Mining, E., Machine Learning for Beginners: A Complete and Phased Beginner's Guide to Learning and Understanding Machine Learning and Artificial Intelligence. 2020: Everooks Limited.
- 
                                                        Nazaripour, E., Halvani, G., Jahangiri, M., Fallahzadeh, H., ve Mohammadzadeh, M. (2018). Safety Performance Evaluation In A Steel Industry: A Short-Term Time Series Approach. Safety Science, 110, 285-290. doi:https://doi.org/10.1016/j.ssci.2018.08.028
- 
                                                        Nishimoto, T., Mukaigawa, K., Tominaga, S., Lubbe, N., Kiuchi, T., Motomura, T., ve Matsumoto, H. (2017). Serious Injury Prediction Algorithm Based On Large-Scale Data And Under-Triage Control. Accident Analysis & Prevention, 98, 266-276. doi:https://doi.org/10.1016/j.aap.2016.09.028
- 
                                                        Pacal, I. (2023). Göğüs Röntgeni Görüntülerinden Otomatik COVID-19 Teşhisi için Görü Transformatörüne Dayalı Bir Yaklaşım . Journal of the Institute of Science and Technology , 13 (2) , 778-791 . DOI: 10.21597/jist.1225156
- 
                                                        Palei, S. K., ve Das, S. K. (2009). Logistic Regression Model For Prediction Of Roof Fall Risks In Bord And Pillar Workings In Coal Mines: An Approach. Safety Science, 47(1), 88-96. doi:https://doi.org/10.1016/j.ssci.2008.01.002
- 
                                                        Park, J., Cho, C., Cho, Y., ve Kim, K. (2018). Data-Driven Monitoring System for Preventing the Collapse of Scaffolding Structures. Journal of Construction Engineering and Management, 144. doi:10.1061/(ASCE)CO.1943-7862.0001535
- 
                                                        Persona, A., Battini, D., Faccio, M., Bevilacqua, M., ve Ciarapica, F. E. (2006). Classification Of Occupational Injury Cases Using The Regression Tree Approach. International Journal of Reliability, Quality and Safety Engineering, 13(2), 171-191. doi:10.1142/S0218539306002197
- 
                                                        Rivas, T., Paz, M., Martín, J. E., Matías, J. M., García, J. F., ve Taboada, J. (2011). Explaining And Predicting Workplace Accidents Using Data-Mining Techniques. Reliability Engineering & System Safety, 96(7), 739-747. doi:https://doi.org/10.1016/j.ress.2011.03.006
- 
                                                        Sahay, A., Essentials of Data Science and Analytics: Statistical Tools, Machine Learning, and R-Statistical Software Overview. 2021: Business Expert Press.
- 
                                                        Sakhakarmi, S., Park, J., ve Cho, C. (2019). Enhanced Machine Learning Classification Accuracy for Scaffolding Safety Using Increased Features. Journal of Construction Engineering and Management, 145. doi:10.1061/(ASCE)CO.1943-7862.0001601
- 
                                                        Salguero-Caparros, F., Suarez-Cebador, M., ve Rubio-Romero, J. C. (2015). Analysis Of Investigation Reports On Occupational Accidents. Safety Science, 72, 329-336. doi:https://doi.org/10.1016/j.ssci.2014.10.005
- 
                                                        Sánchez, A., Riesgo Fernández, P., Sánchez-Lasheras, F., de Cos Juez, F., ve Garcia Nieto, P. J. (2011). Prediction Of Work-Related Accidents According To Working Conditions Using Support Vector Machines. Applied Mathematics and Computation, 218, 3539-3552. doi:10.1016/j.amc.2011.08.100
- 
                                                        Sanmiquel, L., Rossell, J. M., ve Vintró, C. (2015). Study Of Spanish Mining Accidents Using Data Mining Techniques. Safety Science, 75, 49-55. doi:https://doi.org/10.1016/j.ssci.2015.01.016
- 
                                                        Santos, K., Dias, J. P., ve Amado, C. (2022). A Literature Review Of Machine Learning Algorithms For Crash Injury Severity Prediction. Journal of Safety Research, 80, 254-269. doi:https://doi.org/10.1016/j.jsr.2021.12.007
- 
                                                        Shanthi, S., ve Ramani, R. G. (2012). Feature Relevance Analysis And Classification Of Road Traffic Accident Data Through Data Mining Techniques. Proceedings of The World Congress on Engineering and Computer Science, 1, 24-26.
- 
                                                        Shao, B., Hu, Z., Liu, Q., Chen, S., ve He, W. (2019). Fatal Accident Patterns Of Building Construction Activities In China. Safety Science, 111, 253-263. doi:https://doi.org/10.1016/j.ssci.2018.07.019
- 
                                                        Siddiqui, C., Abdel-Aty, M., ve Huang, H. (2012). Aggregate Nonparametric Safety Analysis Of Traffic Zones. Accident Analysis & Prevention, 45, 317-325. doi:https://doi.org/10.1016/j.aap.2011.07.019
- 
                                                        SGK, (2017). SGK İstatistik Yıllıkları.
- 
                                                        Strobl, C., Boulesteix, A. L., Kneib, T., Augustin, T., & Zeileis, A. (2008). Conditional Variable Importance For Random Forests. BMC Bioinformatics, 9. doi:10.1186/1471-2105-9-307
- 
                                                        Tang, J., Liang, J., Han, C., Li, Z., ve Huang, H. (2019). Crash Injury Severity Analysis Using A Two-Layer Stacking Framework. Accident Analysis & Prevention, 122, 226-238. doi:https://doi.org/10.1016/j.aap.2018.10.016
- 
                                                        Tixier, A. J. P., Hallowell, M. R., Rajagopalan, B., ve Bowman, D. (2016). Application Of Machine Learning To Construction Injury Prediction. Automation in Construction, 69, 102-114. doi:https://doi.org/10.1016/j.autcon.2016.05.016
- 
                                                        Umer, M., Sadiq, S., Ishaq, A., Ullah, D. S., Saher, N., ve Madni, H. (2020). Comparison Analysis of Tree Based and Ensembled Regression Algorithms for Traffic Accident Severity Prediction.
- 
                                                        Usman, T., Fu, L., ve Miranda-Moreno, L. F. (2016). Injury Severity Analysis: Comparison Of Multilevel Logistic Regression Models And Effects Of Collision Data Aggregation. Journal of Modern Transportation, 24(1), 73-87. doi:10.1007/s40534-016-0096-4
- 
                                                        Vu, L., Ng, K., Richter, A., ve An, C. (2022). Analysis Of Input Set Characteristics And Variances On K-Fold Cross Validation For A Recurrent Neural Network Model On Waste Disposal Rate Estimation. Journal of Environmental Management, 311, 114869. doi:10.1016/j.jenvman.2022.114869
- 
                                                        Wang, J., Liu, B., Fu, T., Liu, S., ve Stipancic, J. (2019). Modeling When And Where A Secondary Accident Occurs. Accident Analysis & Prevention, 130, 160-166. doi:https://doi.org/10.1016/j.aap.2018.01.024
- 
                                                        Wang, Z., Lai, C., Chen, X., Yang, B., Zhao, S., ve Bai, X. (2015). Flood Hazard Risk Assessment Model Based On Random Forest. Journal of Hydrology, 527, 1130-1141. doi:https://doi.org/10.1016/j.jhydrol.2015.06.008
- 
                                                        Weng, J., Meng, Q., ve Wang, D. Z. W. (2012). Tree-Based Logistic Regression Approach for Work Zone Casualty Risk Assessment. Risk analysis: an official publication of the Society for Risk Analysis, 33. doi:10.1111/j.1539-6924.2012.01879.x
- 
                                                        Veziroğlu, E. , Pacal, I. & Coşkunçay, A. (2023). Derin Evrişimli Sinir Ağları Kullanılarak Pirinç Hastalıklarının Sınıflandırılması . Journal of the Institute of Science and Technology , 13 (2) , 792-814 . DOI: 10.21597/jist.1265769
- 
                                                        Yağımlı, M., ve İzci, F. (2017). Türkiye’de Makine ve Teçhizatı Hariç Fabrikasyon Metal Ürünleri İmalatı Sektöründe Yaşanan İş Kazaları ve Ölümlü İş Kazası Sayılarının Tahmini. Karaelmas İş Sağlığı ve Güvenliği Dergisi, 1, 9-15. doi:10.33720/kisgd.322546
- 
                                                        Yan, X., Radwan, E., ve Abdel-Aty, M. (2005). Characteristics Of Rear-End Accidents At Signalized Intersections Using Multiple Logistic Regression Model. Accident Analysis & Prevention, 37(6), 983-995. doi:https://doi.org/10.1016/j.aap.2005.05.001
- 
                                                        Yannis, G., Papadimitriou, E., Dupont, E., ve Martensen, H. (2010). Estimation of Fatality and Injury Risk by Means of In-Depth Fatal Accident Investigation Data. Traffic Injury Prevention, 11, 492-502. doi:10.1080/15389588.2010.492536
- 
                                                        Yeoum, S., ve Lee, Y. (2013). A Study On Prediction Modeling Of Korea Millitary Aircraft Accident Occurrence. The International Journal of Industrial Engineering: Theory, Applications and Practice, 20, 562-573.
- 
                                                        Yi, W., Chan, A. P. C., Wang, X., ve Wang, J. (2016). Development Of An Early-Warning System For Site Work In Hot And Humid Environments: A Case Study. Automation in Construction, 62, 101-113. doi:https://doi.org/10.1016/j.autcon.2015.11.003
- 
                                                        Zhang, J., Li, Z., Pu, Z., ve Xu, C. (2018). Comparing Prediction Performance for Crash Injury Severity Among Various Machine Learning and Statistical Methods. IEEE Access, 6, 60079-60087. doi:10.1109/ACCESS.2018.2874979
- 
                                                        Zhen, X., Ning, Y., Du, W., ve Huang, Y. (2023). An interpretable and augmented machine-learning approach for causation analysis of major accident indicators in the offshore petroleum Industry. Process Safety and Environmental Protection. doi:https://doi.org/10.1016/j.psep.2023.03.063
 
                                     
                                                             
                                                                                
                                
                                    
                                    
                                                                                Estimation of Occupational Accidents in the Turkish Metal Industry with Random Forest Algorithm
                                    
                                 
                                
                                    
                                                                                                            
                                                
                                                
                                                                                                                                                                                            
                                                                    
                                
                                
                                                                    
                                        Abstract
The aim of this study is to develop a predictive model using machine learning (ML) to identify the causes of fatalities and amputations in the metal sector, based on occupational accident data collected by the Turkish Ministry of Labor and Social Security (MLSS) from 2013 to 2018. Classifying and predicting the causes of occupational accidents in detail is necessary to prevent such accidents. Researchers have used ML algorithms to investigate correlated factors and create effective prediction models in an effort to reduce occupational accidents. In this study, we used random forest (RF), one of the ML algorithms, to predict the causes and consequences of occupational accidents. We used 10-fold cross-validation for model validation, which increased the accuracy of the algorithm by 4.7%. The accuracy of RF was 0.9172. We extracted the important factors that affect the causes of occupational accidents in the metal sector using Recursive Feature Elimination (RFE), and found that the most important factors are the secondary cause of the accident, days lost, and deviation.
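The feature-ranking step can be sketched as follows. This is an illustration of the general RFE loop (fit a model, drop the feature with the smallest importance, refit on the rest), not the study's code: as a stand-in for the RF importances used in the paper, it ranks features by the absolute weights of a small logistic regression trained by gradient descent. The function names `fit_logistic` and `rfe`, the hyperparameters, and the synthetic data are all assumptions made for the example, and the features are assumed to be on comparable scales.

```python
import math
import random


def fit_logistic(X, y, epochs=200, lr=0.1):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rfe(X, y, n_keep):
    """Recursive feature elimination.

    Repeatedly refits the model on the remaining features and removes the one
    with the smallest absolute weight, until n_keep features are left.
    Returns (kept feature indices, eliminated indices in order of removal).
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        sub = [[row[j] for j in remaining] for row in X]
        w, _ = fit_logistic(sub, y)
        weakest = min(range(len(remaining)), key=lambda j: abs(w[j]))
        eliminated.append(remaining.pop(weakest))
    return remaining, eliminated


# Synthetic data (illustrative only): feature 0 carries the class signal,
# features 1 and 2 are pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
X = [[(2 * label - 1) + rng.gauss(0, 0.3), rng.gauss(0, 1), rng.gauss(0, 1)]
     for label in y]

kept, eliminated = rfe(X, y, n_keep=1)
print("kept:", kept, "eliminated in order:", eliminated)
```

Refitting after each removal is what distinguishes RFE from a one-shot ranking: the importance of a feature is always judged relative to the features still in the model.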
                                     
                                
                                                                                                    
                                
                                                                
                                                                
                                
                                                                
                                                                    
- 
                                                        Bevilacqua, M., Ciarapica, F. E., ve Giacchetta, G. (2008). Industrial And Occupational Ergonomics in The Petrochemical Process Industry: A Regression Trees Approach. Accident Analysis & Prevention, 40(4), 1468-1479. doi:https://doi.org/10.1016/j.aap.2008.03.012
- 
                                                        Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32. doi:10.1023/A:1010933404324
- 
                                                        Brown, D. E. (2016). Text Mining the Contributors to Rail Accidents. IEEE Transactions on Intelligent Transportation Systems, 17(2), 346-355. doi:10.1109/TITS.2015.2472580
- 
                                                        Cheng, C.-W., Leu, S.-S., Cheng, Y.-M., Wu, T.-C., ve Lin, C.-C. (2012). Applying Data Mining Techniques To Explore Factors Contributing To Occupational Injuries In Taiwan's Construction Industry. Accident Analysis & Prevention, 48, 214-222. doi:https://doi.org/10.1016/j.aap.2011.04.014
- 
                                                        Chiang, Y.-H., Wong, F., ve Liang, S. (2018). Fatal Construction Accidents in Hong Kong. Journal of Construction Engineering and Management, 144. doi:10.1061/(ASCE)CO.1943-7862.0001433
- 
                                                        Commission, E. (2012). European Statistics on Accidents at Work (ESAW) — Summary methodology. In E. Commission (Ed.). Luxembourg Publications Office of the European Union.
- 
                                                        Freund, Y., ve Schapire, R. E. (1996). Experiments With A New Boosting Algorithm. Paper presented at the icml.
- 
                                                        Friedman, J. (2000). Greedy Function Approximation: A Gradient Boosting Machine. The Annals of Statistics, 29. doi:10.1214/aos/1013203451
- 
                                                        Fuentes-Bargues, J. L., Sánchez-Lite, A., González-Gaya, C., Victor Fco, R.-P., ve Reniers, G. (2022). A study of situational circumstances related to Spain’s occupational accident rates in the metal sector from 2009 to 2019. Safety Science, 150, 105700. doi:https://doi.org/10.1016/j.ssci.2022.105700
- 
                                                        Garre, A., Ruiz, M. C., ve Hontoria, E. (2020). Application Of Machine Learning To Support Production Planning Of A Food Industry In The Context Of Waste Generation Under Uncertainty. Operations Research Perspectives, 7, 100147. doi:https://doi.org/10.1016/j.orp.2020.100147
- 
                                                        Ghodrati, N., Yiu, T. W., Wilkinson, S., ve Shahbazpour, M. (2018). A New Approach To Predict Safety Outcomes In The Construction Industry. Safety Science, 109, 86-94. doi:https://doi.org/10.1016/j.ssci.2018.05.016
- 
                                                        Goh, Y. M., ve Ubeynarayana, C. (2017). Construction Accident Narrative Classification: An Evaluation Of Text Mining Techniques. Accident; Analysis and Prevention, 108, 122-130. doi:10.1016/j.aap.2017.08.026
Gregoriades, A., ve Mouskos, K. C. (2013). Black Spots Identification Through A Bayesian Networks Quantification Of Accident Risk Index. Transportation Research Part C: Emerging Technologies, 28, 28-43. doi:https://doi.org/10.1016/j.trc.2012.12.008
- 
                                                        Gu, Q., Zhu, L., ve Cai, Z. (2009, 2009//). Evaluation Measures of the Classification Performance of Imbalanced Data Sets. Paper presented at the Computational Intelligence and Intelligent Systems, Berlin, Heidelberg.
- 
                                                        Gulhan, B., Ilhan, M., ve Civil, E. (2012). Occupational Accidents And Affecting Factors Of Metal Industry In A Factory In Ankara. Turkish Journal of Public Health, 10.
- 
                                                        Güllüoğlu, E., ve Güllüoğlu, A. (2019). Türkiye’de Metal Sektöründe Meydana Gelen İş Kazalarının Analizi. International Journal of Advances in Engineering and Pure Sciences. doi:10.7240/jeps.486478
- 
                                                        Guyon, I., Weston, J., Barnhill, S., ve Vapnik, V. (2002). Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning, 46(1), 389-422. doi:10.1023/A:1012487302797
- 
                                                        He, X., Chen, W., Nie, B., ve Zhang, M. (2010). Classification Technique For Danger Classes Of Coal And Gas Outburst In Deep Coal Mines. Safety Science, 48(2), 173-178. doi:https://doi.org/10.1016/j.ssci.2009.07.007
- 
                                                        ILO. (2023). Safety and health at work.
- 
                                                        İş Sağlığı ve Güvenliği Bilgi Yönetim Sistemi. Retrieved from https://ibys.csgb.gov.tr/
- 
                                                        Jahangiri, M., Solukloei, H. R. J., ve Kamalinia, M. (2019). A Neuro-Fuzzy Risk Prediction Methodology For Falling From Scaffold. Safety Science, 117, 88-99. doi:https://doi.org/10.1016/j.ssci.2019.04.009
- 
                                                        Jana, D. K., Pramanik, S., Sahoo, P., ve Mukherjee, A. (2019). Interval Type-2 Fuzzy Logic And Its Application To Occupational Safety Risk Performance In Industries. Soft Computing, 23(2), 557-567. doi:10.1007/s00500-017-2860-8
- 
                                                        Jeong, H., Jang, Y., Bowman, P. J., ve Masoud, N. (2018). Classification Of Motor Vehicle Crash Injury Severity: A Hybrid Approach For Imbalanced Data. Accident Analysis & Prevention, 120, 250-261. doi:https://doi.org/10.1016/j.aap.2018.08.025
- 
                                                        Jiang, L., Xie, Y., ve Ren, T. (2019). Modelling Highly Unbalanced Crash Injury Severity Data By Ensemble Methods And Global Sensitivity Analysis. Paper presented at the Proceedings of the Transportation Research Board 98th Annual Meeting, Washington, DC, USA.
Kang, K., ve Ryu, H. (2019). Predicting Types Of Occupational Accidents At Construction Sites In Korea Using Random Forest Model. Safety Science, 120, 226-236. doi:https://doi.org/10.1016/j.ssci.2019.06.034
- 
                                                        Karacasu, M., Ergül, B., ve Altin Yavuz, A. (2014). Estimating The Causes of Traffic Accidents Using Logistic Regression And Discriminant Analysis. International Journal of Injury Control And Safety Promotion, 21(4), 305-313. doi:10.1080/17457300.2013.815632
- 
                                                        Karlaftis, M. G., ve Golias, I. (2002). Effects Of Road Geometry And Traffic Volumes On Rural Roadway Accident Rates. Accident Analysis & Prevention, 34(3), 357-365. doi:https://doi.org/10.1016/S0001-4575(01)00033-1
- 
                                                        Kifle, M., Engdaw, D., Alemu, K., Sharma, H. R., Amsalu, S., Feleke, A., ve Worku, W. (2014). Work Related Injuries And Associated Risk Factors Among Iron And Steel Industries Workers In Addis Ababa, Ethiopia. Safety Science, 63, 211-216. doi:https://doi.org/10.1016/j.ssci.2013.11.020
- 
                                                        Lantz, B., Machine Learning with R. 2013: Packt Publishing.
- 
                                                        Leu, S.-S., ve Chang, C.-M. (2013). Bayesian-Network-Based Safety Risk Assessment For Steel Construction Projects. Accident Analysis & Prevention, 54, 122-133. doi:https://doi.org/10.1016/j.aap.2013.02.019
- 
                                                        Li, G., Baker, S. P., Grabowski, J. G., Qiang, Y., McCarthy, M. L., ve Rebok, G. W. (2003). Age, Flight Experience, and Risk of Crash Involvement in a Cohort of Professional Pilots. American Journal of Epidemiology, 157(10), 874-880. doi:10.1093/aje/kwg071
- 
                                                        Li, J., Gao, F., Lin, S., Guo, M., Li, Y., Liu, H., Wen, Q. (2023). Quantum k-fold Cross-Validation for Nearest Neighbor Classification Algorithm. Physica A: Statistical Mechanics and its Applications, 611, 128435. doi:https://doi.org/10.1016/j.physa.2022.128435
- 
                                                        Li, L., Ching, W.-K., ve Liu, Z.-P. (2022). Robust Biomarker Screening From Gene Expression Data By Stable Machine Learning-Recursive Feature Elimination Methods. Computational Biology and Chemistry, 100, 107747. doi:10.1016/j.compbiolchem.2022.107747
- 
                                                        Lindberg, A.-K., Hansson, S. O., ve Rollenhagen, C. (2010). Learning from Accidents – What More Do We Need to Know?. Safety Science, 48, 714-721. doi:10.1016/j.ssci.2010.02.004
- 
                                                        Mafi, S., AbdelRazig, Y., ve Doczy, R. (2018). Machine Learning Methods to Analyze Injury Severity of Drivers from Different Age and Gender Groups. Transportation Research Record, 2672(38), 171-183. doi:10.1177/0361198118794292
- 
                                                        Matías, J. M., Rivas, T., Martín, J. E., ve Taboada, J. (2008). A Machine Learning Methodology For The Analysis Of Workplace Accidents. International Journal of Computer Mathematics, 85(3-4), 559-578. doi:10.1080/00207160701297346
- 
                                                        Meng, Q., & Weng, J. (2011). A Genetic Algorithm Approach To Assessing Work Zone Casualty Risk. Safety Science, 49(8), 1283-1288. doi:https://doi.org/10.1016/j.ssci.2011.05.001
- 
                                                        Mıstıkoğlu, G., Gerek, I. H., Erdis, E., Mumtaz Usmen, P. E., Cakan, H., ve Kazan, E. E. (2015). Decision tree analysis of construction fall accidents involving Roofers. Expert Systems with Applications, 42(4), 2256-2263. doi:https://doi.org/10.1016/j.eswa.2014.10.009
- 
                                                        Mining, E., Machine Learning for Beginners: A Complete and Phased Beginner's Guide to Learning and Understanding Machine Learning and Artificial Intelligence. 2020: Everooks Limited.
- 
                                                        Nazaripour, E., Halvani, G., Jahangiri, M., Fallahzadeh, H., ve Mohammadzadeh, M. (2018). Safety Performance Evaluation In A Steel Industry: A Short-Term Time Series Approach. Safety Science, 110, 285-290. doi:https://doi.org/10.1016/j.ssci.2018.08.028
- 
                                                        Nishimoto, T., Mukaigawa, K., Tominaga, S., Lubbe, N., Kiuchi, T., Motomura, T., ve Matsumoto, H. (2017). Serious Injury Prediction Algorithm Based On Large-Scale Data And Under-Triage Control. Accident Analysis & Prevention, 98, 266-276. doi:https://doi.org/10.1016/j.aap.2016.09.028
- 
                                                        Pacal, I. (2023). Göğüs Röntgeni Görüntülerinden Otomatik COVID-19 Teşhisi için Görü Transformatörüne Dayalı Bir Yaklaşım . Journal of the Institute of Science and Technology , 13 (2) , 778-791 . DOI: 10.21597/jist.1225156
- 
                                                        Palei, S. K., ve Das, S. K. (2009). Logistic Regression Model For Prediction Of Roof Fall Risks In Bord And Pillar Workings In Coal Mines: An Approach. Safety Science, 47(1), 88-96. doi:https://doi.org/10.1016/j.ssci.2008.01.002
- 
                                                        Park, J., Cho, C., Cho, Y., ve Kim, K. (2018). Data-Driven Monitoring System for Preventing the Collapse of Scaffolding Structures. Journal of Construction Engineering and Management, 144. doi:10.1061/(ASCE)CO.1943-7862.0001535
- 
                                                        Persona, A., Battini, D., Faccio, M., Bevilacqua, M., ve Ciarapica, F. E. (2006). Classification Of Occupational Injury Cases Using The Regression Tree Approach. International Journal of Reliability, Quality and Safety Engineering, 13(2), 171-191. doi:10.1142/S0218539306002197
- 
                                                        Rivas, T., Paz, M., Martín, J. E., Matías, J. M., García, J. F., ve Taboada, J. (2011). Explaining And Predicting Workplace Accidents Using Data-Mining Techniques. Reliability Engineering & System Safety, 96(7), 739-747. doi:https://doi.org/10.1016/j.ress.2011.03.006
- 
                                                        Sahay, A., Essentials of Data Science and Analytics: Statistical Tools, Machine Learning, and R-Statistical Software Overview. 2021: Business Expert Press.
- 
                                                        Sakhakarmi, S., Park, J., ve Cho, C. (2019). Enhanced Machine Learning Classification Accuracy for Scaffolding Safety Using Increased Features. Journal of Construction Engineering and Management, 145. doi:10.1061/(ASCE)CO.1943-7862.0001601
- 
                                                        Salguero-Caparros, F., Suarez-Cebador, M., ve Rubio-Romero, J. C. (2015). Analysis Of Investigation Reports On Occupational Accidents. Safety Science, 72, 329-336. doi:https://doi.org/10.1016/j.ssci.2014.10.005
- 
                                                        Sánchez, A., Riesgo Fernández, P., Sánchez-Lasheras, F., de Cos Juez, F., ve Garcia Nieto, P. J. (2011). Prediction Of Work-Related Accidents According To Working Conditions Using Support Vector Machines. Applied Mathematics and Computation, 218, 3539-3552. doi:10.1016/j.amc.2011.08.100
- 
                                                        Sanmiquel, L., Rossell, J. M., ve Vintró, C. (2015). Study Of Spanish Mining Accidents Using Data Mining Techniques. Safety Science, 75, 49-55. doi:https://doi.org/10.1016/j.ssci.2015.01.016
- 
                                                        Santos, K., Dias, J. P., ve Amado, C. (2022). A Literature Review Of Machine Learning Algorithms For Crash Injury Severity Prediction. Journal of Safety Research, 80, 254-269. doi:https://doi.org/10.1016/j.jsr.2021.12.007
- 
                                                        Shanthi, S., ve Ramani, R. G. (2012). Feature Relevance Analysis And Classification Of Road Traffic Accident Data Through Data Mining Techniques. Proceedings of The World Congress on Engineering and Computer Science, 1, 24-26.
- 
                                                        Shao, B., Hu, Z., Liu, Q., Chen, S., ve He, W. (2019). Fatal Accident Patterns Of Building Construction Activities In China. Safety Science, 111, 253-263. doi:https://doi.org/10.1016/j.ssci.2018.07.019
- 
                                                        Siddiqui, C., Abdel-Aty, M., ve Huang, H. (2012). Aggregate Nonparametric Safety Analysis Of Traffic Zones. Accident Analysis & Prevention, 45, 317-325. doi:https://doi.org/10.1016/j.aap.2011.07.019
- 
                                                        SGK, (2017). SGK İstatistik Yıllıkları.
- 
                                                        Strobl, C., Boulesteix, A. L., Kneib, T., Augustin, T., & Zeileis, A. (2008). Conditional Variable Importance For Random Forests. BMC Bioinformatics, 9. doi:10.1186/1471-2105-9-307
- 
                                                        Tang, J., Liang, J., Han, C., Li, Z., ve Huang, H. (2019). Crash Injury Severity Analysis Using A Two-Layer Stacking Framework. Accident Analysis & Prevention, 122, 226-238. doi:https://doi.org/10.1016/j.aap.2018.10.016
- 
                                                        Tixier, A. J. P., Hallowell, M. R., Rajagopalan, B., ve Bowman, D. (2016). Application Of Machine Learning To Construction Injury Prediction. Automation in Construction, 69, 102-114. doi:https://doi.org/10.1016/j.autcon.2016.05.016
- 
                                                        Umer, M., Sadiq, S., Ishaq, A., Ullah, D. S., Saher, N., ve Madni, H. (2020). Comparison Analysis of Tree Based and Ensembled Regression Algorithms for Traffic Accident Severity Prediction.
- 
                                                        Usman, T., Fu, L., ve Miranda-Moreno, L. F. (2016). Injury Severity Analysis: Comparison Of Multilevel Logistic Regression Models And Effects Of Collision Data Aggregation. Journal of Modern Transportation, 24(1), 73-87. doi:10.1007/s40534-016-0096-4
- 
                                                        Vu, L., Ng, K., Richter, A., ve An, C. (2022). Analysis Of Input Set Characteristics And Variances On K-Fold Cross Validation For A Recurrent Neural Network Model On Waste Disposal Rate Estimation. Journal of Environmental Management, 311, 114869. doi:10.1016/j.jenvman.2022.114869
- 
                                                        Wang, J., Liu, B., Fu, T., Liu, S., ve Stipancic, J. (2019). Modeling When And Where A Secondary Accident Occurs. Accident Analysis & Prevention, 130, 160-166. doi:https://doi.org/10.1016/j.aap.2018.01.024
- 
                                                        Wang, Z., Lai, C., Chen, X., Yang, B., Zhao, S., ve Bai, X. (2015). Flood Hazard Risk Assessment Model Based On Random Forest. Journal of Hydrology, 527, 1130-1141. doi:https://doi.org/10.1016/j.jhydrol.2015.06.008
- 
                                                        Weng, J., Meng, Q., ve Wang, D. Z. W. (2012). Tree-Based Logistic Regression Approach for Work Zone Casualty Risk Assessment. Risk analysis: an official publication of the Society for Risk Analysis, 33. doi:10.1111/j.1539-6924.2012.01879.x
- 
                                                        Veziroğlu, E. , Pacal, I. & Coşkunçay, A. (2023). Derin Evrişimli Sinir Ağları Kullanılarak Pirinç Hastalıklarının Sınıflandırılması . Journal of the Institute of Science and Technology , 13 (2) , 792-814 . DOI: 10.21597/jist.1265769
- 
                                                        Yağımlı, M., ve İzci, F. (2017). Türkiye’de Makine ve Teçhizatı Hariç Fabrikasyon Metal Ürünleri İmalatı Sektöründe Yaşanan İş Kazaları ve Ölümlü İş Kazası Sayılarının Tahmini. Karaelmas İş Sağlığı ve Güvenliği Dergisi, 1, 9-15. doi:10.33720/kisgd.322546
- 
                                                        Yan, X., Radwan, E., ve Abdel-Aty, M. (2005). Characteristics Of Rear-End Accidents At Signalized Intersections Using Multiple Logistic Regression Model. Accident Analysis & Prevention, 37(6), 983-995. doi:https://doi.org/10.1016/j.aap.2005.05.001
- 
                                                        Yannis, G., Papadimitriou, E., Dupont, E., ve Martensen, H. (2010). Estimation of Fatality and Injury Risk by Means of In-Depth Fatal Accident Investigation Data. Traffic Injury Prevention, 11, 492-502. doi:10.1080/15389588.2010.492536
- 
                                                        Yeoum, S., ve Lee, Y. (2013). A Study On Prediction Modeling Of Korea Millitary Aircraft Accident Occurrence. The International Journal of Industrial Engineering: Theory, Applications and Practice, 20, 562-573.
- 
                                                        Yi, W., Chan, A. P. C., Wang, X., ve Wang, J. (2016). Development Of An Early-Warning System For Site Work In Hot And Humid Environments: A Case Study. Automation in Construction, 62, 101-113. doi:https://doi.org/10.1016/j.autcon.2015.11.003
- 
                                                        Zhang, J., Li, Z., Pu, Z., ve Xu, C. (2018). Comparing Prediction Performance for Crash Injury Severity Among Various Machine Learning and Statistical Methods. IEEE Access, 6, 60079-60087. doi:10.1109/ACCESS.2018.2874979
- 
                                                        Zhen, X., Ning, Y., Du, W., ve Huang, Y. (2023). An interpretable and augmented machine-learning approach for causation analysis of major accident indicators in the offshore petroleum Industry. Process Safety and Environmental Protection. doi:https://doi.org/10.1016/j.psep.2023.03.063