Research Article

Prediction of Pneumoconiosis Diagnosis with Clinical Data: Performance Analysis of Random Forest and Logistic Regression

Year 2026, Volume: 9, Issue: 1, pages 12-25, 31.03.2026
https://doi.org/10.55517/mrr.1743958
https://izlik.org/JA68TB33RA

Abstract

Pneumoconiosis is a chronic, irreversible, yet preventable lung disease that develops as a result of occupational exposure to inorganic dust. Its diagnosis is based on occupational exposure history, clinical evaluation, and radiological findings; pathological confirmation is rarely required. Data-driven approaches are becoming increasingly important for early diagnosis and risk estimation.

Objective: To develop machine learning models based on clinical and occupational data that predict the diagnosis of pneumoconiosis, and to compare the classification performance of these models.

Methods: The study used a dataset that included variables such as age, years of employment in the sector, smoking status, radiological imaging results, and previous disease history. Classification models were built with the Logistic Regression and Random Forest (RF) algorithms, and their performance was evaluated with metrics such as area under the ROC curve (AUC), sensitivity, specificity, and F1-score.

Results: The Random Forest model performed better, with a test accuracy of 77.7%, an AUC of 0.868, and an F1-score of 0.816. The Logistic Regression model scored lower (75% accuracy, an AUC of 0.811, and an F1-score of 0.80) but offers high interpretability. According to the variable importance values, high-resolution computed tomography (HRCT) findings, age, duration of employment, smoking pack-years, dust exposure (including silica), dyspnea, cough, gender, and sputum emerged as the most influential predictors of the disease.

Conclusion: Machine learning algorithms offer alternative tools for the early diagnosis and risk assessment of occupational lung diseases such as pneumoconiosis; in particular, the Random Forest model shows high classification performance.
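The evaluation metrics named in the Methods can be made concrete with a small, library-free sketch. The Python below is an illustration only, not the authors' code: the function names (`confusion_counts`, `classification_metrics`, `roc_auc`), the 0.5 decision threshold, and the eight toy labels and scores are all assumptions chosen for the example, and none of the numbers come from the study's dataset. Sensitivity, specificity, and F1 are derived from the confusion matrix, and AUC is computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 means disease present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Accuracy, sensitivity (recall), specificity, and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not study data).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

print(classification_metrics(toy_labels, toy_preds))
print("AUC:", roc_auc(toy_labels, toy_scores))
```

On this toy input the confusion matrix has three true positives, three true negatives, one false positive, and one false negative, so accuracy, sensitivity, specificity, and F1 all equal 0.75, while 13 of the 16 positive/negative pairs are ranked correctly, giving an AUC of 0.8125. AUC uses the continuous scores, whereas the other metrics depend on the chosen threshold.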

References

  • Qi XM, Luo Y, Song MY, Liu Y, Shu T, Liu Y, et al. Pneumoconiosis: current status and future prospects. Chin Med J (Engl). 2021 Apr;134(8):898-907.
  • Akira M, Suganuma N. Imaging diagnosis of pneumoconiosis with predominant nodular pattern: HRCT and pathologic findings. Clin Imaging. 2023 May;97:28-33.
  • Cullinan P, Reid P. Pneumoconiosis. Prim Care Respir J. 2013 Jun;22(2):249-52.
  • Cohen RA, Petsonk EL, Rose C, Young B, Regier M, Najmuddin A, et al. Lung pathology in U.S. coal workers with rapidly progressive pneumoconiosis implicates silica and silicates. Am J Respir Crit Care Med. 2016 Mar;193(6):673-80.
  • Su X, Kong X, Yu X, Zhang X. Incidence and influencing factors of occupational pneumoconiosis: a systematic review and meta-analysis. BMJ Open. 2023;13:e065114.
  • Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019 Jan;25(1):44-56.
  • França RP, Borges Monteiro AC, Arthur R, Iano Y. An overview of deep learning in big data, image, and signal processing in the modern digital age. In: Piuri V, Raj S, Genovese A, Srivastava R, editors. Trends in Deep Learning Methodologies. Academic Press; 2021. p. 63-87.
  • Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, et al. A guide to deep learning in healthcare. Nat Med. 2019 Jan;25(1):24-9.
  • Rajkomar A, Dean J, Kohane I. Machine Learning in Medicine. N Engl J Med. 2019 Apr;380(14):1347-58.
  • Athey S, Tibshirani J, Wager S. Generalized random forests. Ann Stat. 2019 Apr;47(2):1148-78.
  • Chen X, Yin W, Wu J, Luo Y, Wu J, Li G, et al. A nomogram for predicting lung-related diseases among construction workers in Wuhan, China. Front Public Health. 2022 Dec;10:1032188.
  • Acito F. Logistic regression. In: Predictive Analytics with KNIME. Springer, Cham. 2023. p. 125-67.
  • Azen R, Traxel N. Using dominance analysis to determine predictor importance in logistic regression. J Educ Behav Stat. 2009 Jun;34(3):319-47.
  • Çorbacıoğlu ŞK, Aksel G. Receiver operating characteristic curve analysis in diagnostic accuracy studies: A guide to interpreting the area under the curve value. Turk J Emerg Med. 2023 Oct 3;23(4):195-8.
  • Wang H, Meng R, Wang X, Si Z, Zhao Z, Lu H, et al. Development and internal validation of risk assessment models for chronic obstructive pulmonary disease in coal workers. Int J Environ Res Public Health. 2023 Feb 18;20(4):3655.
  • Miller T. Explanation in artificial intelligence: insights from the social sciences. Artif Intell. 2019 Feb;267:1-38.
  • Canbay P, Demircioğlu Z. Endüstri 5.0’a doğru: zeki otonom sistemlerde etik ve ahlaki sorumluluklar. AJIT-E Acad J Inf Technol. 2021 May;12(45):106-23.

Pnömokonyoz Tanısında Klinik Verilerle Öngörü: Random Forest ve Lojistik Regresyonun Performans Analizi

Abstract

Pneumoconiosis is a chronic lung disease that develops mainly through occupational exposure to inorganic dusts; it is preventable but follows an irreversible course. The diagnosis rests on occupational exposure history, clinical evaluation, and radiological findings, and pathological confirmation is rarely needed. Data-driven approaches are gaining importance for early diagnosis and risk estimation.

Objective: To develop machine learning models based on clinical and occupational data for predicting the diagnosis of pneumoconiosis, and to compare the classification performance of these models.

Methods: The study used a dataset containing variables such as age, years worked in the sector, smoking, radiological imaging results, and previous disease history. Classification models were built with the logistic regression and random forest (RF) algorithms, and their performance was evaluated with metrics such as area under the ROC curve (AUC), sensitivity, specificity, and F1 score.

Results: The random forest model performed better, with 77.7% test accuracy, an AUC of 0.868, and an F1 score of 0.816. The logistic regression model produced lower but highly interpretable results, with 75% accuracy, an AUC of 0.811, and an F1 score of 0.80. According to the variable importance values, HRCT findings, age and duration of employment in the sector, smoking pack-years, dust including silica, dyspnea, cough, gender, and sputum stood out as the most influential predictors of the disease.

Conclusion: Machine learning algorithms offer alternative tools for the early diagnosis and risk assessment of occupational lung diseases such as pneumoconiosis; the random forest model in particular shows high classification performance.
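As a rough illustration of the kind of comparison the abstract describes, the sketch below fits a logistic regression and a random forest with scikit-learn and reports accuracy, AUC, and F1 on a held-out test split. Everything specific in it is an assumption made for the example: the data are synthetic (generated with `make_classification`), the nine anonymous features merely stand in for clinical and occupational variables, and the split ratio, hyperparameters, and random seeds are not taken from the paper. The impurity-based `feature_importances_` ranking is one common way to obtain variable importance from a random forest; the abstract does not state which method the authors used. The output of this sketch says nothing about the study's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a clinical dataset (assumption: 600 cases, 9 features).
X, y = make_classification(
    n_samples=600, n_features=9, n_informative=5, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters below are illustrative defaults, not the authors' settings.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)               # hard class labels
    proba = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "auc": roc_auc_score(y_test, proba),
        "f1": f1_score(y_test, pred),
    }

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})

# Rank features by the forest's impurity-based importance (values sum to 1).
importances = models["random_forest"].feature_importances_
ranking = sorted(range(X.shape[1]), key=lambda i: importances[i], reverse=True)
print("features ranked by importance:", [f"feature_{i}" for i in ranking])
```

Logistic regression coefficients can be read directly as log-odds effects, which is the interpretability advantage the abstract mentions; the random forest trades that for the ability to capture non-linear effects and interactions.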

There are 17 references in total.

Details

Primary Language: Turkish
Subjects: Occupational and Work-Related Diseases
Section: Research Article
Authors

Deniz Boz Eravcı 0000-0002-8336-5501

Mehmet Erdem Alagüney 0000-0001-7380-0250

Submission Date: July 16, 2025
Acceptance Date: February 23, 2026
Publication Date: March 31, 2026
DOI: https://doi.org/10.55517/mrr.1743958
IZ: https://izlik.org/JA68TB33RA
Published in Issue: Year 2026, Volume: 9, Issue: 1

Cite

Vancouver: 1. Deniz Boz Eravcı, Mehmet Erdem Alagüney. Pnömokonyoz Tanısında Klinik Verilerle Öngörü: Random Forest ve Lojistik Regresyonun Performans Analizi. MRR. 01 March 2026;9(1):12-25. doi:10.55517/mrr.1743958