Research Article

Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction

Year 2025, Volume: 14 Issue: 4, 245 - 253, 30.12.2025
https://doi.org/10.46810/tdfd.1800883

Abstract

This study addresses weighted-average ensemble prediction of pIC50 values for human dihydroorotate dehydrogenase (hDHODH) inhibitors using hybrid molecular fingerprints. Querying the ChEMBL database for IC50 data yielded a diverse set of 1585 molecules, whose IC50 values were converted to pIC50 values for model development. We estimated pIC50 as the weighted average (W.Avg) of the predictions of Light Gradient Boosting Machine (LGBM), Bootstrap Aggregating (Bagging), and Random Forest (RF) models. Model performance was evaluated using 5x3 repeated K-fold cross-validation (CV), with the coefficient of determination (R²), root mean square error (RMSE), and mean squared error (MSE) as performance metrics. The W.Avg ensemble outperformed the individual models and all other baseline models, with R²=0.8266, RMSE=0.6568, and MSE=0.4337. Paired t-test results indicate that the W.Avg model is statistically significantly better than the other models in terms of R², RMSE, and MSE (p < 0.05). This ensemble-based method can accelerate hDHODH inhibitor discovery by reducing screening time and increasing predictive accuracy.
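The arithmetic behind the abstract can be sketched in a few lines of plain Python: converting IC50 to pIC50, combining three models' predictions by a weighted average, and scoring the result with R², RMSE, and MSE. This is a minimal sketch, not the authors' implementation. It assumes IC50 values are given in nanomolar, so that pIC50 = 9 - log10(IC50). The function names, the toy predictions, and the weights are all hypothetical; the abstract does not report how the weights were chosen.

```python
import math


def ic50_nm_to_pic50(ic50_nm):
    """pIC50 = -log10(IC50 in mol/L); for IC50 in nM this equals 9 - log10(IC50)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def weighted_average(model_predictions, weights):
    """Combine per-model prediction lists into one list, normalizing the weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(model_predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, model_predictions)) / total
        for i in range(n_samples)
    ]


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy, made-up numbers for illustration only (not data from the study).
y_true = [ic50_nm_to_pic50(v) for v in (10.0, 100.0, 1000.0, 10000.0)]
lgbm_pred = [7.8, 7.1, 6.2, 5.3]      # hypothetical LGBM predictions
bagging_pred = [7.6, 6.8, 5.9, 5.2]   # hypothetical Bagging predictions
rf_pred = [7.9, 7.2, 6.1, 4.8]        # hypothetical RF predictions
weights = [0.5, 0.2, 0.3]             # hypothetical weights, not from the abstract

ensemble_pred = weighted_average([lgbm_pred, bagging_pred, rf_pred], weights)
print("R2:", r2(y_true, ensemble_pred))
print("RMSE:", rmse(y_true, ensemble_pred))
print("MSE:", mse(y_true, ensemble_pred))
```

In a repeated K-fold setting, the same three metrics would be computed on each held-out fold and then averaged across folds and repeats.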

Details

Primary Language: English
Subjects: Information Systems (Other), Biomedical Engineering (Other)
Section: Research Article
Authors

Eyüp Sıramkaya 0000-0002-6011-7302

Sema Atasever 0000-0002-2295-7917

Submission Date: October 10, 2025
Acceptance Date: December 8, 2025
Publication Date: December 30, 2025
Published in Issue: Year 2025, Volume: 14, Issue: 4

Cite

APA Sıramkaya, E., & Atasever, S. (2025). Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction. Türk Doğa ve Fen Dergisi, 14(4), 245-253. https://doi.org/10.46810/tdfd.1800883
AMA Sıramkaya E, Atasever S. Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction. TDFD. December 2025;14(4):245-253. doi:10.46810/tdfd.1800883
Chicago Sıramkaya, Eyüp, and Sema Atasever. “Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction”. Türk Doğa ve Fen Dergisi 14, no. 4 (December 2025): 245-53. https://doi.org/10.46810/tdfd.1800883.
EndNote Sıramkaya E, Atasever S (01 December 2025) Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction. Türk Doğa ve Fen Dergisi 14 4 245–253.
IEEE E. Sıramkaya and S. Atasever, “Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction”, TDFD, vol. 14, no. 4, pp. 245–253, 2025, doi: 10.46810/tdfd.1800883.
ISNAD Sıramkaya, Eyüp - Atasever, Sema. “Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction”. Türk Doğa ve Fen Dergisi 14/4 (December 2025), 245-253. https://doi.org/10.46810/tdfd.1800883.
JAMA Sıramkaya E, Atasever S. Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction. TDFD. 2025;14:245–253.
MLA Sıramkaya, Eyüp and Sema Atasever. “Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction”. Türk Doğa ve Fen Dergisi, vol. 14, no. 4, 2025, pp. 245-53, doi:10.46810/tdfd.1800883.
Vancouver Sıramkaya E, Atasever S. Prediction of Human Dihydroorotate Dehydrogenase Inhibitor Activity by a Weighted Average Ensemble-Based Prediction. TDFD. 2025;14(4):245-53.