Research Article

An explainable prediction model for drug-induced interstitial pneumonitis

Year 2025, Volume: 29, Issue: 1, pp. 322-334

Abstract

Drug-induced interstitial pneumonitis (DIP) is an inflammation of the lung interstitium caused by the
pneumotoxic effects of pharmaceuticals. Diagnosis is challenging because clinical presentations are nonspecific and
testing options are limited. Identifying the risk of drug-related pneumonitis during the early phases of drug
development is therefore essential. This study aims to predict DIP using binary quantitative structure-toxicity
relationship (QSTR) models. The dataset comprised 468 active pharmaceutical ingredients (APIs). Five critical
molecular descriptors were selected, and four machine-learning (ML) algorithms were then applied to build prediction
models from these descriptors. The developed models were validated using internal 10-fold cross-validation and an
external test set. The Logistic Regression (LR) algorithm outperformed the other models, achieving 95.72% and 94.68%
accuracy in internal and external validation, respectively. Additionally, the individual effect of each descriptor on
the model output was determined using the SHapley Additive exPlanations (SHAP) approach. This analysis indicated that
the pneumonitis-inducing effects of drugs might predominantly be attributed to their atomic masses, polarizabilities,
van der Waals volumes, surface areas, and electronegativities. Beyond the strong model performance, the SHAP local
explanations can guide molecular modifications that reduce or avoid the risk of pneumonitis for each molecule in the
test set. By contributing to the drug safety profile, the current classification model can guide advanced
pneumotoxicity testing and reduce late-stage failures in drug development.
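
The workflow summarized above (a logistic regression classifier on five descriptors, 10-fold cross-validation, and per-descriptor SHAP attributions) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' pipeline: the data are synthetic, the weights in `true_w` and all function names are hypothetical, and the SHAP values use the closed form for a linear model under an assumed feature independence, phi_j = w_j * (x_j - mean(x_j)), expressed in log-odds units.

```python
import math
import random

random.seed(0)
N_DESCRIPTORS = 5  # the abstract reports five selected descriptors


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(w, b, x):
    # Linear score (log-odds) of one sample.
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_logistic(X, y, lr=0.1, epochs=200):
    # Plain batch gradient descent; illustrative, not tuned.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(logit(w, b, xi)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    hits = sum((logit(w, b, xi) >= 0) == (yi == 1) for xi, yi in zip(X, y))
    return hits / len(y)


def cross_val_accuracy(X, y, k=10):
    # k-fold cross-validation: each fold is held out once for scoring.
    folds = [list(range(i, len(X), k)) for i in range(k)]
    scores = []
    for fold in folds:
        held = set(fold)
        train = [i for i in range(len(X)) if i not in held]
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(w, b, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / k


def linear_shap(w, x, background):
    # Closed-form SHAP values for a linear model, assuming independent features.
    means = [sum(col) / len(col) for col in zip(*background)]
    return [wj * (xj - mj) for wj, xj, mj in zip(w, x, means)]


# Synthetic stand-in data (hypothetical, NOT the 468-API dataset).
true_w = [1.5, -2.0, 0.8, 0.0, 1.0]
X = [[random.gauss(0, 1) for _ in range(N_DESCRIPTORS)] for _ in range(200)]
y = [1 if sum(t * v for t, v in zip(true_w, x)) > 0 else 0 for x in X]

cv_acc = cross_val_accuracy(X, y, k=10)
w, b = fit_logistic(X, y)
phi = linear_shap(w, X[0], X)                    # local explanation for one sample
base = sum(logit(w, b, x) for x in X) / len(X)   # expected log-odds over the data

print("10-fold CV accuracy:", round(cv_acc, 3))
print("SHAP values (log-odds):", [round(p, 3) for p in phi])
```

In this sketch the attributions are additive: the values in `phi` sum to the difference between the sample's log-odds and the average log-odds `base`, which is the property that lets a local explanation be read descriptor by descriptor.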

Details

Primary Language: English
Subjects: Pharmaceutical Toxicology
Journal Section: Articles
Authors: Feyza Kelleci Çelik, Sezen Yılmaz Sarıaltın
Submission Date: December 1, 2024
Acceptance Date: December 17, 2024
Published in Issue: Year 2025, Volume: 29, Issue: 1

Cite

APA Kelleci Çelik, F., & Yılmaz Sarıaltın, S. (2025). An explainable prediction model for drug-induced interstitial pneumonitis. Journal of Research in Pharmacy, 29(1), 322-334.