Research Article

Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods

Year 2025, Volume: 6 Issue: 1, 1 - 14, 19.06.2025
https://doi.org/10.55546/jmm.1571384

Abstract

Radiology reports are essential for clinical decision-making and diagnosis, and they contain complex, detailed information. However, their unstructured nature makes efficient processing and analysis challenging, which increases the workload of healthcare professionals and slows down clinical workflows. Natural Language Processing (NLP) techniques address this by extracting meaningful information from such texts, reducing expert workload and expediting decision-making. This study focuses on Named Entity Recognition (NER) in chest radiology reports using the RadGraph dataset, which is annotated with four tag types. The objective is to compare the performance of two NLP models, BERT (Bidirectional Encoder Representations from Transformers) and LSTM (Long Short-Term Memory), to identify the most suitable approach for clinical data. Various training parameters, including learning rate, optimization algorithm, and input size, were tuned to improve model performance. To address the class imbalance in the dataset, data augmentation techniques were applied, and both models were fine-tuned. The results showed that BERT, leveraging its attention mechanism, was better at identifying complex terms and entities and outperformed LSTM in accuracy, precision, recall, and F1 score. While LSTM effectively captured long-term dependencies, it required longer training times. This research highlights the potential of NLP for automating the extraction of clinical entities from radiology reports. It provides insights for optimizing models and developing clinical decision support systems, with the ultimate aim of making healthcare workflows more efficient.
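To make the reported metrics concrete, the following is a minimal sketch of how token-level precision, recall, and F1 might be computed per tag for an NER model. It is not the authors' evaluation code. The function name `per_tag_scores`, the toy `gold`/`pred` sequences, and the tag strings (borrowed from the RadGraph entity labels as commonly described, plus an assumed `O` tag for non-entity tokens) are illustrative assumptions.

```python
from collections import Counter


def per_tag_scores(gold, pred, ignore="O"):
    """Token-level precision, recall, and F1 per tag, plus micro-averaged totals.

    `ignore` is the (assumed) tag for non-entity tokens; it is never scored.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            if g != ignore:
                tp[g] += 1          # correct entity tag
        else:
            if p != ignore:
                fp[p] += 1          # predicted a tag that is not in the gold data
            if g != ignore:
                fn[g] += 1          # missed the gold tag

    def prf(t, f_pos, f_neg):
        precision = t / (t + f_pos) if t + f_pos else 0.0
        recall = t / (t + f_neg) if t + f_neg else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        return {"precision": precision, "recall": recall, "f1": f1}

    tags = sorted(set(tp) | set(fp) | set(fn))
    scores = {tag: prf(tp[tag], fp[tag], fn[tag]) for tag in tags}
    scores["micro"] = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return scores


# Toy example (hypothetical data, not taken from the study).
gold = ["OBS-DP", "ANAT-DP", "O", "OBS-DA", "ANAT-DP", "O"]
pred = ["OBS-DP", "ANAT-DP", "O", "OBS-DP", "O", "O"]
scores = per_tag_scores(gold, pred)
for tag, s in scores.items():
    print(tag, {k: round(v, 3) for k, v in s.items()})
```

Per-tag scores are worth reporting alongside the aggregate because, with an imbalanced tag distribution such as the one the abstract describes, a strong overall score can hide weak performance on rare tags.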

Supporting Institution

TÜBİTAK

Project Number

1649B022405236

Acknowledgements

This project was supported under application number 1649B022405236 within the scope of the TÜBİTAK 2210-C Priority Areas scholarship program.


NLP Yöntemleri Kullanılarak Göğüs Radyolojisi Raporlarından Klinik Varlıkların Çıkarılması (Extraction of Clinical Entities from Chest Radiology Reports Using NLP Methods)


Abstract (translated from the Turkish version)

Radiology reports are of great importance in clinical decision-making and diagnosis, but they are rich and complex, which makes it difficult to process and analyze their data efficiently. This increases the workload of healthcare professionals and slows down clinical workflows. Natural Language Processing (NLP) techniques offer valuable solutions by extracting and processing meaningful information from such texts, thereby reducing expert workload and speeding up decision-making.
In this study, we focus on Named Entity Recognition (NER) in chest radiology reports using the RadGraph dataset, which is annotated with four different tag types. The aim is to compare the performance of two models: BERT (Bidirectional Encoder Representations from Transformers) and LSTM (Long Short-Term Memory). Various training parameters, such as learning rate, optimization algorithm, and input size, were tested to find the most effective combination for clinical data.
The imbalance in the dataset's label distribution was addressed through data augmentation, and both models were fine-tuned. The results showed that BERT, with its attention mechanism, excelled at identifying complex terms and entities and outperformed LSTM in terms of accuracy, precision, recall, and F1 score. Although LSTM handled long-term dependencies well, it had longer training times.
This research underscores the effectiveness of NLP in automating entity extraction from medical texts and provides valuable insights for future model optimization and the development of clinical decision support systems.



Details

Primary Language: English
Subjects: Deep Learning, Machine Learning (Other), Biomedical Sciences and Technologies
Section: Research Articles
Authors

Uçman Ergün 0000-0002-9218-2192

Sedanur Orcin 0009-0007-4345-4984

Sezin Barın 0000-0002-0394-2779

Project Number: 1649B022405236
Early Pub Date: June 15, 2025
Publication Date: June 19, 2025
Submission Date: October 21, 2024
Acceptance Date: December 20, 2024
Published in Issue: Year 2025 Volume: 6 Issue: 1

Cite

APA Ergün, U., Orcin, S., & Barın, S. (2025). Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods. Journal of Materials and Mechatronics: A, 6(1), 1-14. https://doi.org/10.55546/jmm.1571384
AMA Ergün U, Orcin S, Barın S. Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods. J. Mater. Mechat. A. June 2025;6(1):1-14. doi:10.55546/jmm.1571384
Chicago Ergün, Uçman, Sedanur Orcin, and Sezin Barın. “Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods”. Journal of Materials and Mechatronics: A 6, no. 1 (June 2025): 1-14. https://doi.org/10.55546/jmm.1571384.
EndNote Ergün U, Orcin S, Barın S (June 1, 2025) Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods. Journal of Materials and Mechatronics: A 6 1 1–14.
IEEE U. Ergün, S. Orcin, and S. Barın, “Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods”, J. Mater. Mechat. A, vol. 6, no. 1, pp. 1–14, 2025, doi: 10.55546/jmm.1571384.
ISNAD Ergün, Uçman et al. “Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods”. Journal of Materials and Mechatronics: A 6/1 (June 2025), 1-14. https://doi.org/10.55546/jmm.1571384.
JAMA Ergün U, Orcin S, Barın S. Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods. J. Mater. Mechat. A. 2025;6:1–14.
MLA Ergün, Uçman et al. “Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods”. Journal of Materials and Mechatronics: A, vol. 6, no. 1, 2025, pp. 1-14, doi:10.55546/jmm.1571384.
Vancouver Ergün U, Orcin S, Barın S. Extraction Of Clinical Entities from Chest Radiology Reports Using NLP Methods. J. Mater. Mechat. A. 2025;6(1):1-14.