Research Article

Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach

Volume: 27 Number: 80 May 23, 2025

Abstract

Heart disease is a global public health problem, and its extensive literature requires in-depth analysis to uncover specific themes and relationships. This study aimed to identify latent themes and compute coherence scores for 5,000 heart disease-related abstracts retrieved from PubMed using topic modeling techniques. The original abstracts were paraphrased using ChatGPT and NLTK (Natural Language Toolkit), followed by extensive preprocessing, including tokenization, stop word removal, stemming, and lemmatization. For effective feature extraction, the text data were vectorized using TF-IDF (term frequency-inverse document frequency). Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA), and Non-Negative Matrix Factorization (NMF) were applied to reveal key thematic structures. Coherence scores were calculated and compared across different numbers of topics (5 to 50) for each model and annotation method. This approach provides a methodology for summarizing large amounts of information, allowing researchers to navigate the complex landscape of heart disease literature efficiently and identify critical areas of focus. The findings are intended to improve understanding of heart disease and support future research in this area.
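As a rough illustration of two steps named in the abstract, TF-IDF vectorization and coherence scoring, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state which coherence measure was used, so the UMass document co-occurrence measure is assumed here, and the function names (`tokenize`, `tfidf`, `umass_coherence`), the stop word set, and the four-sentence toy corpus are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text, stop_words):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t and t not in stop_words]

def tfidf(docs_tokens):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    n_docs = len(docs_tokens)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs_tokens for term in set(doc))
    vectors = []
    for doc in docs_tokens:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors

def umass_coherence(top_words, docs_tokens):
    """UMass coherence (assumed measure): sum of log((D(wi, wj) + 1) / D(wj))."""
    doc_sets = [set(d) for d in docs_tokens]
    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            d_j = sum(1 for s in doc_sets if top_words[j] in s)
            d_ij = sum(
                1 for s in doc_sets if top_words[i] in s and top_words[j] in s
            )
            if d_j:  # skip words that never occur in the corpus
                score += math.log((d_ij + 1) / d_j)
    return score

# Toy corpus, invented for illustration only.
STOP_WORDS = {"the", "of", "and", "in", "often", "have", "use"}
docs = [
    "Hypertension increases the risk of heart failure.",
    "Heart failure patients often have hypertension.",
    "Statin therapy lowers cholesterol.",
    "Cholesterol and statin use in older adults.",
]
docs_tokens = [tokenize(d, STOP_WORDS) for d in docs]
vectors = tfidf(docs_tokens)

# Word lists standing in for the top terms of two candidate topics.
coherent_topic = ["heart", "failure", "hypertension"]
mixed_topic = ["heart", "statin", "failure"]
coherent_score = umass_coherence(coherent_topic, docs_tokens)
mixed_score = umass_coherence(mixed_topic, docs_tokens)
```

In this sketch the word list whose terms co-occur in the same documents receives the higher score, which is the property that makes coherence useful for comparing models across different numbers of topics. A full pipeline would take the top words from a fitted LDA, LSA, or NMF model rather than from hand-written lists.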

Details

Primary Language: English
Subjects: Performance Evaluation
Journal Section: Research Article
Early Pub Date: May 12, 2025
Publication Date: May 23, 2025
Submission Date: June 19, 2024
Acceptance Date: August 12, 2024
Published in Issue: Year 2025 Volume: 27 Number: 80

APA
Baştürk, B., & Onan, A. (2025). Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi, 27(80), 216-223. https://doi.org/10.21205/deufmd.2025278007
AMA
1.Baştürk B, Onan A. Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach. DEUFMD. 2025;27(80):216-223. doi:10.21205/deufmd.2025278007
Chicago
Baştürk, Burcu, and Aytuğ Onan. 2025. “Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi 27 (80): 216-23. https://doi.org/10.21205/deufmd.2025278007.
EndNote
Baştürk B, Onan A (May 1, 2025) Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 27 80 216–223.
IEEE
[1]B. Baştürk and A. Onan, “Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach”, DEUFMD, vol. 27, no. 80, pp. 216–223, May 2025, doi: 10.21205/deufmd.2025278007.
ISNAD
Baştürk, Burcu - Onan, Aytuğ. “Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 27/80 (May 1, 2025): 216-223. https://doi.org/10.21205/deufmd.2025278007.
JAMA
1.Baştürk B, Onan A. Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach. DEUFMD. 2025;27:216–223.
MLA
Baştürk, Burcu, and Aytuğ Onan. “Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi, vol. 27, no. 80, May 2025, pp. 216-23, doi:10.21205/deufmd.2025278007.
Vancouver
1.Burcu Baştürk, Aytuğ Onan. Discovering Latent Themes in Heart Disease Article Abstracts: A Topic Modeling Approach. DEUFMD. 2025 May 1;27(80):216-23. doi:10.21205/deufmd.2025278007

This journal is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
