Research Article

Machine Learning Based Identification of LLM Generated Scientific Research Article Abstracts

Volume: 8 Number: 2 December 26, 2025

Abstract

Heart disease is one of the leading causes of death worldwide, making early detection and diagnosis essential for effective treatment. With advances in machine learning (ML) and artificial intelligence (AI), these technologies are increasingly applied in the medical field, particularly for detecting and predicting heart disease. As AI systems become more complex, it becomes important to distinguish abstracts generated by AI algorithms from those written by human experts. This study develops and assesses ML approaches for distinguishing human-written heart disease abstracts from AI-generated ones (ChatGPT and NLTK). Natural language processing (NLP) techniques, namely tokenization, stop word removal, stemming, and lemmatization, were applied to a dataset of 15,000 abstracts (5,000 written by humans, 5,000 reworded by ChatGPT, and 5,000 generated using NLTK). The text data were then transformed into numerical form using TF-IDF vectorization. Several ML models, including K-nearest neighbors (KNN), support vector machines (SVMs), logistic regression, random forest, and decision tree, were trained and tested for classification accuracy. This study highlights the significant potential of ML techniques in ensuring transparency and reliability in AI-driven medical decision-making, especially in the area of heart disease diagnosis.
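The pipeline outlined in the abstract (tokenization, stop word removal, TF-IDF vectorization, then a classifier) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' implementation: the six training sentences, the stop word list, the labels `human` and `llm`, and all function names are hypothetical and chosen only for illustration. Stemming and lemmatization are omitted for brevity, and of the models listed only KNN is shown, here with cosine similarity and k = 3 as assumed settings.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption; the study's list is not given).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "are",
              "for", "to", "with", "was", "were"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def fit_idf(token_lists):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (their dot product)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vectors, train_labels),
                    key=lambda pair: cosine(query, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus; the study itself uses 15,000 abstracts.
train_texts = [
    "We enrolled patients with heart failure and measured ejection fraction.",
    "Patients with coronary artery disease were followed for two years.",
    "We measured blood pressure in patients after cardiac surgery.",
    "This study delves into innovative approaches, highlighting crucial insights.",
    "This comprehensive study highlights crucial and innovative findings.",
    "Highlighting innovative insights, this comprehensive analysis delves deeper.",
]
train_labels = ["human", "human", "human", "llm", "llm", "llm"]

train_tokens = [tokenize(t) for t in train_texts]
idf = fit_idf(train_tokens)
train_vectors = [tfidf_vector(tokens, idf) for tokens in train_tokens]


def classify(text):
    """Vectorise one new abstract and label it with the KNN classifier."""
    return knn_predict(tfidf_vector(tokenize(text), idf), train_vectors, train_labels)


print(classify("This analysis delves into crucial findings"))
print(classify("Patients with heart disease were measured"))
```

On this toy corpus the two queries share vocabulary with only one class each, so the vote is unambiguous; real abstracts overlap far more, which is why the study compares several classifiers on held-out data.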

Details

Primary Language

English

Subjects

Natural Language Processing

Journal Section

Research Article

Early Pub Date

September 1, 2025

Publication Date

December 26, 2025

Submission Date

June 30, 2025

Acceptance Date

August 11, 2025

Published in Issue

Year 2025 Volume: 8 Number: 2

APA
Baştürk, B., & Onan, A. (2025). Machine Learning Based Identification of LLM Generated Scientific Research Article Abstracts. Scientific Journal of Mehmet Akif Ersoy University, 8(2), 57-70. https://doi.org/10.70030/sjmakeu.1730246