Heart disease is one of the leading causes of death worldwide, making early detection and diagnosis essential for effective treatment. Advances in machine learning (ML) and artificial intelligence (AI) have led to their growing use in medicine, particularly for detecting and predicting heart disease. As AI systems become more complex, it becomes important to distinguish abstracts generated by AI algorithms from those written by human experts. This study develops and evaluates ML approaches for distinguishing human-written heart disease abstracts from machine-generated ones (produced with ChatGPT and NLTK). The dataset comprises 15,000 abstracts: 5,000 written by humans, 5,000 reworded by ChatGPT, and 5,000 generated using NLTK. Natural Language Processing (NLP) techniques, including tokenization, stop-word removal, stemming, and lemmatization, were applied, and the text was then converted to numerical features using TF-IDF vectorization. Several ML models, including K-nearest neighbors (KNN), support vector machines (SVMs), logistic regression, random forest, and decision tree, were trained and compared on classification accuracy. The study highlights the potential of ML techniques for supporting transparency and reliability in AI-driven medical decision-making, especially in heart disease diagnosis.
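The pipeline described in the abstract (preprocessing, TF-IDF vectorization, then a classifier) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption for illustration and is not taken from the study: the four-sentence toy corpus and its labels, the short stop-word list, the naive plural stripping that stands in for real stemming or lemmatization, the smoothed IDF formula, and the helper names (`preprocess`, `fit_idf`, `tfidf_vector`, `knn_predict`, `predict`). KNN with cosine similarity is used only because it is the simplest of the listed models to write out by hand.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (assumption; the study's list is not given).
STOP_WORDS = {"a", "an", "and", "for", "in", "into", "of", "the",
              "this", "to", "was", "we", "were", "with"}

def preprocess(text):
    """Lowercase, tokenize on letters, drop stop words, strip a plural 's'."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    # Naive suffix stripping: a stand-in for real stemming/lemmatization.
    return [t[:-1] if len(t) > 3 and t.endswith("s") and not t.endswith("ss")
            else t for t in kept]

def fit_idf(docs_tokens):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}

def tfidf_vector(tokens, idf):
    """L2-normalized sparse TF-IDF vector; terms unseen in training are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def knn_predict(query_vec, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query_vec, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy corpus; the real dataset has 15,000 abstracts.
TRAIN = [
    ("Patients with coronary artery disease were followed for five years.", "human"),
    ("We enrolled patients with heart failure in a cohort.", "human"),
    ("This paper delves into the intricate landscape of cardiac health.", "ai"),
    ("This paper delves into the multifaceted realm of heart disease.", "ai"),
]

train_tokens = [preprocess(text) for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
idf = fit_idf(train_tokens)
train_vecs = [tfidf_vector(tokens, idf) for tokens in train_tokens]

def predict(text, k=3):
    """Classify one abstract-like string with the toy model above."""
    return knn_predict(tfidf_vector(preprocess(text), idf), train_vecs, train_labels, k)

print(predict("This study delves into the intricate realm of cardiac risk."))
print(predict("Patients with heart failure were followed in a cohort."))
```

On a corpus of realistic size, a library tokenizer and stemmer (for example from NLTK) and a library vectorizer and classifier would be the natural choice. The hand-written version is meant only to make each step of the described pipeline visible.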
Primary Language | English |
---|---|
Subjects | Natural Language Processing |
Journal Section | Original Research Articles |
Early Pub Date | September 1, 2025 |
Publication Date | October 8, 2025 |
Submission Date | June 30, 2025 |
Acceptance Date | August 11, 2025 |
Published in Issue | Year 2025 Volume: 8 Issue: 2 |