Analysing Environmental Efficiency in AI for X-Ray Diagnosis
Abstract
The integration of AI tools into medical applications has aimed to improve the efficiency of diagnosis. The emergence of large language models (LLMs), such as ChatGPT and Claude, has expanded this integration further, despite concerns about their environmental impact. Because LLMs are versatile and easy to use through APIs, these larger models are often utilised even where smaller, custom models could be used instead. In this paper, LLMs and small discriminative models are integrated into a Mendix application to detect Covid-19 in chest X-rays. The discriminative models are also used to provide knowledge bases for the LLMs to improve accuracy. The result is a benchmark study of 14 model configurations, compared on diagnostic accuracy and environmental impact. The findings indicate that while smaller models reduced the carbon footprint of the application, their output was biased towards a positive diagnosis and their output probabilities lacked confidence. Meanwhile, restricting LLMs to probabilistic output only led to poor performance in both accuracy and carbon footprint, demonstrating the risk of using LLMs as a universal AI solution. Although using the smaller LLM GPT-4.1-Nano reduced the carbon footprint by 94.2% compared to the larger models, its footprint remained disproportionately high relative to the discriminative models; the most efficient solution was the Covid-Net model. Although Covid-Net had a larger carbon footprint than the other small models, its footprint was 99.9% lower than that of GPT-4.5-Preview, and it achieved an accuracy of 95.5%, the highest of all models examined. This paper contributes to knowledge by comparing generative and discriminative models for Covid-19 detection and by highlighting the environmental risk of using generative tools for classification tasks.
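The percentage comparisons in the abstract rest on two simple quantities: the relative reduction in carbon footprint between two model configurations, and classification accuracy. The following is a minimal sketch of that arithmetic, not the study's own measurement code. The function names and the emission figures in the usage lines are hypothetical placeholders chosen for illustration and are not taken from the paper's results.

```python
from typing import Sequence


def relative_reduction(baseline: float, candidate: float) -> float:
    """Return the percentage by which `candidate` is lower than `baseline`.

    Both arguments are carbon footprints in the same unit (for example
    grams of CO2-equivalent per diagnosis). The unit is an assumption here.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline * 100.0


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical placeholder footprints, NOT values reported in the paper.
large_llm_footprint = 1000.0
small_llm_footprint = 58.0
print(f"{relative_reduction(large_llm_footprint, small_llm_footprint):.1f}% lower")

# Hypothetical predictions (1 = Covid-19 positive, 0 = negative).
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

A relative measure like this hides absolute scale: a 94.2% reduction from a very large footprint can still leave a model far above a small discriminative model, which is the point the abstract makes about GPT-4.1-Nano.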
Details
Primary Language
English
Subjects
Bioinformatics, Artificial Intelligence (Other)
Journal Section
Research Article
Authors
Liam Kearns * (ORCID: 0009-0003-6644-2987)
United Kingdom
Publication Date
March 10, 2026
Submission Date
December 8, 2025
Acceptance Date
March 9, 2026
Published in Issue
Year 2026 Number: 10