Pneumonia is a dangerous infectious disease that can affect people at any age, and severe cases can be fatal. Physicians usually diagnose pneumonia by examining chest X-rays. In this work, a pneumonia diagnosis system was developed using publicly available chest X-ray images. Vision Transformer (ViT) and other deep learning models were used to extract features from these images. ViT is an attention-based model for image processing and understanding, offered as an alternative to the convolutional neural networks traditionally used for this purpose. It consists of a series of attention layers, each of which models the relationships between image patches (rather than individual pixels) to build a representation of the image. These relationships are computed by a set of attention heads, and the resulting representation is fed into a classifier. ViT performs effectively on a variety of visual tasks, especially when trained on large datasets. In this study, the ViT model achieved a classification success rate of 95.67%. These results highlight how deep learning models can be used to quickly and accurately diagnose dangerous diseases such as pneumonia in their early stages. The study also reports that the ViT model outperforms existing approaches in the biomedical field.
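
The attention mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a tiny random 8x8 grayscale array as a stand-in for a chest X-ray, a single attention head, and random (untrained) projection matrices. All function names (`split_into_patches`, `self_attention`, `random_matrix`) are hypothetical helpers introduced only for this illustration. A full ViT would additionally use a learned patch embedding, position embeddings, a class token, multiple heads, and feed-forward layers.

```python
import math
import random

def split_into_patches(image, patch):
    """Split a square grayscale image (list of rows) into flattened patch vectors."""
    size = len(image)
    assert size % patch == 0, "image side must be divisible by the patch size"
    patches = []
    for top in range(0, size, patch):
        for left in range(0, size, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches

def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over patch tokens."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)  # pairwise patch-to-patch similarities
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights

def random_matrix(rows, cols, rng):
    """Untrained projection matrix; a real model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
# Assumption: an 8x8 random array stands in for a preprocessed X-ray image.
image = [[rng.random() for _ in range(8)] for _ in range(8)]
patches = split_into_patches(image, patch=4)  # 4 patches of 16 values each
dim = 6  # illustrative projection size, not taken from the paper
w_q, w_k, w_v = (random_matrix(16, dim, rng) for _ in range(3))
attended, weights = self_attention(patches, w_q, w_k, w_v)
# Mean-pool the attended tokens; a real model would pass this to a classifier head.
pooled = [sum(col) / len(attended) for col in zip(*attended)]
```

Each row of `weights` describes how strongly one patch attends to every other patch, which is the sense in which an attention layer "models the relationships" within an image. The pooled vector plays the role of the feature representation that a classifier would consume.
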
Primary Language | English |
---|---|
Subjects | Artificial Intelligence (Other) |
Journal Section | Engineering Practice and Education |
Authors | |
Early Pub Date | June 14, 2024 |
Publication Date | June 29, 2024 |
Submission Date | April 4, 2024 |
Acceptance Date | May 6, 2024 |
Published in Issue | Year 2024 |