Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) causes COVID-19, which has become a pandemic and threatens public health. The emergence of SARS-CoV-2 variants poses a significant challenge for determining the risk of infection, developing vaccines and antiviral agents, monitoring changes, and assessing the evolution of the virus. In this study, we propose a method for identifying SARS-CoV-2 variants in Turkey. To this end, the occurrences of the four nucleotides, A, C, T, and G, are counted over each whole genome sequence. Thus, each genome sequence of approximately 30,000 bp is represented by only four integers. After feature extraction, four classification methods, support vector machines (SVM), k-nearest neighbor (KNN), neural network, and decision tree, are employed to identify SARS-CoV-2 variants. Experiments are conducted on a dataset of 1403 genome sequences from Turkey belonging to the SARS-CoV-2 variants B.1.1.7 (Alpha), B.1.351 (Beta), P.1 (Gamma), and B.1.617 (Delta). The results show that the KNN classifier achieves, on average, an accuracy of 0.94, a precision of 0.81, a recall of 0.80, and an F-score of 0.80.
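The feature extraction and KNN step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nucleotide_counts` and `knn_predict`, the Euclidean distance, the value `k=3`, the handling of ambiguous bases, and the short toy sequences with their `variant_x` / `variant_y` labels are all assumptions made for illustration, and none of the data below comes from the study.

```python
from collections import Counter
import math

# Feature order follows the abstract's listing (A, C, T, G); any fixed order works.
NUCLEOTIDES = "ACTG"


def nucleotide_counts(sequence):
    """Return [count_A, count_C, count_T, count_G] for a genome string."""
    counts = Counter(sequence.upper())
    # Assumption: ambiguous symbols such as 'N' are simply ignored.
    return [counts[base] for base in NUCLEOTIDES]


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    # Assumption: plain Euclidean distance on the raw four counts.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: made-up 8-base strings, NOT real genomes or real variants.
toy_training_set = [
    ("AAAAACGT", "variant_x"),
    ("AAAACCGT", "variant_x"),
    ("AAAACGTT", "variant_x"),
    ("ACGGGGGT", "variant_y"),
    ("ACCGGGGT", "variant_y"),
    ("ACGGGGTT", "variant_y"),
]
train_features = [nucleotide_counts(seq) for seq, _ in toy_training_set]
train_labels = [label for _, label in toy_training_set]

if __name__ == "__main__":
    query = nucleotide_counts("AAAACGGT")
    print(query, knn_predict(train_features, train_labels, query))
```

With real data, each roughly 30,000-base genome would be reduced to one such four-element vector before classification. Whether the counts are normalized or scaled first is not stated in the abstract, so the sketch leaves them raw.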
SARS-CoV-2 variants, COVID-19, Coronavirus, Classifiers, COVID-19 variants in Turkey
Primary Language | English |
---|---|
Subjects | Engineering |
Section | Research Article |
Authors | |
Publication Date | June 30, 2022 |
Submission Date | April 8, 2022 |
Acceptance Date | April 25, 2022 |
Published in Issue | Year 2022 Volume: 6 Issue: 1 |