Prediction of absorption dose of radiation on Thorax CT imaging in geriatric patients with COVID-19 by classification algorithms

Objective: The aim of the study is to predict the absorbed radiation dose on thorax CT imaging in geriatric patients with COVID-19. Materials and Method: A SIEMENS SENSATION 64 CT scanner was simulated with Monte Carlo methods using real patient protocols on male and female phantoms, the patients' real heights and weights, and the actual parameters of the CT scanner. Absorbed organ doses were calculated from these Monte Carlo results. These results were used to predict the optimal absorbed radiation dose by Artificial Neural Network, Linear Discriminant Analysis, Random Forest Classification, and Naive-Bayes Classification algorithms. The dose values were clustered by gender with the Fuzzy C-Means algorithm. Results: The ages of the patients were between 60 and 70 years. The Body Mass Indexes of male and female patients were 26.11±4.49 and 25.03±4.86 kg/m², respectively. All classification algorithms mentioned in the Materials and Method section were validated with approximately 100% success. The Fuzzy C-Means technique was found to be successful in clustering the dose values into gender clusters. Conclusion: While the predicted and the observed values of patients do not change in the organs/tissues around and outside of the thorax, they generally vary in the intra-thoracic organs and tissues. It can be concluded that data-driven techniques are useful for obtaining optimal radiation doses for organs/tissues in CT imaging.


Introduction
The diagnosis of COVID-19 relies heavily on radiological tests, particularly computed tomography (CT). A thorough and timely study of the role of radiology in combating COVID-19 remains necessary. Chest CT plays an important role in the early detection of lung infection and has become an important imaging modality for the early diagnosis of patients with COVID-19 pneumonia [1].
Chest CT may have higher sensitivity than repeated Reverse Transcription-Polymerase Chain Reaction (RT-PCR) testing for the diagnosis of COVID-19 because RT-PCR can be affected by low patient viral load and inappropriate clinical sampling. Bilateral, multifocal ground-glass opacities (GGO), patchy consolidation, and a peripheral, subpleural distribution, especially in the lower lobes, are typical radiological characteristics of COVID-19.
If there is a high clinical likelihood that COVID-19 is the cause, repeated RT-PCR testing and chest CT scanning may be combined to make the diagnosis. Chest CT can improve COVID-19 diagnosis sensitivity, but radiation exposure to patients should be minimized, especially for young children and pregnant women [2].
Radiology is very important in the management of patients in intensive care units. Portable chest X-rays are frequently utilized, although in some situations computed tomography and ultrasound are valuable diagnostic tools [3]. Like X-ray, tomography, and magnetic resonance, ultrasonography is an imaging method, but it works on the basis of echoes, and X-rays are not used in this examination. Ultrasonography enables the visualization of internal organs by using sound waves with a frequency too high for the human ear to hear. Ultrasonic imaging is a non-invasive, safe, and painless technique that uses sound waves to take pictures of the body [4]. The use of multiple imaging modalities on the same target is expanding in biomedical imaging as more sophisticated methods and tools become available. For instance, simultaneous acquisition of computed tomography (CT) and positron emission tomography (PET) has come to be accepted in clinical practice for a variety of applications. CT and magnetic resonance imaging (MRI) offer high contrast and spatial resolution details on anatomical structures, while functional imaging methods like PET lack anatomical detail but offer quantitative metabolic and functional information about illnesses; used together, they allow lesions to be characterized more accurately [5]. CT is similar to traditional X-ray radiography, except that the X-ray tube and detector rotate around the body part under investigation. The CT examination of the patient is monitored by the technician [6].
Unfortunately, the excessive use of radiological imaging methods exposes patients to radiation. Both the patient and the technicians should be protected from radiation as much as possible. Therefore, in recent years, many machine learning (ML) or artificial intelligence (AI) processes have been developed. Image recognition, object detection and tracking, automatic document analysis, computational photography, augmented reality, 3D reconstruction, and medical image processing are just a few of the computer vision problems that machine learning can handle. Recent advances in powerful computing and imaging technologies in the field of biomedical engineering have opened up new avenues for research, and the expanding volume of biomedical data necessitates the use of accurate machine learning-based data mining methods [7].
This study aims to predict the absorbed radiation dose on the thorax in geriatric patients with COVID-19. The predictive model was established by both supervised and unsupervised learning algorithms.

Monte Carlo Simulation
The measurements of the radiation dose and organ absorbed dose were based on a SIEMENS SENSATION 64-slice CT scanner. The scanner was modeled with Monte Carlo simulation techniques using real patient protocols on male and female phantoms, the patients' actual heights and weights, and the scanner's actual parameters [Figure 1]. Absorbed organ doses were computed from these Monte Carlo results. The Monte Carlo simulation of the CT scanner was based on the following parameters:
• source-to-iso-center distance 57 cm,
• total field size at iso-center 60.6 cm x 25 cm,
• beam width at iso-center 1.9 cm,
• scan length 25 cm.
Among the available X-ray tube settings and gantry rotation speeds, the medical imaging technician chose an X-ray tube voltage of 100 kVp and a tube current of 40 mAs [Figure 2]. The scanner also used the standard CT scan protocol, and tube current modulation during scanning was applied to the observations in this work. The CT protocol and the selected X-ray tube settings (such as kV, mAs, and scan length) are important determinants of radiation dose and image quality in CT examinations.
The Radiation Dosimetry Group of the Department of Nuclear Energy at the Federal University of Pernambuco, Brazil, developed the MASH and FASH phantoms that are employed in our work for the estimation of comparable doses to radiosensitive organs and tissues of the human body.

Artificial Neural Network
An Artificial Neural Network (ANN) can be defined as a machine learning predictive method designed to model the neural connections of the human brain. An ANN consists of artificial nerve cells connected with each other in various ways, usually arranged in layers. It can be realized in hardware with electronic circuits or in software on computers. An ANN is a parallel distributed processor that can store and generalize information after learning, following the brain's information processing technique [8]. Artificial neural networks are used to make very fast decisions under different conditions and to solve complex problems by means of simplified models. In an ANN, artificial neurons are simply clustered. The weighted inputs ($W_i X_i$) in the input layer send information to the hidden layer with bias ($b$) as follows [9]:

$$z = \sum_{i} W_i X_i + b$$

The output layer receives data from the hidden layer. Every neuron has a single output, weighted inputs (synapses), and an activation function. The synapses are adjustable parameters that turn a neural network into a parameterized system. Clustering is performed in layers, and these layers are then linked to each other. One such type of ANN is the Multilayer Perceptron (MLP), which is made up of three or more layers. It uses a nonlinear activation function (most often a hyperbolic tangent or logistic function) to separate data that are not linearly separable. Every node in one layer connects to every node in the next, which makes the network fully connected. Multilayer perceptrons are used, among other areas, in natural language processing (NLP) applications. Basically, an ANN can learn from data and work with a very large number of attributes [10].
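The forward pass described above can be illustrated with a minimal sketch. It assumes a single hidden layer with a hyperbolic tangent activation and one logistic output neuron; the function names and all weights are hypothetical values chosen only for illustration and are not taken from the study.

```python
import math

def logistic(z):
    """Logistic (sigmoid) activation."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=math.tanh):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def mlp_forward(x, hidden_layer, output_neuron):
    """hidden_layer is a list of (weights, bias); output_neuron is (weights, bias)."""
    hidden = [neuron(x, w, b) for w, b in hidden_layer]
    w_out, b_out = output_neuron
    return neuron(hidden, w_out, b_out, activation=logistic)

# Hypothetical weights for a network with 2 inputs and 2 hidden neurons.
hidden_layer = [([0.5, -0.2], 0.1), ([-0.3, 0.8], 0.0)]
output_neuron = ([1.0, -1.0], 0.0)
p = mlp_forward([1.0, 2.0], hidden_layer, output_neuron)  # class probability in (0, 1)
```

In a trained network the weights and biases would be fitted to data, for example by backpropagation, rather than fixed by hand as here.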

Linear Discriminant Analysis
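Linear Discriminant Analysis (LDA) is a method that allows the p attributes in the data set X to be divided into two or more real groups and newly observed units to be correctly assigned to the specified classes by means of discriminant functions. In the LDA approach, the conditional probability densities $P(X \mid Y=0)$ and $P(X \mid Y=1)$ are assumed to be normally distributed with mean and covariance parameters $(\mu_0, \Sigma_0)$ and $(\mu_1, \Sigma_1)$, respectively. An observation is assigned to the second class if the log-likelihood ratio is bigger than some threshold $T$, as follows:

$$(X-\mu_0)^{\top}\Sigma_0^{-1}(X-\mu_0) + \ln|\Sigma_0| - (X-\mu_1)^{\top}\Sigma_1^{-1}(X-\mu_1) - \ln|\Sigma_1| > T$$

Under the homoscedasticity assumption ($\Sigma_0 = \Sigma_1 = \Sigma$), the quadratic terms in $X$ cancel and the decision rule becomes linear in $X$.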
LDA is a suitable method for cases where performance is examined on randomly generated test sets and the class frequencies are not equal. The method seeks to maximize the ratio of two variances (between-class to within-class), as in the F distribution, so that maximum separation is achieved [11].
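A two-class discriminant under the shared-covariance assumption can be sketched as follows. This is a minimal illustration restricted to two features (so that the 2x2 covariance matrix can be inverted by hand) and to equal class priors; the function names and the toy data are hypothetical and unrelated to the study's dose measurements.

```python
def mean_vec(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_cov_2d(class0, class1, m0, m1):
    """Pooled (shared) 2x2 covariance matrix of both classes."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class0, m0), (class1, m1)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(class0) + len(class1) - 2
    return [[v / dof for v in row] for row in s]

def fit_lda(class0, class1):
    """Return the weight vector w and threshold c of the linear rule w.x > c."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    s = pooled_cov_2d(class0, class1, m0, m1)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(m0[0] + m1[0]) / 2.0, (m0[1] + m1[1]) / 2.0]
    c = w[0] * mid[0] + w[1] * mid[1]  # threshold for equal class priors
    return w, c

def predict(x, w, c):
    """Assign class 1 if the linear score exceeds the threshold, else class 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] > c else 0

# Hypothetical, well-separated toy data.
class0 = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.8), (2.2, 2.4)]
class1 = [(5.0, 6.0), (6.0, 5.5), (5.5, 6.5), (6.2, 5.1)]
w, c = fit_lda(class0, class1)
```

With unequal priors, the threshold would be shifted by the log ratio of the prior probabilities; that term is omitted in this sketch.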

Random Forest Classification
Random Forest was developed in 2001 by Leo Breiman. The random forest combines the bagging method with the random subspace method proposed by Tin Kam Ho. In the random subspace method, the variable that provides the most appropriate branching is chosen from a small number of variables randomly selected among all variables [12]. The RF method applies bagging in training. Given a training set $X$ with responses $Y$, the method works as follows. For $b = 1, \dots, B$, samples $(X_b, Y_b)$ are drawn with replacement from $X$ and $Y$, and a classification (or regression) tree $f_b$ is trained on $(X_b, Y_b)$. For an unseen sample $x'$, the prediction is obtained by averaging the individual trees (or, for classification, by majority vote):

$$\hat{f}(x') = \frac{1}{B}\sum_{b=1}^{B} f_b(x')$$
Random forests are a collection of tree-type classifiers in which each tree depends on the values of a random vector sampled independently and with the same distribution for all trees. The method can be considered an advanced form of bagging. Random forests are used in many classification applications [13].
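The combination of bagging and random feature selection can be sketched in a deliberately simplified form. The sketch below replaces full decision trees with one-split stumps and draws a single random feature per tree, so it is a toy stand-in for a random forest rather than Breiman's algorithm; all names and data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y, feature):
    """Find the threshold on one feature with the fewest training errors."""
    best = None
    for t in sorted({row[feature] for row in X}):
        left = [lab for row, lab in zip(X, y) if row[feature] <= t]
        right = [lab for row, lab in zip(X, y) if row[feature] > t]
        if not left or not right:
            continue
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = (sum(lab != l_lab for lab in left)
                  + sum(lab != r_lab for lab in right))
        if best is None or errors < best[0]:
            best = (errors, (feature, t, l_lab, r_lab))
    if best is None:  # no valid split: fall back to a constant predictor
        lab = Counter(y).most_common(1)[0][0]
        return (feature, float("inf"), lab, lab)
    return best[1]

def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (bagging)
        feature = rng.randrange(p)                  # random feature choice
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest

def predict(forest, x):
    """Majority vote over all stumps."""
    votes = [l_lab if x[f] <= t else r_lab for f, t, l_lab, r_lab in forest]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data with two well-separated classes.
X = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [2.5, 2.2],
     [7.0, 8.0], [8.0, 7.5], [7.5, 9.0], [9.0, 8.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
```

A full implementation would grow deeper trees and draw a fresh random subset of features at every split.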

Naive-Bayes Classification
The Naive-Bayes classifier is based on the Bayes theorem. It determines the probability with which each sample belongs to each class. In the Naive-Bayes classifier, all attributes are considered equally important and independent of each other, and the classifier chooses the class with the highest calculated probability [14]. Under the Bayes theorem, the conditional probability of class $C_k$ given the attributes $x = (x_1, \dots, x_n)$ can be written as:

$$P(C_k \mid x) = \frac{P(C_k)\,P(x \mid C_k)}{P(x)}$$

A Bayes classifier then combines this model with the maximum a posteriori decision rule, assigning the class label $\hat{y} = C_k$ for some $k$ as follows:

$$\hat{y} = \underset{k}{\arg\max}\; P(C_k)\prod_{i=1}^{n} P(x_i \mid C_k)$$

The Naive-Bayes classification algorithm is easy to understand and fast to train and use. It can be used for binary or multi-class classification. Although the independence assumption is generally unrealistic, Naive-Bayes mostly shows good classification performance. The method deals with missing values by leaving the sample out of the probability estimation calculations [15].
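The maximum a posteriori rule can be sketched for continuous attributes. The sketch assumes that each attribute is normally distributed within each class (Gaussian Naive-Bayes), which the text does not specify; the function names and toy data are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the prior, per-attribute means, and variances for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(cols, means)]
        model[label] = (n / len(X), means, variances)
    return model

def log_joint(model, x, label):
    """log P(C_k) + sum_i log P(x_i | C_k) under the independence assumption."""
    prior, means, variances = model[label]
    total = math.log(prior)
    for v, m, s2 in zip(x, means, variances):
        total += -0.5 * math.log(2.0 * math.pi * s2) - (v - m) ** 2 / (2.0 * s2)
    return total

def predict(model, x):
    """Maximum a posteriori class."""
    return max(model, key=lambda label: log_joint(model, x, label))

# Hypothetical toy data with two classes.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
```

Working with log probabilities avoids numerical underflow when many attributes are multiplied together.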

Fuzzy C-Means Clustering
The fuzzy c-means algorithm was first suggested by J. Bezdek et al. in 1984. The method is based on the fuzzy logic proposed by Zadeh in 1965. The fuzzy C-means (FCM) method was developed by making improvements to the k-means (KM) algorithm. In this technique, each data point can belong to multiple clusters with a degree of membership [16]. Like the KM method, the FCM method is based on the minimization of an objective (cost) function. Unlike hard clustering, in this method each data point is a member of a predetermined number of clusters with a membership degree between 0 and 1. The fuzziness constant m used in the cluster center and membership calculations is an important parameter affecting the result. The value of this parameter determines the maximum degree of fuzziness. As the value of the fuzziness constant approaches 1, the clustering result approaches hard clustering. Conversely, as the value of the fuzziness constant increases, the fuzziness of the partition increases, and the center values of different clusters converge. This parameter also affects the convergence speed of the algorithm: the higher its value, the slower the convergence [17]. The algorithm attempts to partition a finite collection of n elements X into a collection of c fuzzy clusters by minimizing an objective function over a partition matrix.
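The alternating update of memberships and cluster centers can be sketched as follows. This is a minimal illustration of the standard FCM update rules with Euclidean distance; the initial centers are passed in explicitly, and the function names, default values, and toy points are hypothetical.

```python
import math

def euclid(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def fuzzy_c_means(points, init_centers, m=2.0, n_iter=100, tol=1e-6):
    """Return (centers, memberships); m > 1 is the fuzziness constant."""
    centers = [list(ctr) for ctr in init_centers]
    c, dims = len(centers), len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        memberships = []
        for p in points:
            d = [max(euclid(p, ctr), 1e-12) for ctr in centers]
            memberships.append([
                1.0 / sum((d[j] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                for j in range(c)
            ])
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j in range(c):
            w = [u[j] ** m for u in memberships]
            new_centers.append([
                sum(wi * p[dim] for wi, p in zip(w, points)) / sum(w)
                for dim in range(dims)
            ])
        shift = max(euclid(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers no longer move
            break
    return centers, memberships

# Hypothetical toy points forming two groups.
points = [(1.0, 1.0), (1.5, 1.2), (0.8, 1.4), (8.0, 8.0), (8.5, 7.8), (7.9, 8.4)]
centers, memberships = fuzzy_c_means(points, init_centers=[(0.0, 0.0), (10.0, 10.0)])
```

Each row of `memberships` sums to 1, and values of `m` close to 1 push the memberships toward 0 or 1, which corresponds to hard clustering.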

Technical Statistical Analysis
The analyses of the study were performed with SPSS 20.0 (IBM Inc, Chicago, IL, USA) and JASP 0.14.1.0.

Results
Information on 9 male and 9 female patients with different heights and weights was included in the study. Eleven different age values between 60 and 70 years were selected for the various weights and heights. In total, a data set with measurements of 198 patients was created (male:female ratio 1:1). In male patients, weights were between 59.3 and 108 kg. There was a significant negative correlation between BMI and the air kerma, absorbed dose, and CTDI values of various organs. A moderate negative correlation was found between BMI and the oral mucosa, liver, salivary gland, and lymphatic node values (p<0.05), and a strong negative correlation was found with the breast glandular, lungs, muscle, esophagus, skin area, thymus, heart wall, skeleton average, RBM, and BSC absorbed dose and weighted dose values (p<0.01).
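The rank correlation used for the relationship between BMI and dose values can be sketched as follows. The sketch computes Spearman's rho as the Pearson correlation of average ranks, which handles ties; the function names are hypothetical and the numbers are invented toy values, not the study's measurements.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented toy values: dose decreasing monotonically with BMI.
bmi = [20.0, 22.0, 25.0, 28.0, 31.0]
dose = [5.0, 4.1, 3.9, 3.0, 2.2]
rho = spearman_rho(bmi, dose)
```

A value of rho close to -1 indicates a strong negative monotonic relationship of the kind reported above.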
The multilayer perceptron classification method was used in the artificial neural network analysis. In the model created for gender classes, 3 hidden layers were selected as optimal according to information criteria. The prediction model obtained was found to be quite successful. The predicted gender rates were calculated as 45.7% for the male patient class and 54.3% for the female patient class. According to ROC analysis, the area under the curve (AUC) was calculated as 0.984. Moreover, the AUC for BMI was 0.947, which showed that very successful predictive values were obtained. The sum of squared errors (SSE) was 0.019 and the relative error value was 0.003. The average of the differences between the predicted and the observed measurements of BMI was calculated as -0.0019. The most important variables in the prediction model were determined as BSC AD, muscle, extrathoracic airways, weighted dose, oral mucosa, thyroid, breast glandular, and salivary glands.
Similar results were obtained for the AK, AD, and CTDI measurements of gender in the classification analysis performed with the Naive-Bayes method. The classification error rate was 0% and the gain score was 30. AUC was calculated as 0.991. Class rates for male and female patients were determined as 50%. The same results were found for both training and testing sets. Among the measurements, breast glandular, skeleton average, lungs, muscle, oral mucosa, and extrathoracic airways were determined as the most important variables according to pseudo-BIC (Bayesian Information Criterion) values.
In the linear discriminant classification method, the predicted cluster ratio was calculated as 53.3% for male and 46.7% for female patients.Precision, recall, F1 score, and AUC values were calculated as 0.978.The lymphatic nodes, salivary glands, thymus, BSC AD, Weighted Dose, and heart wall among thorax organs and tissues had more significant discriminant values in classification.
In the Random Forest Classification method, the predicted rates for gender classes were calculated as 58% for male and 42% for female patients.Precision, recall, F1 score, and AUC values were calculated as 0.958.The most important variables in the prediction model were listed as lymphatic nodes, salivary glands, adrenals, oral mucosa, skeleton averages, extrathoracic airways, RBM AD, and thymus.
In all classification algorithms, the training, testing, and validation ratios were set to 70%, 18%, and 12%, respectively. However, the success of all methods was so high that it suggested an overfitting problem. Therefore, 10-fold cross-validation, one of the most popular techniques for assessing model accuracy, was applied to address this problem. After its application, the accuracy results decreased slightly, as expected. The most accurate result again belonged to the Naive-Bayes algorithm (97.4%). The results of the other methods were 91.2% for ANN, 88.7% for Linear Discriminant Analysis, and 83.5% for Random Forest classification.
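The k-fold procedure can be sketched as follows. This is a minimal illustration that assumes at least as many samples as folds and uses a hypothetical nearest-centroid classifier on one feature as a stand-in for the classifiers compared in the study; all names and data are invented.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(X, y, fit, predict, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k

def fit_centroids(X, y):
    """Stand-in classifier: the mean of the single feature for each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_centroid(model, x):
    """Assign the class whose mean is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))

# Invented one-feature toy data with two well-separated classes.
X = [float(v) for v in range(10)] + [float(v) for v in range(100, 110)]
y = [0] * 10 + [1] * 10
acc = cross_val_accuracy(X, y, fit_centroids, predict_centroid, k=10)
```

Because every sample is used for testing exactly once, the averaged score is less optimistic than the accuracy on a single train/test split.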
Finally, Fuzzy C-Means clustering analysis was used to test how many clusters would be formed by the measurement values and gender. According to the lowest BIC value, two clusters were judged suitable (Figure 3). For two clusters, R² = 0.411, AIC = 1983.02, BIC = 2117.12, and Silhouette Measure = 0.320 were calculated; the Silhouette Measure was 0.278 for the male patient group and 0.376 for the female patient group. The optimal absorbed dose values for thoracic organs or tissues were then calculated [Table 4].
The AK, AD, and CTDI values obtained by the Fuzzy C-Means clustering method are presented in Table 6.
The predicted Air Kerma dose values for male and female patients were compared with the observed values. The values estimated for the brain, oral mucosa, colon wall, skin, spleen, stomach wall, salivary glands, thyroid, extrathoracic airways, skeleton average, and weighted dose measurements were found to be equal or very close to the observed values. The adrenals, breast glandular, kidneys, liver, lungs, esophagus, thymus, heart wall, lymphatic nodes, maximum RBM, and BSC dose measurements were predicted lower in men and higher in women. Similarly, predictions were made for the AD and CTDI values, and the same pattern for genders was obtained in all three types of dose: the predictive value increased for male patients while it decreased for female patients, or vice versa.

Discussion
This study comments on the output variability related to the combination of two different software approaches (Monte Carlo simulation and predictive modeling algorithms). The use of different methods to calculate organ dose in urgent CT examinations, such as those for COVID-19, is important for physicians, health professionals, and patients.
With this study, it was intended to summarize the current situation in predicting geriatric organ doses in CT examinations and to draw a roadmap for standardized reporting of the basic parameters needed to predict organ doses from international CT imaging [18].
The software was used for virtual dose measurements with Monte Carlo simulation and for organ dose measurements in geriatric patients with virtual phantoms. In addition to the existing modulation, the geriatric organ dose results of the Monte Carlo simulation were studied with the predictive modeling algorithms and evaluated statistically.
Monte Carlo simulation and predictive modeling algorithms were used because of the difficulties in accessing actual patient databases during the pandemic period. In the study by Homayounieh et al., the limitations included the generalizability of the findings and a relatively small number of patients (10-20 patients per site). However, this is understandable given the high workload of medical personnel at health facilities during the pandemic and the manual data collection process adopted by the study. According to the COVID-19 literature, the reason for the small number of studies based on actual patients is that the pandemic is not yet over. However, the study provides valuable data on organ doses received by geriatric patients that are not available in the literature and is a preliminary study that highlights the importance of future studies with a larger number of patients and more complex simulation and prediction techniques [19,20].
Another limitation acknowledged by the authors is the unknown accuracy of the dose descriptors reported from participating health sites. Large-scale CT dose surveys and studies often rely on manual reporting techniques, in which individuals tend to present general screening protocols rather than those used for patients at an individual level. Direct access to electronic dose records would be the best future way to estimate dose levels accurately, and such techniques could be the subject of future studies. In summary, Homayounieh et al. reported extensive protocol changes in the use of CT and radiation doses associated with COVID-19. To the question "Do you have a specific CT protocol for COVID-19 patients?", half of the sites answered "Yes". However, in previous studies, significant differences in CT screening protocols have been observed in healthcare institutions around the world, and the current level of radiation dose is much higher than that of the recommended low-dose CT screening protocols [21][22][23][24].
Controlling the spread of COVID-19 requires early detection and diagnosis. Screening for COVID-19 in CT scans as a tool to assist in diagnosis has become incredibly valuable. However, studies on CT dose measurement and reporting, particularly in the age group most affected by the disease, have become extremely relevant, as there are certain risks associated with the use of CT. It will also be important to assess the risks and benefits of chest CT scans performed in the context of COVID-19 [26]. As in medical imaging, transport units for patients with COVID-19 are also important because of contamination. There are some studies on the use of AI algorithms in the production of such units. The term "CBRN" refers to hazardous conditions brought on by chemical, biological, radiological, and nuclear pollutants that can be dispersed accidentally or purposefully, harming both people and the environment. These hazards may directly endanger the lives of many people, result in many deaths, or have a significant impact on the lives of those affected. At both the military and civilian levels, an isolated patient transportation capsule is an essential life-saving tool [27][28].
In conclusion, the organ dose value must be evaluated together with the software used, the scan area, and the body part concerned for diagnosis. Because COVID-19 infects the lungs, scans of the thorax region are used to diagnose it. For this reason, the thorax region was selected and organ doses were calculated in our study. In the comparisons according to gender, only the measurement value of the adrenals has a higher dose value in male patients, while all other organ/tissue measurement values are higher in female patients. Adrenals, brain, oral mucosa, lungs, skin, RBM AD, BSC AD, thymus, stomach wall, and heart wall generally differ between genders. Among these, lungs, skin, thymus, heart wall, RBM, and BSC dose values are higher than those of other organs or tissues [29][30][31].
Classification algorithms are used to develop predictive models within the data science discipline. A predictive model is a type of supervised learning technique trained on the observed class information, in which the model parameters are adjusted toward the true class. There are many prediction models, and new models are being developed day by day. Various criteria are used to assess the success and fit of a model. Cross-validation is primarily applied as the success criterion of the predicted classes. A confusion matrix is defined for each model, and criteria such as accuracy, precision, F1 measure, recall, and AUC are then generally calculated. Each model can work better on a specific dataset. Therefore, a single model is not used in a study. The model is established with several predictive methods and the results are compared according to the criteria. The model with higher criteria values is considered more successful. In the study, artificial neural networks, Naive Bayes, Linear Discriminant, and Random Forest Classification methods were preferred as classification (predictive) algorithms, using dose measurement values according to gender classes. Over 95% success was achieved in all of the models, and the criterion values are usually 1.0 or close to it. The best classification result was obtained with Naive Bayes, followed by the ANN, Random Forest, and Linear Discriminant algorithms, respectively. In addition, using the clustering algorithm, 2 clusters were determined for the data, and air kerma, CTDI, and absorbed dose centroid values were estimated for the organs/tissues of male and female patients. While the predicted and the observed values of patients do not change in the organs/tissues around and outside of the thorax, they generally vary in the intra-thoracic organs and tissues, at increasing rates for male and decreasing rates for female patients.
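The criteria derived from a confusion matrix can be sketched for the binary case. The function names and the label vectors below are hypothetical and are not results from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def classification_report(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2.0 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
report = classification_report(y_true, y_pred)
```

AUC is not included in this sketch because it requires the predicted class probabilities rather than the predicted labels.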
There are, of course, some limitations in this study. The first limitation is that the study was conducted during the COVID-19 outbreak period. There was a heavy workload and a fear of contamination, especially in pandemic intensive care units. The second limitation is that it was difficult to obtain information from the radiology department because of the large number of patients. Another limitation is that the sample size should actually be larger, since data science usually needs big data. However, in the healthcare area, even a small sample size can be considered big data.

Figure 1. Human Body Phantom for CT Scan (Siemens Sensation 64)

Figure 2.

JASP version 0.14.1.0 was used. The descriptive statistics were presented as mean±SD and frequency (percentage). Dose measurements were compared between genders with the Independent Sample t-test and between ages with One-way ANOVA. Four different classification algorithms (Artificial Neural Network-Multilayer Perceptron, Linear Discriminant Classification, Random Forest Classification, and Naive-Bayes Classification) were used to classify the dose values for age and gender, and the Fuzzy C-Means clustering method was used to cluster the measurements. The correlations between the dose values and BMI were checked by Spearman's Rho correlation analysis. The dataset was split into three sets for training (67%), validation (16.5%), and testing (16.5%). In all classification methods, 10-fold cross-validation was performed, and precision, recall, F1-score, and AUC values were calculated. In the fuzzy c-means clustering method, AIC, BIC, and Silhouette Measure values were determined. A value of P<0.05 was considered statistically significant.

Figure 3. The Fuzzy C-Means clusters for gender

In male patients, weights were between 59.3 and 108 kg, and heights were between 167 and 185 cm. In female patients, weights varied between 59.6 and 94 kg and heights between 155.5 and 173 cm. The mean weight was 81.11±15.71 kg in male patients and 68.13±14.98 kg in female patients. The mean heights were 176.11±7.71 and 164.72±6.99 cm, respectively. Although the weight and height of male patients were higher, mean BMI values were found to be close to each other (26.11 ± 4.49 and 25.03 ± 4.86 kg/m², respectively).

The thymus, heart wall, lymphatic nodes, skeleton average, RBM, and BSC absorbed dose values were significantly higher in females. Similarly, for CTDI volume measurements, only the adrenal value was significantly higher in men, whereas the brain, oral mucosa, lungs, skin area, stomach wall, salivary glands, thymus, heart wall, lymphatic nodes, skeleton average, RBM, and BSC absorbed dose values were significantly higher in female patients [Tables 1-3].

Table 1. Absorbed Dose of Radiation per Incident Air Kerma of organs/tissues between genders. *: significant at 0.05 level according to Independent Sample t-test

Table 2. Absorbed Dose of Radiation of organs/tissues between genders. *: significant at 0.05 level according to Independent Sample t-test

European Mechanical Science (2023), 7(2): 89-98

Table 3. Absorbed Dose of Radiation per CTDI of organs/tissues between genders. *: significant at 0.05 level according to Independent Sample t-test

Adnan Karaibrahimoğlu, Ümit Kara, Özge Kılıçoğlu, Yağmur Kara

Table 4. Predicted Absorption Dose of Radiation for thoracic organs/tissues for Genders by Fuzzy C-Means Clustering

The dose coefficients reported in the literature for different anatomical regions always represent examples of protocols used in clinical practice. However, evaluation of the potential harms of ionizing radiation, particularly for individual patients or age groups as in this study, should be based on the following: i) the radiation dose absorbed by the organ or tissue; ii) appropriate absorbed dose-response relationships; iii) the assessment of the risk situations derived from them. There are some studies on the amount of absorbed radiation dose in CT imaging. There are also comparisons of measurement values obtained with different devices. This study includes the prediction of measurements for thoracic organs and nearby organs/tissues. Unlike the others, however, this study determined the measurements for geriatric patients with COVID-19 over 60 years of age. In addition, using different data science clustering algorithms, optimum predictive values were obtained for the absorbed doses of organs/tissues. There was no significant difference between the measurement values for different ages of the patients, while significant correlation coefficients were obtained for different BMI values. In particular, the absorbed dose values of thoracic organs or tissues are correlated with BMI.