Conference Paper

Data Cleaning in Medical Procurement Database: Performance Comparison of Data Mining Classification Algorithms for Tackling Missing Value

Volume: 23, September 30, 2023
  • Amarawan Pentrakan
  • Arbee L. P. Chen

Abstract

Data cleaning is an important process for improving the quality of the information used in decision-making. Data mining techniques are among today's popular cleaning tools. In this paper, we focus on using data mining classification algorithms to resolve missing values in medical purchasing databases. To this end, we compared the predictive performance of four classifiers: Decision Tree, Naïve Bayes, K-Nearest Neighbor, and Support Vector Machine (SVM). The experiments used 2,311 medical data records from a procurement database in Thailand, covering July 2019 to December 2019. We also discuss the role of feature selection and test options in improving model performance. The results showed that SVM outperformed the other classifiers, with a maximum accuracy of 89.61%. Additionally, we discuss the strengths and weaknesses of these data mining techniques for data cleaning and outline directions for future research.
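
As a rough illustration of the idea in the abstract, the sketch below treats the attribute with missing values as a class label, trains a classifier on the complete records, and predicts the label for the incomplete ones. It uses a plain K-Nearest Neighbor vote with Hamming distance as a stand-in for one of the four classifiers named above; it is not the authors' implementation. The records, feature names, labels, and the helper names (`hamming`, `knn_predict`, `impute_missing`, `cross_val_accuracy`) are all hypothetical and chosen only for the example.

```python
from collections import Counter

def hamming(a, b):
    """Count the positions where two categorical feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, features, k=3):
    """Return the majority label among the k training records nearest to `features`."""
    nearest = sorted(train, key=lambda rec: hamming(rec[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def impute_missing(records, k=3):
    """Fill labels that are None, using k-NN trained on the complete records."""
    complete = [(f, y) for f, y in records if y is not None]
    return [(f, y if y is not None else knn_predict(complete, f, k))
            for f, y in records]

def cross_val_accuracy(complete, folds=3, k=3):
    """Estimate accuracy by k-fold cross-validation on records with known labels."""
    correct = 0
    for i in range(folds):
        test = complete[i::folds]  # every folds-th record forms the held-out fold
        train = [r for j, r in enumerate(complete) if j % folds != i]
        correct += sum(knn_predict(train, f, k) == y for f, y in test)
    return correct / len(complete)

# Hypothetical procurement records: (item, unit, supplier) -> category.
# A label of None marks a missing value to be imputed.
records = [
    (("syringe", "box", "A"), "consumable"),
    (("syringe", "box", "B"), "consumable"),
    (("glove", "box", "A"), "consumable"),
    (("monitor", "unit", "C"), "equipment"),
    (("pump", "unit", "C"), "equipment"),
    (("pump", "unit", "D"), "equipment"),
    (("glove", "box", "B"), None),
    (("monitor", "unit", "D"), None),
]

filled = impute_missing(records)
known = [r for r in records if r[1] is not None]
accuracy = cross_val_accuracy(known)
```

Cross-validating on the complete records first gives an estimate of how trustworthy the imputed values are likely to be, which is the kind of comparison the abstract reports across classifiers.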

Details

Primary Language

English

Subjects

Software Testing, Verification and Validation

Journal Section

Conference Paper

Authors

Amarawan Pentrakan
Thailand

Arbee L. P. Chen
Taiwan

Early Pub Date

September 9, 2023

Publication Date

September 30, 2023

Submission Date

May 30, 2023

Acceptance Date

July 27, 2023

Published in Issue

Year 2023 Volume: 23

APA
Pentrakan, A., & Chen, A. L. P. (2023). Data Cleaning in Medical Procurement Database: Performance Comparison of Data Mining Classification Algorithms for Tackling Missing Value. The Eurasia Proceedings of Science Technology Engineering and Mathematics, 23, 26-33. https://doi.org/10.55549/epstem.1357602