Research Article

Effect of Imputation Methods in the Classifier Performance

Volume: 23 Number: 6 December 1, 2019

Abstract

Missing values in a data set are a serious problem for almost all traditional and modern statistical methods, since most of these methods were developed under the assumption that the data set is complete. In the real world, however, data sets are seldom complete, and missing data are encountered as frequently in veterinary field studies as in other fields. Imputing missing data is important in veterinary field studies, where data mining is only beginning to be applied, and an equally important question is how the data should be imputed. In many studies, observations with a missing value in any variable are removed, or the missing values are filled in by traditional methods. Although alternative approaches that avoid removing such observations have become widely available in recent years, they are rarely used. The aim of this study is to examine the mean, median, nearest neighbors, mice, and missForest methods for imputing simulated missing data, created by randomly removing values from an original veterinary data set at rates from 5% to 25% in steps of 5%. The most accurate methods are then used to impute the original data set in order to observe their influence on classifier performance and to determine the optimal imputation method for that data set.
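The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for the veterinary data set, only the mean and median methods are shown (nearest neighbors, mice, and missForest are omitted), and the use of RMSE as the accuracy measure is an assumption, since the abstract does not name one. All function names are hypothetical.

```python
import math
import random
import statistics


def mask_at_random(rows, rate, rng):
    """Copy `rows` and set a random fraction `rate` of cells to None.

    Returns the masked copy and the list of (row, column) positions hidden.
    """
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    hidden = rng.sample(cells, round(rate * len(cells)))
    masked = [list(r) for r in rows]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def impute_columns(masked, stat):
    """Fill each None with a statistic of the observed values in its column."""
    filled = [list(r) for r in masked]
    for j in range(len(masked[0])):
        observed = [r[j] for r in masked if r[j] is not None]
        fill = stat(observed)
        for r in filled:
            if r[j] is None:
                r[j] = fill
    return filled


def rmse(truth, filled, hidden):
    """Root mean squared error over the cells that were hidden (assumed metric)."""
    total = sum((truth[i][j] - filled[i][j]) ** 2 for i, j in hidden)
    return math.sqrt(total / len(hidden))


def run_experiment(seed=0):
    """Compare mean and median imputation at 5% to 25% missingness."""
    rng = random.Random(seed)
    # Synthetic complete data set: 200 rows, 4 numeric columns (an assumption).
    data = [[rng.gauss(10 * (j + 1), 2 + j) for j in range(4)] for _ in range(200)]
    methods = (("mean", statistics.mean), ("median", statistics.median))
    results = {}
    for pct in (5, 10, 15, 20, 25):
        masked, hidden = mask_at_random(data, pct / 100, rng)
        results[pct] = {
            name: rmse(data, impute_columns(masked, stat), hidden)
            for name, stat in methods
        }
    return results


if __name__ == "__main__":
    for pct, scores in run_experiment().items():
        print(pct, {name: round(err, 3) for name, err in scores.items()})
```

Because the true values of the removed cells are known, each method can be scored directly against them; the methods that score best at this stage would then be the candidates for imputing the genuinely incomplete data before training a classifier.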

Details

Primary Language

English

Subjects

Computer Software

Journal Section

Research Article

Publication Date

December 1, 2019

Submission Date

January 22, 2019

Acceptance Date

July 23, 2019

Published in Issue

Year 2019 Volume: 23 Number: 6

APA
Cihan, P., Kalıpsız, O., & Gökçe, E. (2019). Effect of Imputation Methods in the Classifier Performance. Sakarya University Journal of Science, 23(6), 1225-1236. https://izlik.org/JA97BB64MM
AMA
1. Cihan P, Kalıpsız O, Gökçe E. Effect of Imputation Methods in the Classifier Performance. SAUJS. 2019;23(6):1225-1236. https://izlik.org/JA97BB64MM
Chicago
Cihan, Pinar, Oya Kalıpsız, and Erhan Gökçe. 2019. “Effect of Imputation Methods in the Classifier Performance”. Sakarya University Journal of Science 23 (6): 1225-36. https://izlik.org/JA97BB64MM.
EndNote
Cihan P, Kalıpsız O, Gökçe E (December 1, 2019) Effect of Imputation Methods in the Classifier Performance. Sakarya University Journal of Science 23 6 1225–1236.
IEEE
[1] P. Cihan, O. Kalıpsız, and E. Gökçe, “Effect of Imputation Methods in the Classifier Performance”, SAUJS, vol. 23, no. 6, pp. 1225–1236, Dec. 2019, [Online]. Available: https://izlik.org/JA97BB64MM
ISNAD
Cihan, Pinar - Kalıpsız, Oya - Gökçe, Erhan. “Effect of Imputation Methods in the Classifier Performance”. Sakarya University Journal of Science 23/6 (December 1, 2019): 1225-1236. https://izlik.org/JA97BB64MM.
JAMA
1. Cihan P, Kalıpsız O, Gökçe E. Effect of Imputation Methods in the Classifier Performance. SAUJS. 2019;23:1225–1236.
MLA
Cihan, Pinar, et al. “Effect of Imputation Methods in the Classifier Performance”. Sakarya University Journal of Science, vol. 23, no. 6, Dec. 2019, pp. 1225-36, https://izlik.org/JA97BB64MM.
Vancouver
1. Pinar Cihan, Oya Kalıpsız, Erhan Gökçe. Effect of Imputation Methods in the Classifier Performance. SAUJS [Internet]. 2019 Dec. 1;23(6):1225-36. Available from: https://izlik.org/JA97BB64MM


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.