Research Article

A Comparison of Five Methods for Missing Value Imputation in Data Sets

Volume: 2, Issue: 2, 31 December 2018

Abstract

Missing values in a data set prevent accurate analysis. Consequently, the correct imputation of missing values has attracted considerable attention from researchers in recent years. This paper compares the most reliable and up-to-date estimation methods for imputing missing values. Imputation has a very high priority because it affects every subsequent step, including pre-processing, data analysis, classification, and clustering. Root mean square error (RMSE), classification accuracy, and execution time are used to evaluate the performance of the five most popular methods: mean imputation, k-nearest neighbors, singular value decomposition, Bayesian principal component analysis, and missForest. When the RMSE and classification accuracy of the methods were compared, missForest was observed to outperform the other methods on all data sets.
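To make the RMSE-based comparison concrete, the following is a minimal sketch of how two of the five methods (mean and k-nearest neighbors) could be scored against known ground truth. It is not the paper's implementation: the function names (`mean_impute`, `knn_impute`, `rmse`), the distance definition, the fallback rules, and the toy data are all assumptions made for illustration, and the paper's own data sets and settings are not reproduced here.

```python
import math


def mean_impute(data):
    """Fill each None with the mean of the observed values in its column.
    Assumption: a column with no observed values is filled with 0.0."""
    n_cols = len(data[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in data if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in data]


def knn_impute(data, k=3):
    """Fill each None with the average of that column over the k nearest rows.
    Assumption: distance is the mean squared difference over the columns
    observed in both rows; with no usable neighbor, fall back to the column mean."""
    fallback = mean_impute(data)
    result = [list(row) for row in data]
    for i, row in enumerate(data):
        for j, v in enumerate(row):
            if v is not None:
                continue
            candidates = []
            for r, other in enumerate(data):
                if r == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    candidates.append((sum(shared) / len(shared), other[j]))
            if candidates:
                candidates.sort(key=lambda t: t[0])
                nearest = [val for _, val in candidates[:k]]
                result[i][j] = sum(nearest) / len(nearest)
            else:
                result[i][j] = fallback[i][j]
    return result


def rmse(true, imputed, masked):
    """Root mean square error over the cells that were missing in `masked`."""
    errors = [(true[i][j] - imputed[i][j]) ** 2
              for i, row in enumerate(masked)
              for j, v in enumerate(row) if v is None]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


# Toy data (hypothetical): the complete matrix and a copy with one cell removed.
true_data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
masked_data = [[1.0, None], [3.0, 4.0], [5.0, 6.0]]

print("mean RMSE:", rmse(true_data, mean_impute(masked_data), masked_data))
print("kNN  RMSE:", rmse(true_data, knn_impute(masked_data, k=1), masked_data))
```

On this toy input the mean method fills the missing cell with 5.0 (RMSE 3.0), while the nearest-neighbor method with `k=1` fills it with 4.0 (RMSE 2.0). A single cell says nothing about the methods in general; the point is only that the error is measured on the cells that were deliberately removed, which is what makes RMSE usable as a comparison criterion.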

Details

Primary Language

Turkish

Subjects

Computer Software

Section

Research Article

Publication Date

31 December 2018

Submission Date

28 November 2018

Acceptance Date

12 December 2018

Published in Issue

Year 2018 Volume: 2 Issue: 2

Cite

APA
Cihan, P. (2018). A Comparison of Five Methods for Missing Value Imputation in Data Sets. International Scientific and Vocational Studies Journal, 2(2), 80-85. https://izlik.org/JA22BM54LC
AMA
1. Cihan P. A Comparison of Five Methods for Missing Value Imputation in Data Sets. ISVOS. 2018;2(2):80-85. https://izlik.org/JA22BM54LC
Chicago
Cihan, Pınar. 2018. “A Comparison of Five Methods for Missing Value Imputation in Data Sets”. International Scientific and Vocational Studies Journal 2 (2): 80-85. https://izlik.org/JA22BM54LC.
EndNote
Cihan P (01 December 2018) A Comparison of Five Methods for Missing Value Imputation in Data Sets. International Scientific and Vocational Studies Journal 2 2 80–85.
IEEE
[1] P. Cihan, “A Comparison of Five Methods for Missing Value Imputation in Data Sets”, ISVOS, vol. 2, no. 2, pp. 80–85, Dec. 2018, [Online]. Available: https://izlik.org/JA22BM54LC
ISNAD
Cihan, Pınar. “A Comparison of Five Methods for Missing Value Imputation in Data Sets”. International Scientific and Vocational Studies Journal 2/2 (01 December 2018): 80-85. https://izlik.org/JA22BM54LC.
JAMA
1. Cihan P. A Comparison of Five Methods for Missing Value Imputation in Data Sets. ISVOS. 2018;2:80–85.
MLA
Cihan, Pınar. “A Comparison of Five Methods for Missing Value Imputation in Data Sets”. International Scientific and Vocational Studies Journal, vol. 2, no. 2, December 2018, pp. 80-85, https://izlik.org/JA22BM54LC.
Vancouver
1. Pınar Cihan. A Comparison of Five Methods for Missing Value Imputation in Data Sets. ISVOS [Internet]. 01 December 2018;2(2):80-5. Available from: https://izlik.org/JA22BM54LC

The INTERNATIONAL SCIENTIFIC AND VOCATIONAL STUDIES JOURNAL permits publication under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. This license permits the work to be shared, copied, and reproduced in any size and format for non-commercial purposes, and to be adapted, including remixing, transforming, and building upon the work, provided that the original work is properly cited.