Research Article

Comparison of the effect of unsupervised and supervised discretization methods on classification process

Volume: 4 Number: Special Issue-1 December 26, 2016

Abstract

Most machine learning and data mining algorithms use discrete data for classification, but most data in practice include continuous features. A discretization pre-processing step, which converts continuous values to discrete values, is therefore applied to such datasets before classification. Many discretization methods are described in the literature; they are grouped as supervised or unsupervised according to whether class information is used. In this paper, we use two unsupervised methods, Equal Width Interval (EW) and Equal Frequency (EF), and one supervised method, Entropy Based (EB) discretization. In the experiments, ten well-known datasets from the UCI Machine Learning Repository are used to compare the effect of the discretization methods on classification. The results show that the Naive Bayes (NB), C4.5 and ID3 classification algorithms obtain higher accuracy with the EB discretization method.
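
As a rough illustration of the three discretization families named in the abstract, the sketch below computes cut points for a single continuous feature. It is not the authors' implementation: the function names, the midpoint rule for EF cut points, and the restriction of EB to a single binary split (rather than a recursive procedure with a stopping criterion) are all assumptions made for brevity.

```python
import math
from bisect import bisect_right
from collections import Counter


def equal_width_cuts(values, k):
    """EW: split the range [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """EF: place cuts so each interval holds roughly len(values) / k points.

    Assumption: each cut is the midpoint between neighbouring sorted values.
    Heavily tied data can yield duplicate cuts; this sketch does not handle that.
    """
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, k):
        pos = i * n // k
        cuts.append((ordered[pos - 1] + ordered[pos]) / 2)
    return cuts


def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_entropy_cut(values, labels):
    """EB (single split only): the cut minimizing weighted class entropy."""
    pairs = sorted(zip(values, labels))
    v = [p[0] for p in pairs]
    y = [p[1] for p in pairs]
    n = len(v)
    best_cut, best_score = None, math.inf
    for i in range(1, n):
        if v[i] == v[i - 1]:
            continue  # no boundary between identical values
        score = (i * class_entropy(y[:i]) + (n - i) * class_entropy(y[i:])) / n
        if score < best_score:
            best_cut, best_score = (v[i - 1] + v[i]) / 2, score
    return best_cut


def assign_bins(values, cuts):
    """Map each value to the index of the interval it falls in."""
    return [bisect_right(cuts, x) for x in values]


feature = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(assign_bins(feature, equal_width_cuts(feature, 3)))      # [0, 0, 0, 0, 0, 0, 0, 0, 2]
print(assign_bins(feature, equal_frequency_cuts(feature, 3)))  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
print(best_entropy_cut([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1]))  # 6.5
```

The toy feature shows why the choice matters: with one outlier, EW leaves the middle interval empty and puts almost every value in the first one, while EF spreads the values evenly. The EB split uses the class labels, which is what makes it a supervised method.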

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Authors

Mehmet Hacıbeyoğlu
Türkiye

Publication Date

December 26, 2016

Submission Date

November 22, 2016

Acceptance Date

December 1, 2016

Published in Issue

Year 2016 Volume: 4 Number: Special Issue-1

APA
Hacıbeyoğlu, M., & Ibrahım, M. H. (2016). Comparison of the effect of unsupervised and supervised discretization methods on classification process. International Journal of Intelligent Systems and Applications in Engineering, 4(Special Issue-1), 105-108. https://doi.org/10.18201/ijisae.267490
