Improving Iris Dataset Classification Prediction Achievement By Using Optimum k Value of kNN Algorithm

Ahmet Çelik

doi:10.53608/estudambilisim.1071335

Research Article

Year 2022, Volume: 3 Issue: 2, 23 - 30, 31.05.2022

Cited By: 1

Abstract

Machine learning methods are widely used in automated technologies. Classification prediction is a machine learning task built on data mining. Today, many technological devices can make new predictions by learning from past data with machine learning methods. Machine learning is commonly divided into two types, supervised and unsupervised. In supervised learning, the bounds of the targets are predetermined. In unsupervised learning, there are no predetermined targets, and the machine is required to determine them automatically. Prediction is one of the basic components of machine learning. To perform prediction on the basis of data mining, machines need to use algorithms. The k-nearest neighbor (kNN), Naive Bayes (NB), Decision Tree (DT) and Support Vector Machine (SVM) algorithms are the most commonly used. These algorithms can be applied to data sets with the help of software tools. In this study, the kNN algorithm was used to predict the classes of the Iris data set using the Orange tool. The success of the kNN algorithm depends on using the correct attributes and choosing the optimum k value. In the tests, k = 15 was found to be the most suitable number of neighbors, providing 98.67% classification prediction success on the Iris data set.
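
The idea of searching for an optimum k can be illustrated with a minimal sketch. It is not the study's Orange workflow: the Euclidean distance, the leave-one-out evaluation, the function names (`knn_predict`, `loo_accuracy`, `best_k`) and the toy data are all assumptions made for illustration, and the toy points are not Iris measurements.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # `train` is a list of (features, label) pairs; distance is Euclidean (an assumption).
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: each point is classified using all the other points."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(data)


def best_k(data, candidates):
    """Return (k, accuracy) for the candidate k with the highest leave-one-out accuracy."""
    scores = [(loo_accuracy(data, k), k) for k in candidates]
    # Ties in accuracy are broken in favour of the smaller k.
    acc, k = max(scores, key=lambda s: (s[0], -s[1]))
    return k, acc


# Hypothetical toy data (NOT the Iris measurements): two well-separated 2-D clusters.
toy = [((1.0, 1.0), "a"), ((1.2, 0.9), "a"), ((0.8, 1.1), "a"), ((1.1, 1.3), "a"),
       ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"), ((5.1, 5.3), "b")]

k, acc = best_k(toy, candidates=[1, 3, 5, 7])
print(k, acc)
```

On this toy set, k = 7 fails because, once a point is held out, the vote includes more points from the other class than from its own. This shows in miniature why accuracy depends on k. Repeating such a scan on the real Iris data would require loading the 150 samples and matching the study's evaluation settings, which the abstract does not specify.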

Keywords

Classification, Prediction, kNN, Data mining, Machine learning

References

[1] Lahmer, H., Oueslati, A. E., Lachiri, Z., “DNA Microarray Analysis Using Machine Learning to Recognize Cell Cycle Regulated Genes”, 2019 International Conference on Control, Automation and Diagnosis (ICCAD); 2-4 July 2019; Grenoble, France. 2019; pp. 1-5.
[2] Sasikala, B. S., Biju V. G., Prashanth, C. M., “Kappa and accuracy evaluations of machine learning classifiers”. 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT); 19-20 May 2017 ; Bangalore, India, 2017; pp. 20-23.
[3] Thirunavukkarasu, K., Singh, A. S., Rai P., Gupta, S., “Classification of IRIS Dataset using Classification Based KNN Algorithm in Supervised Learning”, 2018 4th International Conference on Computing Communication and Automation (ICCCA); 14-15 Dec. 2018; Greater Noida, India. 2018; pp 1-4.
[4] Goel, A., Mahajan S., “Comparison: KNN & SVM Algorithm”. International Journal for Research in Applied Science & Engineering Technology (IJRASET) 2017; 5(13): 165-168.
[5] Yigit, H., “A weighting approach for KNN classifier”, 2013 International Conference on Electronics, Computer and Computation (ICECCO); 7-9 Nov. 2013; Ankara, Türkiye. 2013; pp. 228-231.
[6] Kunju, M. V., Dainel, E., Anthony H. C., Bhelwa, S., “Evaluation of Phishing Techniques Based on Machine Learning”, 2019 International Conference on Intelligent Computing and Control Systems (ICCS); 15-17 May 2019; Madurai, India. 2019; pp. 963-968.
[7] Agrawal, S., Bansal A., Rathor, S., “Prediction of SGEMM GPU Kernel Performance Using Supervised and Unsupervised Machine Learning Techniques”, 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT); 10-12 July 2018; Bangalore, India. 2018; pp. 1-7.
[8] Alkhatib, K., Najadat, H., Hmeidi, I., Shatnawi, M. K., “Stock Price Prediction Using K-Nearest Neighbor (kNN) Algorithm”. International Journal of Business, Humanities and Technology 2013; 3(3): 32-44.
[9] Goel, M., Tiwari, A. K., Patil, H. S., “Recommendation Engine for B2B Customers in Telecom By Customizing KNN Algorithm”. JournalNX- A Multidisciplinary Peer Reviewed Journal 2018; 4(4): 5-7.
[10] Du, S. and Li, J., “Parallel Processing of Improved kNN Text Classification Algorithm Based on Hadoop”, 2019 7th International Conference on Information, Communication and Networks (ICICN); 24-26 April 2019; Macao, Macao, 2019; pp. 167-170.
[11] Zhao, Y., Qian Y. and Li, C., “Improved KNN text classification algorithm with MapReduce implementation”, 2017 4th International Conference on Systems and Informatics (ICSAI); 11-13 Nov. 2017; Hangzhou, China. 2017; pp. 1417-1422.
[12] Silahtaroğlu, G., “Veri madenciliği (Kavram ve algoritmaları)”. 3. Basım, İstanbul, Türkiye: Papatya Yayıncılık Eğitim; 2016. pp. 118-120.
[13] Balaban, M. E., Kartal, E., “Veri madenciliği ve makine öğrenmesi temel algoritmaları ve R Dili ile Uygulamalar”. 2. Basım, İstanbul, Türkiye: Çağlayan Kitap & Yayıncılık & Eğitim; 2018; 48-72.
[14] Pedregosa, F., Varoquaux G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., “Scikit-learn: Machine Learning in Python”. Journal of Machine Learning Research 2011; 12: 2825-2830
[15] Dua, D, Graff, C., “UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]”. Irvine, CA: University of California, School of Information and Computer Science; 2019.

kNN Algoritmasının Optimum k Değerini Kullanarak İris Veri Seti Sınıflandırma Tahmin Başarısının İyileştirilmesi

Abstract

Machine learning methods are widely used in automated technologies. Classification prediction is a form of machine learning built on data mining. Machine learning enables machines to make new predictions by gaining experience from past data. Machine learning is commonly studied in two types, supervised and unsupervised. In supervised learning, the bounds of the targets are predetermined. In unsupervised learning, there are no predetermined targets, and computers are expected to determine the targets automatically. Prediction is one of the basic components of machine learning. To perform prediction, computers need to use certain algorithms on the basis of data mining. The most commonly used are the k-nearest neighbor (kNN), Naive Bayes (NB), Decision Tree (DT) and Support Vector Machine (SVM) algorithms. These algorithms can be applied to data sets using certain tools. In this study, prediction was performed on the Iris data set with the kNN algorithm using the Orange tool. The success of the kNN algorithm depends on using the correct attributes and the optimum k value. As a result of the tests, a k value of 15 was determined to be the most suitable, providing 98.67% classification prediction success on the Iris data set.

Keywords

Classification, Prediction, kNN, Data Mining, Machine Learning

There are 15 citations in total.

Details

Primary Language	English
Subjects	Computer Software
Journal Section	Research Articles
Authors	Ahmet Çelik 0000-0002-6288-3182
Publication Date	May 31, 2022
Submission Date	February 11, 2022
Acceptance Date	March 24, 2022
Published in Issue	Year 2022 Volume: 3 Issue: 2

Cite

IEEE	A. Çelik, “Improving Iris Dataset Classification Prediction Achievement By Using Optimum k Value of kNN Algorithm”, Journal of ESTUDAM Information, vol. 3, no. 2, pp. 23–30, 2022, doi: 10.53608/estudambilisim.1071335.

Eskişehir Türk Dünyası Uygulama ve Araştırma Merkezi Bilişim Dergisi

Cited By

Determination of the Classification Success of KNN Algorithm Distance Metric Methods on Wheat Seeds Dataset

Afyon Kocatepe University Journal of Sciences and Engineering

https://doi.org/10.35414/akufemubid.1263900