Research Article

Clustering Hotels and Analyzing the Importance of Their Features by Machine Learning Techniques

Year 2021, Volume: 2 Issue: 1, 16 - 23, 15.06.2021

Abstract

Firms that specialize in hotel bookings typically hold large numbers of hotels, each described by hundreds of features, in their databases. To extract meaningful insights from such data, it is essential to choose appropriate machine learning techniques for segmenting the hotels into meaningful groups and for identifying their most important features. In this study, hotel data from the firm Setur were used for clustering, dimensionality reduction, and feature selection analysis. First, the hotels were clustered with the KMeans clustering algorithm according to the similarity of their features. To assess the effect of dimensionality reduction on the clustering, PCA (Principal Component Analysis) was applied to the hotel data and KMeans was then run on the reduced data, so that the clustering results with and without PCA could be compared. Next, multivariate and univariate feature selection techniques were applied to the clustered hotel data to identify the features that most influence the clustering. The Random Forest algorithm served as the multivariate technique. For the univariate technique, the SelectKBest algorithm with the chi2 score function was used as a filter-based feature selection method.
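
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the author's implementation: the Setur data are not available here, so the feature matrix is a hypothetical synthetic table of binary amenity flags, and the cluster count, number of principal components, and number of selected features are all assumed values. Using the KMeans labels as the target for Random Forest and SelectKBest is also an assumption about how "feature selection on the clustered data" was carried out.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import adjusted_rand_score


def make_hypothetical_hotels(n_hotels=300, n_features=20, seed=0):
    """Synthetic stand-in for a hotel table: rows are hotels, columns are 0/1 flags."""
    rng = np.random.default_rng(seed)
    segment = rng.integers(0, 3, size=n_hotels)
    # Three made-up segments that differ only in the first six features.
    probs = np.full((3, n_features), 0.5)
    probs[0, :6] = 0.9
    probs[1, :6] = 0.1
    probs[2, :3] = 0.9
    probs[2, 3:6] = 0.1
    return (rng.random((n_hotels, n_features)) < probs[segment]).astype(float)


def run_pipeline(X, n_clusters=3, n_components=5, k_best=5):
    # 1) KMeans on the original features.
    labels_raw = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)

    # 2) PCA followed by KMeans on the reduced data.
    X_pca = PCA(n_components=n_components, random_state=0).fit_transform(X)
    labels_pca = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X_pca)

    # Agreement between the two partitions (1.0 means identical groupings).
    ari = adjusted_rand_score(labels_raw, labels_pca)

    # 3) Multivariate importance: a Random Forest trained to predict the cluster labels.
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels_raw)
    rf_top = np.argsort(rf.feature_importances_)[::-1][:k_best]

    # 4) Univariate filter: SelectKBest with chi2 (requires non-negative features).
    skb = SelectKBest(score_func=chi2, k=k_best).fit(X, labels_raw)
    chi2_top = np.argsort(skb.scores_)[::-1][:k_best]

    return {
        "labels_raw": labels_raw,
        "labels_pca": labels_pca,
        "ari": ari,
        "rf_importances": rf.feature_importances_,
        "rf_top": rf_top,
        "chi2_support": skb.get_support(),
        "chi2_top": chi2_top,
    }


X = make_hypothetical_hotels()
result = run_pipeline(X)
print("Adjusted Rand index (raw vs. PCA):", round(result["ari"], 3))
print("Top features by Random Forest:", result["rf_top"])
print("Top features by chi2:", result["chi2_top"])
```

Comparing `rf_top` with `chi2_top` shows where the multivariate and univariate rankings agree, and the adjusted Rand index gives one possible way to quantify how much PCA changes the clustering. The abstract does not say which comparison measure the study used.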


Details

Primary Language English
Subjects Artificial Intelligence
Section Research Articles
Authors

Mert Akyol 0000-0002-3499-0001

Publication Date June 15, 2021
Submission Date February 22, 2021
Acceptance Date March 20, 2021
Published in Issue Year 2021 Volume: 2 Issue: 1

Cite

APA Akyol, M. (2021). Clustering Hotels and Analyzing the Importance of Their Features by Machine Learning Techniques. Bilgisayar Bilimleri Ve Teknolojileri Dergisi, 2(1), 16-23.