Clustering Hotels and Analyzing the Importance of Their Features by Machine Learning Techniques

Mert Akyol

Research Article

Year 2021, Volume: 2 Issue: 1, 16-23, 15.06.2021

https://izlik.org/JA45NR47RU

Abstract

Firms that specialize in hotel bookings typically hold very large numbers of hotels in their databases, each described by hundreds of features. To extract meaningful insights from such data, it is vital to use appropriate machine learning techniques to segment the hotels into meaningful groups and to identify their most important features. In this study, hotel data from the firm Setur were used for clustering, dimensionality reduction, and feature selection analysis. First, the hotels were clustered with the KMeans clustering algorithm according to the similarity of their features. To see the effect of dimensionality reduction on the clustering of the hotel data, PCA (Principal Component Analysis) was applied and KMeans was then run on the reduced data, so that the clustering results with and without PCA could be compared. Next, multivariate and univariate feature selection techniques were applied to the clustered hotel data to identify the hotel features that most influence the clustering. The Random Forest algorithm was used as the multivariate technique. The SelectKBest algorithm with the chi2 score function was used as the univariate, filter-based technique.
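The workflow described in the abstract can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes scikit-learn (the names SelectKBest and chi2 match that library, but the abstract does not name the toolkit), and it runs on synthetic data because the Setur data is not available here. The feature counts, the number of clusters, the number of PCA components, and all variable names are hypothetical choices made only for the example. The inertia loop reflects the Elbow Method listed in the keywords.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the hotel table (hypothetical): 300 hotels, 10 features.
rng = np.random.default_rng(0)
n_hotels, n_features, n_clusters = 300, 10, 3
group = rng.integers(0, n_clusters, size=n_hotels)
X_raw = rng.random((n_hotels, n_features))
X_raw[:, :3] += group[:, None] * 2.0  # only the first 3 features carry structure

# Scale to [0, 1]; chi2 requires non-negative feature values.
X = MinMaxScaler().fit_transform(X_raw)

# Elbow method: record KMeans inertia for a range of k and look for the bend.
inertias = {
    k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 8)
}

# Step 1: cluster the hotels on the full feature set.
labels_full = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)

# Step 2: reduce dimensionality with PCA, cluster again, and compare.
X_pca = PCA(n_components=5, random_state=0).fit_transform(X)
labels_pca = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X_pca)
agreement = adjusted_rand_score(labels_full, labels_pca)  # 1.0 = same partition

# Step 3 (multivariate): Random Forest trained to predict the cluster label;
# its impurity-based importances rank features jointly.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels_full)
rf_top = np.argsort(forest.feature_importances_)[::-1][:3]

# Step 4 (univariate, filter-based): score each feature on its own with chi2.
selector = SelectKBest(score_func=chi2, k=3).fit(X, labels_full)
chi2_top = np.flatnonzero(selector.get_support())

print("inertia by k:", {k: round(v, 1) for k, v in inertias.items()})
print("agreement between clusterings (ARI):", round(agreement, 3))
print("top features by Random Forest:", sorted(rf_top.tolist()))
print("top features by chi2:", chi2_top.tolist())
```

Using the cluster labels as the target for both selectors is one way to read "features which have effect on clustering process"; the adjusted Rand index is likewise only one possible measure for comparing the clusterings with and without PCA.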

Keywords

Machine Learning, KMeans Clustering, Principal Component Analysis, Elbow Method, Random Forest

References

Kassambara, A. (2017). Practical Guide to Cluster Analysis in R: Unsupervised Machine Learning. In Multivariate Analysis, (1), 101.
Jain, A. K. (2010). Data clustering: 50 years beyond k-means. In Pattern Recognition Letters, 31(8), 651–666.
Murphy, K. (2012). Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning series), 11.
Breiman, L. (2001). Random Forests. In Machine Learning, 45(1), 5–32.


Details

Primary Language	English
Subjects	Artificial Intelligence
Section	Research Article
Authors	Mert Akyol 0000-0002-3499-0001
Submission Date	22 February 2021
Acceptance Date	20 March 2021
Publication Date	15 June 2021
IZ	https://izlik.org/JA45NR47RU
Published in Issue	Year 2021 Volume: 2 Issue: 1

Cite

APA	Akyol, M. (2021). Clustering Hotels and Analyzing the Importance of Their Features by Machine Learning Techniques. Bilgisayar Bilimleri ve Teknolojileri Dergisi, 2(1), 16-23. https://izlik.org/JA45NR47RU