Research Article

Clustering Hotels and Analyzing the Importance of Their Features by Machine Learning Techniques

Volume: 2 Number: 1 June 15, 2021

Abstract

Firms that specialize in hotel bookings generally hold large numbers of hotels, each described by hundreds of features, in their databases. To extract meaningful insights from such data, it is vital to use suitable machine learning techniques to segment the hotels into meaningful groups and to find their most important features. In this study, hotel data from the firm Setur were used for clustering, dimensionality reduction, and feature selection analysis. First, the hotels were clustered with the KMeans clustering algorithm according to the similarity of their features. To see the effect of dimensionality reduction on the clustering, PCA (Principal Component Analysis) was applied to the hotel data and KMeans was then run on the reduced data, so that the clustering results with and without PCA could be compared. After that, multivariate and univariate feature selection techniques were applied to the clustered hotel data to identify the features that most influence the clustering. The Random Forest algorithm was used as the multivariate technique. For the univariate technique, the SelectKBest algorithm with the chi2 score function was used as a filter-based feature selection method.
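
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It assumes scikit-learn, whose KMeans, PCA, SelectKBest and chi2 names match those in the abstract; the abstract does not state which library or settings the study used. The hotel matrix here is synthetic stand-in data, and the number of clusters (3), the number of principal components (5), and the number of selected features (6) are illustrative assumptions, not values from the study.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the hotel data (assumption): 300 hotels, 20 binary
# features. Three hidden groups differ only on the first six features.
rng = np.random.default_rng(0)
n_features = 20
group = np.repeat(np.arange(3), 100)
probs = np.full((3, n_features), 0.5)
probs[:, :6] = 0.05
for g in range(3):
    probs[g, 2 * g:2 * g + 2] = 0.95
X_raw = (rng.random((group.size, n_features)) < probs[group]).astype(float)

# Scale to [0, 1]; chi2 requires non-negative feature values.
X = MinMaxScaler().fit_transform(X_raw)

# Elbow method: within-cluster sum of squares (inertia) for a range of k.
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in range(1, 8)
]

# KMeans on the full feature space (k = 3 is an assumed choice).
labels_full = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# KMeans on PCA-reduced data, to compare against the full-space clustering.
X_pca = PCA(n_components=5, random_state=0).fit_transform(X)
labels_pca = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_pca)
agreement = adjusted_rand_score(labels_full, labels_pca)

# Multivariate importance: Random Forest trained to predict the cluster label.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels_full)
rf_top = np.argsort(rf.feature_importances_)[::-1][:6]

# Univariate (filter-based) selection: SelectKBest with the chi2 score.
selector = SelectKBest(score_func=chi2, k=6).fit(X, labels_full)
chi2_top = np.flatnonzero(selector.get_support())

print("inertia by k:", np.round(inertias, 1))
print("agreement with/without PCA (ARI):", round(agreement, 3))
print("Random Forest top features:", sorted(rf_top.tolist()))
print("SelectKBest/chi2 top features:", chi2_top.tolist())
```

Using the cluster labels as the target for both feature selection steps is one way to read "features which have effect on clustering process"; the adjusted Rand index is an added convenience for comparing the two clusterings and is not mentioned in the abstract.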

Keywords

Machine Learning, KMeans Clustering, Principal Component Analysis, Elbow Method, Random Forest

APA
Akyol, M. (2021). Clustering Hotels and Analyzing the Importance of Their Features by Machine Learning Techniques. Bilgisayar Bilimleri Ve Teknolojileri Dergisi, 2(1), 16-23. https://izlik.org/JA45NR47RU