Research Article

Clustering Hotels and Analyzing the Importance of Their Features by Machine Learning Techniques

Year 2021, Volume: 2 Issue: 1, 16 - 23, 15.06.2021

Abstract

Firms that specialize in hotel bookings generally hold large numbers of hotels in their databases, each described by hundreds of features. To extract meaningful insights from such data, it is vital to use suitable machine learning techniques for segmenting the hotels into meaningful groups and for identifying their most important features. In this study, hotel data from the firm Setur were used for clustering, dimensionality reduction and feature selection analysis. First, the hotels were clustered with the KMeans clustering algorithm according to the similarity of their features. To see the effect of dimensionality reduction on the clustering of the hotel data, principal component analysis (PCA) was applied to the data and KMeans was then run on the reduced data, so that the clustering results with and without PCA could be compared. After that, multivariate and univariate feature selection techniques were applied to the clustered hotel data to identify the features that most influence the clustering. The Random Forest algorithm was used as the multivariate technique. For the univariate technique, the SelectKBest algorithm with the chi2 score function was used as a filter-based feature selection method.
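The workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes scikit-learn, and because the Setur data are not reproduced here, it uses a small synthetic feature matrix. The number of clusters, the number of principal components, the value of k, the min-max scaling step and the use of the adjusted Rand index to compare the two clusterings are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the hotel table (hypothetical data):
# 300 "hotels", 20 numeric features, 3 latent groups.
rng = np.random.default_rng(0)
centers = rng.uniform(0, 10, size=(3, 20))
X_raw = np.vstack([c + rng.normal(0, 1.0, size=(100, 20)) for c in centers])

# Scale every feature to [0, 1]; chi2 requires non-negative inputs.
X = MinMaxScaler().fit_transform(X_raw)

N_CLUSTERS = 3    # assumed; the abstract does not state the cluster count
N_COMPONENTS = 5  # assumed number of principal components
TOP_K = 5         # assumed number of features to keep

# 1) KMeans on the full feature set.
labels_full = KMeans(
    n_clusters=N_CLUSTERS, n_init=10, random_state=0
).fit_predict(X)

# 2) PCA first, then KMeans on the reduced data.
X_pca = PCA(n_components=N_COMPONENTS, random_state=0).fit_transform(X)
labels_pca = KMeans(
    n_clusters=N_CLUSTERS, n_init=10, random_state=0
).fit_predict(X_pca)

# Agreement between the two clusterings (1.0 means identical partitions).
ari = adjusted_rand_score(labels_full, labels_pca)

# 3) Multivariate selection: a random forest trained to predict the
#    cluster label, ranked by impurity-based feature importance.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, labels_full)
rf_top = np.argsort(forest.feature_importances_)[::-1][:TOP_K]

# 4) Univariate (filter-based) selection: SelectKBest with chi2,
#    again using the cluster labels as the target.
selector = SelectKBest(score_func=chi2, k=TOP_K).fit(X, labels_full)
chi2_top = np.flatnonzero(selector.get_support())

print("ARI, with vs. without PCA:", round(ari, 3))
print("Top features (random forest):", rf_top.tolist())
print("Top features (chi2):", sorted(chi2_top.tolist()))
```

Using the cluster labels as the target turns feature selection into a supervised problem, which is one plausible reading of "features which have effect on clustering process"; the two rankings need not agree, since the forest scores features jointly while chi2 scores each feature on its own.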

References

  • Kassambara, A. (2017). Practical Guide to Cluster Analysis in R: Unsupervised Machine Learning. In Multivariate Analysis, (1), 101.
  • Jain, A. K. (2010). Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31(8), 651–666.
  • Murphy, K. (2012). Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning series), 11.
  • Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32.

Details

Primary Language English
Subjects Artificial Intelligence
Journal Section Research Articles
Authors

Mert Akyol 0000-0002-3499-0001

Publication Date June 15, 2021
Submission Date February 22, 2021
Acceptance Date March 20, 2021
Published in Issue Year 2021 Volume: 2 Issue: 1

Cite

APA Akyol, M. (2021). Clustering Hotels and Analyzing the Importance of Their Features by Machine Learning Techniques. Bilgisayar Bilimleri Ve Teknolojileri Dergisi, 2(1), 16-23.