Comparison of The Performances of Clustering and Dimensionality Reduction Approaches in Collaborative Filtering

Özge Taş

doi:10.54569/aair.1597930

Research Article

Year 2024, Volume: 4 Issue: 2, 96 - 110, 30.12.2024

Abstract

Recommendation systems (RS) offer users personalized product and service recommendations based on their past preferences and their similarity to other users, particularly on e-commerce platforms. The main purpose of RS is to extract meaningful information from large-scale data for users and to simplify the analysis of user behavior and product attributes. The techniques used in RS fall into two main categories, content-based filtering and collaborative filtering (CF), according to the information they take as input. Content-based recommendation systems analyze the attributes of items such as articles, movies, or music to generate tailored recommendations. CF methods analyze the scores users give to products and services to identify patterns and preferences. The success of CF techniques hinges on accurately identifying user similarities. However, CF techniques operate on large-scale datasets consisting of many users and the scores those users give to products, so identifying user similarities in such extensive datasets poses significant challenges. Two approaches are used to overcome this problem. The first applies cluster analysis to divide the dataset into smaller subsets (of users or of products) and then applies CF techniques. The second performs dimensionality reduction on a product (item) basis using Singular Value Decomposition (SVD) and Principal Component Analysis (PCA). Many studies to date have used cluster analysis and dimensionality reduction methods. Despite this extensive research, a thorough comparison of the two approaches on real-world datasets is still lacking.
This study aims to compare the performance in CF of eleven clustering techniques, four of which are non-hierarchical and seven of which are hierarchical, and two dimensionality reduction techniques, SVD and PCA.
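
The two approaches described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the toy rating matrix `R`, the convention that 0 means "not rated", the use of k-means as a stand-in for the clustering step, cosine similarity, and all function names are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are items, 0 = not rated (assumption).
R = np.array([
    [5, 4, 0, 1, 1],
    [4, 5, 5, 1, 0],
    [5, 5, 4, 0, 1],
    [1, 1, 0, 5, 4],
    [0, 1, 1, 4, 5],
    [1, 0, 1, 5, 5],
], dtype=float)

def cosine_sim(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def kmeans(X, k, n_iter=20):
    """Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        # Distance of every row to its nearest already-chosen center.
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def svd_user_features(R, k, center=False):
    """Project users onto k latent item dimensions via truncated SVD.
    With center=True the columns are mean-centered first, which gives PCA scores."""
    X = R - R.mean(axis=0) if center else R
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]

def predict(R, user, item, features, candidates, n_neighbors=2):
    """Similarity-weighted mean of `item` ratings from the most similar candidates."""
    scored = [(cosine_sim(features[user], features[v]), int(v))
              for v in candidates if v != user and R[v, item] > 0]
    top = [(w, v) for w, v in sorted(scored, reverse=True)[:n_neighbors] if w > 0]
    den = sum(w for w, _ in top)
    return sum(w * R[v, item] for w, v in top) / den if den > 0 else 0.0

user, item = 0, 2  # predict the rating user 0 would give to the unrated item 2

# Approach 1: cluster the users, then run CF only inside the user's own cluster.
labels = kmeans(R, k=2)
same_cluster = np.flatnonzero(labels == labels[user])
pred_cluster = predict(R, user, item, features=R, candidates=same_cluster)

# Approach 2: reduce the item dimension, then compare users in the reduced space.
Z = svd_user_features(R, k=2)
pred_svd = predict(R, user, item, features=Z, candidates=range(len(R)))

print(f"cluster-based CF: {pred_cluster:.2f}, SVD-based CF: {pred_svd:.2f}")
```

On this toy matrix both predictions should fall between 4 and 5, because the users most similar to user 0 rated item 2 with 4 and 5. The clustering route shrinks the set of users that must be compared, while the SVD/PCA route shrinks the length of each user vector; which one performs better on real data is the question the study addresses.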

Keywords

Recommender Systems, Collaborative Filtering, Cluster Analysis, Dimension Reduction, Big Data


There are 29 citations in total.

Details

Primary Language	English
Subjects	Machine Learning (Other)
Journal Section	Research Articles
Authors	Özge Taş 0000-0001-7220-5054
Publication Date	December 30, 2024
Submission Date	December 7, 2024
Acceptance Date	December 28, 2024
Published in Issue	Year 2024 Volume: 4 Issue: 2

Cite

IEEE	Ö. Taş, “Comparison of The Performances of Clustering and Dimensionality Reduction Approaches in Collaborative Filtering”, Adv. Artif. Intell. Res., vol. 4, no. 2, pp. 96–110, 2024, doi: 10.54569/aair.1597930.

Advances in Artificial Intelligence Research is an open access journal, which means that the content is freely available without charge to the user or the user's institution. All papers are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which allows users to distribute, remix, adapt, and build upon the material in any medium or format for non-commercial purposes only, and only so long as attribution is given to the creator.
