Research Article

Comparison of The Performances of Clustering and Dimensionality Reduction Approaches in Collaborative Filtering

Year 2024, Volume: 4 Issue: 2, 96 - 110, 30.12.2024
https://doi.org/10.54569/aair.1597930

Abstract

Recommendation systems (RS) can be defined as systems that offer personalized product and service recommendations to users, based on the users' past product preferences and their similarities with other users in the system, especially in systems that provide e-commerce services. The main purpose of RS is to reveal meaningful information from large-scale data to users and to simplify the analysis of user behaviors and product attributes. The techniques used in RS can be divided into two main categories, content-based filtering and collaborative filtering (CF), according to the information they receive as input. Content-based recommendation systems analyze the attributes of items such as articles, movies or music to generate tailored recommendations. CF methods analyze user-generated scores for products and services to identify patterns and preferences. The success of CF techniques hinges on accurately identifying user similarities. However, CF techniques operate on large-scale datasets consisting of a large number of users and the scores those users have given to products, so identifying user similarities in such extensive datasets poses significant challenges. Two different methods are used to overcome this problem. The first method applies clustering analysis to divide the dataset into smaller subsets (by user or by product) and then applies CF techniques within those subsets. The second method performs dimensionality reduction on a product (object) basis using Singular Value Decomposition (SVD) and Principal Component Analysis (PCA). Many studies to date have used clustering analysis and dimensionality reduction methods. Despite this extensive research, a thorough comparison of clustering and dimensionality reduction methods on real-world datasets remains unexplored.
This study aims to compare the performance in CF of eleven clustering techniques, four of which are non-hierarchical and seven of which are hierarchical clustering algorithms, and two dimensionality reduction techniques, SVD and PCA.
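The two routes described in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not name the eleven clustering algorithms, so plain k-means stands in for the clustering step, and the helper names (`kmeans`, `predict_clustered`, `predict_svd`), the convention that a rating of 0 means "unrated", the cosine weighting, and the item-mean fill before SVD are all assumptions made for illustration. PCA is omitted here; it would amount to the same decomposition applied to a column-centered matrix.

```python
import numpy as np


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels


def predict_clustered(R, labels, user, item):
    """Route 1: cosine-weighted mean of the item's ratings among the
    users that share a cluster with `user` (0 is assumed to mean unrated)."""
    R = np.asarray(R, dtype=float)
    peers = [v for v in np.flatnonzero(labels == labels[user])
             if v != user and R[v, item] > 0]
    if not peers:
        return float("nan")  # nobody in the cluster rated this item
    norms = np.linalg.norm(R[peers], axis=1) * np.linalg.norm(R[user])
    sims = (R[peers] @ R[user]) / np.maximum(norms, 1e-12)
    if sims.sum() <= 0:
        return float(R[peers, item].mean())
    return float(sims @ R[peers, item] / sims.sum())


def predict_svd(R, rank):
    """Route 2: rank-`rank` SVD reconstruction of the rating matrix after
    filling unrated cells with the item mean (an assumed preprocessing)."""
    R = np.asarray(R, dtype=float)
    rated = R > 0
    item_means = R.sum(axis=0) / np.maximum(rated.sum(axis=0), 1)
    filled = np.where(rated, R, item_means)
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


if __name__ == "__main__":
    # Hypothetical 6-user x 5-item rating matrix with two taste groups.
    R = np.array([[5, 4, 0, 1, 1],
                  [4, 5, 5, 1, 0],
                  [5, 5, 4, 0, 1],
                  [1, 1, 0, 5, 4],
                  [1, 0, 1, 4, 5],
                  [0, 1, 1, 5, 5]], dtype=float)
    labels = kmeans(R, k=2)
    print("cluster labels:", labels)
    print("clustered CF, user 0 / item 2:", predict_clustered(R, labels, 0, 2))
    print("rank-2 SVD,   user 0 / item 2:", predict_svd(R, rank=2)[0, 2])
```

In the first route, similarities are computed only among users in the same cluster, which is where the reduction in work comes from. In the second, every user is retained but is represented through a small number of latent factors.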



Details

Primary Language: English
Subjects: Machine Learning (Other)
Section: Research Article
Authors

Özge Taş 0000-0001-7220-5054

Publication Date: December 30, 2024
Submission Date: December 7, 2024
Acceptance Date: December 28, 2024
Published in Issue: Year 2024, Volume: 4, Issue: 2

Cite

IEEE: Ö. Taş, “Comparison of The Performances of Clustering and Dimensionality Reduction Approaches in Collaborative Filtering”, Adv. Artif. Intell. Res., vol. 4, no. 2, pp. 96–110, 2024, doi: 10.54569/aair.1597930.

Advances in Artificial Intelligence Research is an open access journal which means that the content is freely available without charge to the user or his/her institution. All papers are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which allows users to distribute, remix, adapt, and build upon the material in any medium or format for non-commercial purposes only, and only so long as attribution is given to the creator.
