Analysing Content Ratings of Google Apps with Ensemble Learning

Ercan Atagün; Tunahan Timuçin; Serdar Biroğul

doi:10.31202/ecjse.1059822

Research Article

Analysing the Content Ratings of Google Applications with Ensemble Learning

Year 2022, Volume: 9 Issue: 3, 1038 - 1050, 30.09.2022

Abstract

The application market that was launched as Android Market and later became known worldwide as Google Play is a package manager developed by Google for Android users, and it hosts applications that address many fields and age groups. The broad range these applications cover, together with a data flow that has reached the level described as “big data”, has begun to attract the attention of researchers. The sharp increase in the number of applications makes it harder for parents to keep track of content. Classification with machine learning methods is needed to support content control (content rating) of the applications on Google Play. In this study, the Category, Rating, Reviews, Size, Installs, Type, Genres, Last Updated, Current Version, and Android Version features of 10757 applications on Google Play were analysed with Ensemble Learning methods (Adaboost, Bagging, Random Forest, Stacking) and with the K-Nearest Neighbors, Logistic Regression, and Artificial Neural Network algorithms to classify content rating.

Keywords

ensemble learning, classification, content rating, google applications

Abstract

Google Play, originally launched as Android Market, is now known worldwide. This mobile application market, a package manager developed by Google for Android users, contains applications that address many fields and age ranges. The breadth of these applications, together with a data flow that has reached the scale called “big data”, has begun to attract the attention of researchers. The rapid growth in the number of applications makes it difficult for parents to keep track of content. Classifying applications with machine learning methods is therefore needed to support content rating on Google Play. In this study, content rating classification was performed by analysing the Category, Rating, Reviews, Size, Installs, Type, Genres, Last Updated, Current Version, and Android Version features of 10757 applications on Google Play with Ensemble Learning methods (Adaboost, Bagging, Random Forest, Stacking) and with the Logistic Regression, Artificial Neural Network, and K-Nearest Neighbors algorithms.
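As a rough illustration of one of the ensemble methods named in the abstract, the sketch below shows Bagging with a K-Nearest Neighbors base learner in plain Python. It is not the authors' implementation and does not use their dataset: the two features (rating and log10 of installs), the feature values, the labels "Everyone" and "Teen", and the function names `knn_predict` and `bagging_predict` are all assumptions made for this example.

```python
import random
from collections import Counter

# Hypothetical toy rows: (rating, log10(installs)) -> content rating label.
# Values and labels are invented for illustration, not taken from the study.
TRAIN = [
    ((4.5, 6.0), "Everyone"), ((4.2, 5.5), "Everyone"), ((4.7, 6.5), "Everyone"),
    ((4.4, 5.8), "Everyone"), ((4.6, 6.2), "Everyone"),
    ((3.1, 3.0), "Teen"), ((3.4, 2.5), "Teen"), ((2.9, 3.2), "Teen"),
    ((3.3, 2.8), "Teen"), ((3.0, 2.6), "Teen"),
]


def knn_predict(train, x, k=3):
    """Classify x by majority label among its k nearest training rows."""
    by_distance = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def bagging_predict(train, x, n_models=15, k=3, seed=0):
    """Bagging: run each base learner on a bootstrap sample, then majority-vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_models):
        # Bootstrap sample: draw len(train) rows with replacement.
        sample = [rng.choice(train) for _ in train]
        votes[knn_predict(sample, x, k)] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(bagging_predict(TRAIN, (4.3, 5.9)))
    print(bagging_predict(TRAIN, (3.2, 2.9)))
```

Each base learner sees a different resampled copy of the training rows, and the ensemble returns the label most base learners agree on. An odd `k` and an odd `n_models` are chosen here so that a two-class vote cannot tie.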

Keywords

Ensemble learning, classification, content rating

There are 45 citations in total.

Details

Primary Language	English
Subjects	Engineering
Journal Section	Research Article
Authors	Ercan Atagün (ORCID 0000-0001-5196-5732); Tunahan Timuçin (ORCID 0000-0003-0332-4118); Serdar Biroğul (ORCID 0000-0003-4966-5970)
Publication Date	September 30, 2022
Submission Date	January 18, 2022
Acceptance Date	July 3, 2022
Published in Issue	Year 2022 Volume: 9 Issue: 3

Cite

IEEE	E. Atagün, T. Timuçin, and S. Biroğul, “Analysing Content Ratings of Google Apps with Ensemble Learning”, El-Cezeri Journal of Science and Engineering, vol. 9, no. 3, pp. 1038–1050, 2022, doi: 10.31202/ecjse.1059822.

El-Cezeri is licensed to the public under a Creative Commons Attribution 4.0 license.