Finding Influencers on Twitter with Using Machine Learning Classification Algorithms
Abstract
Microblog sites are environments where people follow other people. This feature makes a microblog site a convenient environment for spreading an opinion or introducing a new product. The key task is to identify the individuals who maximize the spread. This problem is known as Influence Maximization (IM) and has attracted the attention of many researchers. Many studies in the literature model the IM problem on graphs under different propagation models, such as Independent Cascade (IC) and Linear Threshold (LT). However, microblogs like Twitter have their own features. Many works on IM in Twitter derive new metrics from user and tweet features and then apply a greedy approach to select influencers. In this study, we adopt a different approach and treat the IM problem as a classification problem. First, we collected data on International Women's Day 2018. We then empirically labeled the users as either influencer candidates or non-influencers, and applied classification methods to assign users to one of these two classes based on user features. In this way, we obtained a set of influencer candidates that is much smaller than the entire dataset. Experimental results show that selecting influencers with the same heuristic (namely MF) from the reduced candidate set outperforms selecting them from the entire dataset.
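The two-stage idea described in the abstract (classify users first, then apply a selection heuristic only to the predicted candidates) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier stands in for whichever classification methods the study used, the two features (follower count, retweets received) and all data values are invented, and the `score` function is a hypothetical placeholder for the selection heuristic, whose definition is not given here.

```python
from statistics import mean


def fit_centroids(features, labels):
    """Compute one mean feature vector per class label."""
    by_class = {}
    for vec, label in zip(features, labels):
        by_class.setdefault(label, []).append(vec)
    return {
        label: [mean(col) for col in zip(*vecs)]
        for label, vecs in by_class.items()
    }


def predict(centroids, vec):
    """Assign vec to the class whose centroid is nearest (squared Euclidean)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


def select_seeds(users, centroids, score, k):
    """Keep users predicted as candidates (label 1), then take the top k by score."""
    candidates = [u for u in users if predict(centroids, u["features"]) == 1]
    ranked = sorted(candidates, key=score, reverse=True)
    return candidates, ranked[:k]


# Hypothetical labeled users: (followers, retweets received); 1 = influencer candidate.
train_X = [(10, 1), (20, 0), (5000, 300), (8000, 450)]
train_y = [0, 0, 1, 1]
centroids = fit_centroids(train_X, train_y)

# Hypothetical unlabeled users to filter and rank.
users = [
    {"name": "a", "features": (15, 2)},
    {"name": "b", "features": (6000, 200)},
    {"name": "c", "features": (9000, 500)},
    {"name": "d", "features": (30, 1)},
]

# Placeholder score: retweets received. This is an assumption, not the paper's heuristic.
candidates, seeds = select_seeds(
    users, centroids, score=lambda u: u["features"][1], k=1
)
```

Because the heuristic only ranks the predicted candidates, the cost of the selection step depends on the size of the reduced set rather than on the whole dataset. With real data the features would normally be scaled before computing distances.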
Keywords
Influence Maximization, Twitter, Social Networks, Microblog, Classification
