Using Machine Learning Algorithms to Analyze Customer Churn in the Software as a Service (SaaS) Industry
Year 2022,
Volume: 10 Issue: 3, 115 - 123, 30.09.2022
Levent Çallı, Sena Kasım
Abstract
In industries with intense competition, companies must retain their customers and maintain long-term relationships with them. The literature defines customer churn analysis as identifying customers who are likely to leave a company so that appropriate marketing measures can be taken. While customer churn research is prevalent in B2C (business-to-customer) business models such as the telecom and retail sectors, customer churn analysis in B2B (business-to-business) models is still an emerging topic. In this regard, the study carried out a customer churn analysis for an ERP (enterprise resource planning) company with a software as a service (SaaS) business model. Ten features, determined by feature selection methods and expert opinion, were analyzed with different machine learning algorithms. According to the analysis results, the random forest algorithm gave the best result. Additionally, the number of products and customer features were observed to have a relatively higher weight in predicting churners.
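The workflow summarized in the abstract (comparing several classifiers on ten features, then inspecting which features weigh most in the random forest) can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: scikit-learn, the synthetic data, the placeholder feature names, the set of comparison models, and the ROC AUC metric are all assumptions made here, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for the customer data, with ten features
# and an imbalanced churn label (about 20% churners).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, n_redundant=2,
    weights=[0.8, 0.2], random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

# Assumption: an illustrative set of candidate algorithms.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Compare the models with 5-fold stratified cross-validation (mean ROC AUC).
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} mean ROC AUC = {score:.3f}")

# Fit the random forest on all data and rank features by impurity-based importance.
forest = models["random_forest"].fit(X, y)
importances = dict(zip(feature_names, forest.feature_importances_))
for name, weight in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} importance = {weight:.3f}")
```

On real data, the ranking of models and features depends on the dataset, so the output of this sketch says nothing about the study's reported results.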