Sales History-based Demand Prediction using Generalized Linear Models
Abstract
It is vital for commercial enterprises to predict demand accurately from their existing sales data. Such predictive analytics is a crucial part of their decision support systems and helps increase the profitability of the company.
In predictive data analytics, regression modeling is used to predict a numerical response variable such as sale amount. Within regression, linear models are simple and easy to interpret, yet they generalize to a powerful and flexible family of models called generalized linear models (GLM). The generalization over simple linear regression is twofold. First, GLM relax the assumption of normally distributed error terms. Second, the relationship between the predictor variables and the response variable can be represented by a choice of link functions rather than by the identity function alone.
This work models the sales amount prediction problem with GLM. A unique set of company sales data is explored, and the response variable, sale amount, is fitted to the Gamma distribution. The inverse link function, which is the canonical link for a Gamma-distributed response variable, is then used. The experimental results are compared with those of other regression models and of classification algorithms. Model selection is performed using the MSE and AIC metrics. The results show that GLM performs better than linear regression. In the comparison with classification algorithms, Random Forest and GLM are the top performers. Moreover, categorizing the predictor variables improves the model fit significantly.
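The two generalizations can be written side by side. The notation below is an assumption made for illustration and is not taken from the paper: $\mathbf{x}_i$ is the predictor vector, $\boldsymbol{\beta}$ the coefficient vector, $\mu_i$ the mean of the response $y_i$, and $g$ the link function.

```latex
\begin{aligned}
\text{Linear regression:}\quad & y_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \\
\text{GLM:}\quad & g(\mu_i) = \mathbf{x}_i^{\top}\boldsymbol{\beta},
  \qquad \mu_i = \mathbb{E}[y_i],
  \qquad y_i \sim \text{an exponential-family distribution}
\end{aligned}
```

Choosing the identity link $g(\mu) = \mu$ together with a normal response recovers ordinary linear regression as a special case.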
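As a rough illustration of the modeling step described above, the following sketch fits a Gamma GLM with the inverse link by iteratively reweighted least squares (IRLS) and compares its MSE with an ordinary least-squares baseline. It is not the authors' implementation: the data are synthetic, and the function name `fit_gamma_glm_inverse`, the coefficients in `true_beta`, the sample size, and the Gamma shape parameter are all assumptions chosen for the example. AIC is omitted because it would also require an estimate of the dispersion parameter.

```python
import numpy as np


def fit_gamma_glm_inverse(X, y, n_iter=50, tol=1e-10):
    """Fit a Gamma GLM with the inverse link, 1/mu = X @ beta, by IRLS.

    Hypothetical helper written for this example only.
    """
    mu = (y + y.mean()) / 2.0            # positive starting values for the mean
    eta = 1.0 / mu                       # linear predictor under the inverse link
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Working response: eta + (y - mu) * g'(mu), with g'(mu) = -1 / mu^2
        z = eta - (y - mu) / mu**2
        # IRLS weights: 1 / (V(mu) * g'(mu)^2), with V(mu) = mu^2 for the Gamma family
        w = mu**2
        XtW = X.T * w                    # X^T W without building the diagonal matrix
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        # Guard (an assumption of this sketch): keep the linear predictor positive
        eta = np.clip(X @ beta_new, 1e-8, None)
        mu = 1.0 / eta
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Synthetic "sale amount" data: Gamma-distributed with mean 1 / (0.5 + 1.0 * x)
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])     # intercept plus one predictor
true_beta = np.array([0.5, 1.0])         # assumed coefficients on the inverse scale
mu_true = 1.0 / (X @ true_beta)
shape = 5.0                              # assumed Gamma shape parameter
y = rng.gamma(shape, mu_true / shape)    # mean = shape * scale = mu_true

# Gamma GLM with inverse link
beta_glm = fit_gamma_glm_inverse(X, y)
mu_glm = 1.0 / (X @ beta_glm)
mse_glm = np.mean((y - mu_glm) ** 2)

# Linear regression baseline (identity link, least squares)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
mse_ols = np.mean((y - X @ beta_ols) ** 2)

print("GLM coefficients (inverse scale):", beta_glm)
print("MSE, Gamma GLM:", mse_glm)
print("MSE, linear regression:", mse_ols)
```

In practice a statistics library would normally be used in place of the hand-written IRLS loop, which appears here only to make the role of the link function and the variance function explicit.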
Details
Primary Language
English
Subjects
Engineering
Section
Research Article
Authors
Başar Özenboy
0000-0001-9809-7354
Türkiye
Selma Tekir
*
0000-0002-0488-9682
Türkiye
Publication Date
December 25, 2019
Submission Date
April 28, 2019
Acceptance Date
September 18, 2019
Published in Issue
Year 2019 Volume: 23 Issue: 3