Research Article

Hanehalkı Tüketim Harcamalarının Mikroekonometrik Analizi: LAD-LASSO Yöntemi

Year 2020, Issue: 33, 13 - 31, 15.01.2021
https://doi.org/10.26650/ekoist.2020.33.843564

Abstract

The aim of this study is to examine how supervised machine learning methods help select the relevant variables in the Household Budget Survey household dataset, which contains outliers and long-tailed errors, and to identify the model with the best estimation and forecasting performance for Turkey's household consumption expenditures. To this end, the 2018 Household Budget Survey household dataset for Turkey was analyzed using the classical regression method alongside the Least Absolute Deviation (LAD), Least Absolute Shrinkage and Selection Operator (LASSO), and LAD-LASSO methods, and the estimation and forecasting performances of the methods were compared. The results indicate that LAD-LASSO, a machine learning method that yields robust estimators in the presence of long-tailed errors while also allowing variable selection, was the most successful method in terms of estimation performance and forecast accuracy. In addition, some basic variables such as income, savings, and household size increase household consumption expenditures in all models. Beyond these, variables such as room structure, kitchen and bathroom flooring, heating and air-conditioning preferences, energy sources used, ownership of a detached house, apartment, summer house, or vineyard, investment preferences, credit card use, and internet shopping habits were selected as determinants of household consumption expenditures in the LAD-LASSO model. The results provide evidence that machine learning algorithms can be used to select the necessary variables when building microeconometric models. This study is derived from a doctoral thesis.

References

  • Ahrens, A.; Hansen, C. B. & Schaffer, M. E. (2019). “Lassopack: Model Selection and Prediction with Regularized Regression in Stata”, IZA Institute of Labor Economics, IZA DP No. 12081.
  • Andini, M.; Ciani, E.; De Blasio, G.; D’Ignazio, A. & Salvestrini, V. (2018). “Targeting with Machine Learning: An Application to a Tax Rebate Program in Italy”, Journal of Economic Behavior and Organization, 156: 86–102.
  • Arthanari, T. S. & Dodge, Y. (1993). Mathematical Programming in Statistics. John Wiley & Sons, Inc., New York.
  • Azzopardi, D.; Fareed, F.; Lenain, P. & Sutherland, D. (2019). “Assessing Household Financial Vulnerability: Empirical Evidence from the U.S. using Machine Learning”, OECD Economic Survey of the United States: Key Research Findings 2019: 121-142.
  • Birkes, D. & Dodge, Y. (1993). Alternative Methods of Regression. John Wiley & Sons, Inc., New York.
  • Breusch, T. S. & Pagan, A. R. (1979). “A Simple Test for Heteroskedasticity and Random Coefficient Variation”, Econometrica, 47(5): 1287-1294.
  • Cook, R. D. & Weisberg, S. (1983). “Diagnostics for Heteroskedasticity in Regression”, Biometrika, 70(1): 1-10.
  • Çalmaşur, G. & Kılıç, A. (2018). “Türkiye’de Hanehalkı Tüketim Harcamalarının Analizi”, ETÜ Sosyal Bilimler Enstitüsü Dergisi, 5: 61-73.
  • Dodge, Y. (1997). “LAD Regression for Detecting Outliers in Response and Explanatory Variables”, Journal of Multivariate Analysis, 61: 144-158.
  • Gaffney, R. & Kirkby, R. (2018). “Machine Learning the Consumption Function”, EEA-ESEM Cologne 2018 Conference. https://editorialexpress.com/conference/EEAESEM2018/program/EEAESEM2018 (Accessed: 15.07.2020).
  • Hampel, F. R.; Ronchetti, E. M.; Rousseeuw, P. J. & Stahel, W. A. (2005). Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons, Inc., New York.
  • Kolmogorov, A. (1933). “Sulla Determinazione Empirica di una Legge di Distribuzione”, G. Ist. Ital. Attuari, 4: 83-91.
  • Mian, A.; Rao, K. & Sufi, A. (2013). “Household Balance Sheets, Consumption, and the Economic Slump”, The Quarterly Journal of Economics, 128: 1687–1726.
  • Obrizan, M.; Torosyan, K. & Pignatti, R. (2019). “Tobacco Spending in Georgia: Machine Learning Approach”, ICDSIAI 2018: Recent Developments in Data Science and Intelligent Analysis of Information, 103-114.
  • Önder, K. & Turgut, H. (2018). “Examination of the Factors Affecting Household Rental Housing Demand Through Data Mining: The Case of Turkey”, Eskişehir Osmangazi Üniversitesi İİBF Dergisi, 13(2): 227-238.
  • Pedregosa, F. (2016). “Hyperparameter Optimization with Approximate Gradient”, 33rd ICML, New York, 2016 (Eds. M. F. Balcan and K. Q. Weinberger). Proceedings of Machine Learning Research, 48: 737-746.
  • Rao, C. R. (1973). Linear Statistical Inference and its Applications. 2nd ed., John Wiley & Sons, Inc., Canada.
  • Sec, R. & Zemcik, P. (2007). “The Impact Of Mortgages, House Prices And Rents On Household Consumption In The Czech Republic”, CERGE-EI Discussion Paper, 2007–2185.
  • Selim, S. & Demirkıran, E. (2020). “Türkiye’de Hanehalkı Gıda Harcamalarını Etkileyen Sosyo-Ekonomik Faktörler: Karşılaştırmalı Bir Analiz”, Hacettepe Üniversitesi İktisadi ve İdari Bilimler Fakültesi Dergisi, 38(2): 297-321.
  • Shapiro, S. S. & Wilk, M. B. (1965). “An Analysis of Variance Test for Normality (Complete Samples)”, Biometrika, 52(3/4): 591-611.
  • Shi, P. & Tsai, C. L. (2002). “Regression Model Selection a Residual Likelihood Approach”, J. R. Statist. Soc. B, 64: 237-252.
  • Showers, V. E. & Shotick, J. A. (1994). “The Effects of Household Characteristics on Demand for Insurance: A Tobit Analysis”, The Journal of Risk and Insurance, 61(3): 492-502.
  • Smirnov, N. (1948). “Table for Estimating the Goodness of Fit of Empirical Distributions”, Annals of Mathematical Statistics, 19(2): 279-281.
  • Tibshirani, R. (1996). “Regression Shrinkage and Selection via the Lasso”, Journal of the Royal Statistical Society, 58: 267-288.
  • TUİK (2018). Hanehalkı Bütçe İstatistikleri Mikro Veri Seti, 2018, Metaveri, Amaç. İstanbul.
  • Wang, H.; Li, G. & Jiang, G. (2007). “Robust Regression Shrinkage and Consistent Variable Selection Through the LAD-Lasso”, Journal of Business & Economic Statistics, 25: 347-355.
  • Varlamova, J. & Larionova, N. (2015). “Macroeconomic and Demographic Determinants of Household Expenditures in OECD Countries”, Procedia Economics and Finance, 24: 727-733.
  • Ylvisaker, D. (1977). “Test Resistance”, Journal of the American Statistical Association, 72(359): 551-556.

Microeconometric Analysis of Household Consumption Expenditures: LAD-LASSO Method


Abstract

This study examined how supervised machine learning methods help select the relevant variables of a Household Budget Survey Consumption Expenditures dataset with outliers and long-tailed errors, with the goal of identifying the model with the best prediction and forecasting performance for household consumption expenditures. To achieve this, the Household Budget Survey Consumption Expenditures dataset of Turkey for 2018 was examined using the classical regression method together with the Least Absolute Deviation (LAD), Least Absolute Shrinkage and Selection Operator (LASSO), and LAD-LASSO methods, and the prediction and forecasting performances of the methods were compared. According to the results of the analysis, the LAD-LASSO machine learning method, which enables the selection of variables while obtaining robust estimators in the presence of long-tailed errors, was the most successful method in prediction performance and forecasting accuracy. Additionally, several fundamental variables such as income, saving, and household size increase household consumption expenditures in all models. In addition to these variables, other variables including the structure of a room, kitchen and bathroom floors, heating and air conditioning preferences, energy sources used, ownership of a detached house, apartment, cottage, or vineyard, investment preferences, credit card usage, and internet shopping habits were selected as determinants of household consumption expenditures in the LAD-LASSO model. The results of the study indicate that machine learning algorithms can be used to select the most appropriate variables when constructing microeconometric models.
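For readers unfamiliar with the estimators compared in the abstract, their objective functions can be sketched as follows. This is the standard textbook form and not necessarily the exact parameterization used in the article: the symbols y_i (household consumption expenditure), x_i (vector of candidate explanatory variables), β (coefficients), n (sample size), p (number of candidate variables), and λ (tuning parameter) are notation assumed here, and the article may scale the penalty differently or use coefficient-specific penalties.

```latex
\begin{align}
\hat{\beta}_{\mathrm{OLS}}
  &= \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^{2} \\
\hat{\beta}_{\mathrm{LAD}}
  &= \arg\min_{\beta} \sum_{i=1}^{n} \left| y_i - x_i^{\top}\beta \right| \\
\hat{\beta}_{\mathrm{LASSO}}
  &= \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^{2}
     + \lambda \sum_{j=1}^{p} \left| \beta_j \right| \\
\hat{\beta}_{\mathrm{LAD\text{-}LASSO}}
  &= \arg\min_{\beta} \sum_{i=1}^{n} \left| y_i - x_i^{\top}\beta \right|
     + \lambda \sum_{j=1}^{p} \left| \beta_j \right|
\end{align}
```

Replacing the squared loss with the absolute loss limits the influence of outlying observations, and the L1 penalty can shrink some coefficients exactly to zero. LAD-LASSO combines the two, which is why it offers both robustness and variable selection.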

There are 28 citations in total.

Details

Primary Language: Turkish
Journal Section: Articles
Authors

Kadriye Hilal Topal 0000-0001-5203-8017

Ebru Çağlayan Akay 0000-0002-9998-5334

Publication Date: January 15, 2021
Submission Date: December 19, 2020
Published in Issue: Year 2020, Issue: 33

Cite

APA: Topal, K. H., & Çağlayan Akay, E. (2021). Hanehalkı Tüketim Harcamalarının Mikroekonometrik Analizi: LAD-LASSO Yöntemi. EKOIST Journal of Econometrics and Statistics, (33), 13-31. https://doi.org/10.26650/ekoist.2020.33.843564