Research Article

A Comparison of Imputation Methods In CPI Calculations Used by IWGPS Organizations and Imputation Methods of robust cellwise outlier and missing data

Year 2023, Volume: 5 Issue: 2, 196 - 228, 31.12.2023
https://doi.org/10.51541/nicel.1307183

Abstract

This study discusses the imputation methods that IWGPS (*) organizations use for missing data in CPI (Consumer Price Index) calculations. As technological devices have developed, demand has grown for collecting data in the field and producing statistics from it immediately; methods that meet this demand and can be adapted to the data collection tools of statistical offices are proposed. The immediate-imputation advantages of the proposed methods are described, and their imputation results are compared both with the results of the methods used in current practice and with the results of cellwise outlier and missing data imputation in a statistical programming language. The proposed method3(i_m üd19), meant to assist the statistics-derived imputation tools used in CPI calculation worldwide, is intended to provide convenience to all users. It can also be regarded as a common weighted imputation method for both the cellwise outlier case and the missing data case.

References

  • Acuna, E. and Rodriguez, C. (2004), The treatment of missing values and its effect in the classifier accuracy. In D. Banks, L. House, F.R. Mc Morris, P. Arabie, W. Gaul (Eds). Classification, Clustering and Data Mining Applications. Springer-Verlag, 639-648, Berlin Heidelberg.
  • Allison, P. D. (2001), Missing Data. Sage Publications, Inc, (Quantitative Applications in the Social Sciences), 104, Pennsylvania, USA.
  • Anonymous (2020), Consumer price index manual concepts and methods. Identifiers: ISBN 978-1-51354-298-0. https://www.ilo.org› publication › wcms_761444. Accessed: 15.01.2023.
  • Anonymous (2023), Unit of measure: Monthly rate of change. https://ec.europa.eu/eurostat/databrowser/view/PRC_IPC_G20__custom_4733163/default/table?lang=en Data extracted on 31/01/2023 13:34:36 from [ESTAT] G20 CPI all-items - Group of Twenty - Consumer price index [PRC_IPC_G20__custom_4733163]. Accessed: 31.01.2023.
  • Barnett, V. and Lewis, T. (1978), Outliers in statistical data. John Wiley and Sons, 376, New York.
  • Beckman, R. J. and Cook, R. D. (1983), Outlier……….s. Technometrics, 25 (2), 119-149.
  • Bernoulli, D. (1777), The most probable choice between several discrepant observations and the formation therefrom of the most likely induction, Reprinted in Biometrika, 48, 1-18 (1961, translated by C. G. Allen), London.
  • Chauvenet, W. (1863), Method of least squares, appendix to manual of spherical and practical astronomy. Vol 2. Lippincott, 469-566, 539-599, Philadelphia. Reprinted (1960) -5th edn. Dover, New York.
  • Çil, B. (1990), Regresyon analizinde tek bir sapan değerin “outlier’ın” belirlenmesine ilişkin metodların mukayesesi. Doctoral thesis, Ankara Üniversitesi Fen Bilimleri Enstitüsü, Ankara, Turkey.
  • Glaisher, J. W. L. (1872-73), On the rejection of discordant observations, Monthly Notices of the Royal Astronomical Society, 33, 391-402.
  • İnal, C. and Günay, S. (1993), Olasılık ve matematiksel istatistik, Hacettepe Üniversitesi Fen Fakültesi Beytepe Basımevi, 339-349, Ankara.
  • Hu, M. and Salvucci, S. (2001), A study of imputation algorithms, 122. Working Paper No. 2001-17, Washington, DC.
  • Huber, P. J. (1964), Robust estimation of a location parameter. Ann. Math. Statist. 35(1), 73-101.
  • Huber, P. J. (1981), Robust Statistics. John Wiley and Sons, 320, New York.
  • Little, R. J. A. and Rubin, D. B. (1987), Statistical analysis with missing data, (1st Ed.), John Wiley & Sons, 291, New York.
  • Little, R. J. A. and Rubin, D. B. (2002), Statistical analysis with missing data, (2nd Ed.), John Wiley & Sons, 409, New Jersey.
  • Molnar, F. J., Hutton, B. and Fergusson, D. (2008), Does analysis using "last observation carried forward" introduce bias in dementia research? Canadian Medical Association Journal, 179 (8), 751–753.
  • Newcomb, S. (1886), A generalized theory of the combination of observations so as to obtain the best result, American Journal of Mathematics, 8 (4), 343-366.
  • Osborne, J. W. (2013), Best practices in data cleaning, Sage Publications, Inc, 275, California.
  • Peng, Liu and Lei, L.A. (2005), A review of missing data treatment methods. Shanghai University of Finance and Economics, 8, Shanghai, P. R. China.
  • Rousseeuw, P. J. and Leroy, A. M. (1987), Robust regression and outlier detection, John Wiley & Sons, Inc, 341, Canada. ISBN 0471-85233-3.
  • Rubin, D. B. (1976), Inference and missing data (with discussion), Biometrika, 63, 581–592.
  • Schafer, J. L. (1999), Multiple imputation: A primer, Statistical Methods in Medical Research, 8(1), 3-15.
  • Stone, E. J. (1868), On the rejection of discordant observations, Monthly Notices of the Royal Astronomical Society, 28, 165-168.
  • Stone, E. J. (1873), On the rejection of discordant observations, Monthly Notices of the Royal Astronomical Society, 34, 9-15.
  • Student (1927), Errors of routine analysis, Biometrika, 19, 151–164.
  • Tabachnick, B. and Fidell, L. (1996), Using multivariate statistics (8th ed.). Pearson Education, 1018, USA.
  • Wright, T. W. (1884), A Treatise on the adjustment of observations by the method of least squares. Van Nostrand, 298, New York.

IWGPS Kuruluşlarının Kullandığı Tüfe Hesaplamalarındaki İmputasyon Yöntemleri İle Sağlam Hücresel Aykırı Değer Ve Kayıp Veri İmputasyon Yöntemlerinin Bir Karşılaştırması

Abstract

Bu çalışmada IWGPS (*) kuruluşları tarafından kayıp veri durumunda TÜFE (Tüketici Fiyat Endeksi) hesaplamalarında kullanılan imputasyon yöntemleri ele alınmaktadır. Teknolojik cihazların gelişime bağlı olarak, istatistik ofislerinin veri derleme araçlarına uyarlanabilecek şekilde, anında alandan veri derleme ve istatistik üretme talebine uygun yöntemler önerilmiştir. Önerilen yöntemlerin anında imputasyon avantajlarından bahsedilmekle birlikte öneri imputasyon sonuçları, mevcut uygulamada kullanılan yöntem sonuçlarıyla ve istatistik paket programındaki hücresel aykırı değer ve kayıp veri imputasyon sonuçları ile karşılaştırılmıştır. Tüm dünyada TÜFE hesaplamasında kullanılan ve istatistiklerden üretilen imputasyon araçlarına yardımcı olmak için önerilen yöntem3(i_m üd19) ün, tüm kullanıcılara kolaylık sağlaması amaçlanmıştır. Hem hücresel aykırı değer hem de kayıp veri durumu için ortak bir ağırlıklı imputasyon yöntemi olarak da düşünülebilir.

There are 28 citations in total.

Details

Primary Language English
Subjects Statistics
Journal Section Articles
Authors

Elif Şen 0009-0008-0267-9287

Olcay Arslan 0000-0002-7067-4997

Publication Date December 31, 2023
Published in Issue Year 2023 Volume: 5 Issue: 2

Cite

APA Şen, E., & Arslan, O. (2023). A Comparison of Imputation Methods In CPI Calculations Used by IWGPS Organizations and Imputation Methods of robust cellwise outlier and missing data. Nicel Bilimler Dergisi, 5(2), 196-228. https://doi.org/10.51541/nicel.1307183