ESTIMATING THE MISSING VALUE IN ONE-WAY ANOVA UNDER LONG-TAILED SYMMETRIC ERROR DISTRIBUTIONS
Year 2018, Volume: 36, Issue: 2, 523-538, 01.06.2018
Demet Aydın
Birdal Şenoğlu
Abstract
In practice, missing values are common and create serious problems in almost all statistical analyses. In this study, we propose estimators for a missing value in one-way analysis of variance (ANOVA) when the error terms follow a long-tailed symmetric (LTS) distribution. We estimate the missing value using the maximum likelihood (ML), modified maximum likelihood (MML), and least squares (LS) methodologies. The expectation-maximization (EM) algorithm is used to compute the ML estimate of the missing value. We compare the efficiencies of the LS, ML, and MML estimators of the missing value via a Monte Carlo simulation study. The simulation results show that the ML estimator of the missing value is the most efficient of the three. The usefulness of the proposed estimators is illustrated with a peak discharge data example from civil engineering.
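As a point of reference for the LS approach mentioned in the abstract, the following minimal Python sketch illustrates the textbook result that, in one-way ANOVA, the value minimizing the error sum of squares for a single missing observation is the mean of the observed values in the same treatment group. The function names and the data are hypothetical and chosen only for illustration. The sketch does not implement the ML or MML estimators under LTS errors proposed in the paper.

```python
import numpy as np


def ls_missing_value(groups, missing_group):
    """LS estimate of one missing observation in one-way ANOVA.

    groups: list of sequences with the observed values of each treatment.
    missing_group: index of the treatment that lost an observation.
    Under LS the estimate equals the mean of the observed values
    in that treatment.
    """
    observed = np.asarray(groups[missing_group], dtype=float)
    if observed.size == 0:
        raise ValueError("the treatment with the missing value has no observed data")
    return float(observed.mean())


def error_sum_of_squares(groups):
    """Within-treatment (error) sum of squares of a one-way layout."""
    return sum(
        float(np.sum((np.asarray(g, dtype=float) - np.mean(g)) ** 2))
        for g in groups
    )


def sse_with_fill(groups, missing_group, value):
    """Error sum of squares after filling the missing cell with `value`."""
    filled = [list(g) for g in groups]
    filled[missing_group].append(value)
    return error_sum_of_squares(filled)


# Hypothetical data: three treatments, one observation missing in treatment 1.
groups = [[10.0, 12.0, 11.0], [14.0, 15.0], [9.0, 8.0, 10.0]]
x_hat = ls_missing_value(groups, missing_group=1)

# Filling with x_hat gives an error sum of squares no larger than nearby values.
sse_at_hat = sse_with_fill(groups, 1, x_hat)
sse_above = sse_with_fill(groups, 1, x_hat + 0.5)
sse_below = sse_with_fill(groups, 1, x_hat - 0.5)
print(x_hat, sse_at_hat, sse_above, sse_below)
```

Because the LS estimate depends only on the group mean, it is sensitive to outlying observations, which is the usual motivation for considering likelihood-based alternatives when the errors are long-tailed.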