Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study

Mehmet Tahir Huyut

doi:10.21597/jist.1846196

Research Article

Normallik Testinde Örneklem Büyüklüğünün Yeterliliğine İlişkin Ampirik ve Asimptotik Perspektifler: Bir Monte Carlo Çalışması

Year 2026, Volume: 16 Issue: 1, 127 - 140, 01.03.2026

https://izlik.org/JA98DP98BX

Abstract

Normality tests are widely used in statistical practice, but their behavior is usually assessed with asymptotic critical values that may be unreliable in finite samples. In particular, how sample size interacts with critical value calibration and distributional structure to shape empirical power is not well understood. A large-scale Monte Carlo simulation study of sixteen widely used normality tests was conducted to investigate how empirical and asymptotic critical values evolve with sample size, and how this evolution governs the empirical power of univariate normality tests under symmetric and asymmetric departures from normality. Empirical and asymptotic critical values were evaluated at sample sizes n=25, 50, 100 and 500, together with the asymptotic reference value. Empirical power was assessed at significance levels α=0.05 and α=0.10, and results were summarized by averaging over structurally similar symmetric and asymmetric alternative distributions. At small and moderate sample sizes, substantial differences between empirical and asymptotic critical values were observed for several tests. These differences translated directly into heterogeneous power behavior. Under symmetric alternatives, many tests showed rapid power gains up to moderate sample sizes, followed by marked saturation. In contrast, asymmetric alternatives showed delayed power accumulation, with meaningful gains continuing at larger sample sizes. Raising the significance level increased power uniformly but did not change the relative ranking of the tests. Sample size effects in normality testing depend strongly on the distribution and cannot be adequately captured by asymptotic theory alone. Moderate samples may suffice to detect symmetric departures, whereas asymmetric departures require larger samples to achieve reliable power. These findings underscore the importance of finite-sample considerations in normality testing and provide a mechanistic basis for more informed test selection.
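The comparison of empirical and asymptotic critical values described in the abstract can be illustrated with a minimal sketch. It assumes the Jarque-Bera statistic as a stand-in, because its asymptotic chi-square(2) quantile has a closed form; the abstract does not name the sixteen tests, and the replication count, seed, and function names below are illustrative choices, not the study's settings.

```python
import math
import random


def jarque_bera(sample):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments with divisor n (the usual convention for this statistic).
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def asymptotic_critical_value(alpha):
    """Upper-alpha quantile of chi-square(2); its survival function is exp(-x/2)."""
    return -2.0 * math.log(alpha)


def empirical_critical_value(n, alpha, reps, rng):
    """Monte Carlo (1 - alpha) quantile of the statistic under N(0, 1) samples."""
    stats = sorted(
        jarque_bera([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(reps)
    )
    # Order statistic at rank ceil((1 - alpha) * reps), as a 0-based index.
    rank = math.ceil((1.0 - alpha) * reps)
    return stats[min(reps, rank) - 1]


if __name__ == "__main__":
    rng = random.Random(2026)  # arbitrary seed, for repeatability only
    asym = asymptotic_critical_value(0.05)
    for n in (25, 50, 100, 500):  # sample sizes listed in the abstract
        emp = empirical_critical_value(n, alpha=0.05, reps=1000, rng=rng)
        print(f"n={n:>3}  empirical={emp:.3f}  asymptotic={asym:.3f}")
```

For this particular statistic the small-sample empirical value at α=0.05 tends to sit below the chi-square value and to approach it slowly as n grows. Other tests can deviate in the opposite direction, so the sketch shows the mechanism rather than the study's results.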

Keywords

Normality tests, Finite-sample effects, Empirical and asymptotic critical values, Sample size adequacy, Power saturation

Abstract

Normality tests are widely used in statistical practice; however, their finite-sample behavior, which is shaped by the interaction between sample size, critical value calibration, and distributional structure, remains insufficiently understood. This study investigates how sample size governs the reliability of empirical and asymptotic critical values and, in turn, shapes the empirical power of normality tests under symmetric and asymmetric departures from normality. A large-scale Monte Carlo simulation study was conducted for sixteen widely used normality tests. Empirical and asymptotic critical values were evaluated at sample sizes n=25, 50, 100 and 500, together with the asymptotic benchmark. Empirical power was assessed at significance levels α=0.05 and α=0.10, with results summarized by averaging across structurally similar symmetric and asymmetric alternative distributions. Substantial discrepancies between empirical and asymptotic critical values were observed for several tests at small and moderate sample sizes. These discrepancies translated directly into heterogeneous power behavior. Under symmetric alternatives, many tests exhibited rapid power gains up to moderate sample sizes, followed by clear saturation. In contrast, asymmetric alternatives showed delayed power accumulation, with meaningful gains persisting at larger sample sizes. Increasing the significance level increased power uniformly but did not alter relative test rankings. Sample size effects in normality testing are strongly distribution-dependent and cannot be adequately captured by asymptotic theory alone. Moderate samples may suffice for detecting symmetric deviations, whereas asymmetric departures require larger samples to achieve reliable power. These findings underscore the importance of finite-sample considerations in normality testing and provide a mechanistic basis for more informed test selection.
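Empirical power, as used in the abstract, is the share of samples drawn from a non-normal alternative that a test rejects. The sketch below estimates it with a size-calibrated (Monte Carlo) critical value. Everything specific in it is an assumption for illustration: the Jarque-Bera statistic, the Laplace distribution as the symmetric alternative, the exponential distribution as the asymmetric one, and the replication counts. The abstract does not list the study's alternatives, so the numbers this toy produces need not match the reported patterns.

```python
import math
import random


def jarque_bera(sample):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return n / 6.0 * ((m3 / m2 ** 1.5) ** 2 + (m4 / m2 ** 2 - 3.0) ** 2 / 4.0)


def critical_value(n, alpha, reps, rng):
    """Empirical (1 - alpha) quantile of the statistic under normal samples."""
    stats = sorted(
        jarque_bera([rng.gauss(0.0, 1.0) for _ in range(n)]) for _ in range(reps)
    )
    return stats[min(reps, math.ceil((1.0 - alpha) * reps)) - 1]


def empirical_power(draw, n, crit, reps, rng):
    """Fraction of samples from `draw` whose statistic exceeds `crit`."""
    rejections = sum(
        jarque_bera([draw(rng) for _ in range(n)]) > crit for _ in range(reps)
    )
    return rejections / reps


def laplace(rng):
    """Symmetric, heavy-tailed: the difference of two unit exponentials."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def exponential(rng):
    """Asymmetric (right-skewed) alternative."""
    return rng.expovariate(1.0)


if __name__ == "__main__":
    rng = random.Random(2026)  # arbitrary seed, for repeatability only
    for n in (25, 50, 100, 500):  # sample sizes listed in the abstract
        crit = critical_value(n, alpha=0.05, reps=400, rng=rng)
        sym = empirical_power(laplace, n, crit, reps=400, rng=rng)
        asym = empirical_power(exponential, n, crit, reps=400, rng=rng)
        print(f"n={n:>3}  symmetric={sym:.2f}  asymmetric={asym:.2f}")
```

Calibrating the critical value by simulation at each n keeps the rejection rate under normality close to α, so differences in power across sample sizes are not confounded with size distortion from an asymptotic cutoff.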

Keywords

Normality tests, Finite-sample effects, Empirical and asymptotic critical values, Sample size adequacy, Monte Carlo simulation

Ethical Statement

Ethics approval was not required for this study, as it involves only simulation-based analyses using synthetic data generated under predefined statistical models.

References

Anderson, T. W., & Darling, D. A. (1952). Asymptotic theory of certain “goodness-of-fit” criteria based on stochastic processes. The Annals of Mathematical Statistics, 23(2), 193–212.
Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). New York, NY: Wiley.
Cox, D. R., & Hinkley, D. V. (1974). Theoretical statistics. London, England: Chapman & Hall.
Cramér, H. (1946). Mathematical methods of statistics. Princeton, NJ: Princeton University Press.
D’Agostino, R. B., & Stephens, M. A. (1986). Goodness-of-fit techniques. New York, NY: Marcel Dekker.
Darling, D. A. (1957). The Kolmogorov–Smirnov, Cramér–von Mises tests. The Annals of Mathematical Statistics, 28(4), 823–838.
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application. Cambridge, England: Cambridge University Press.
Epps, T. W., & Pulley, L. B. (1983). A test for normality based on the empirical characteristic function. Biometrika, 70(3), 723–726.
Filliben, J. J. (1975). The probability plot correlation coefficient test for normality. Technometrics, 17(1), 111–117.
Glen, A. G., & Leemis, L. M. (2004). Computational and graphical tools for analyzing probability distributions. Computational Statistics & Data Analysis, 46(2), 295–312.
Hosking, J. R. M. (1990). L-moments: Analysis and estimation of distributions using linear combinations of order statistics. Journal of the Royal Statistical Society: Series B, 52(1), 105–124.
Jarque, C. M., & Bera, A. K. (1987). A test for normality of observations and regression residuals. International Statistical Review, 55(2), 163–172.
Kendall, M. G., & Stuart, A. (1977). The advanced theory of statistics (Vol. 1). London, England: Griffin.
Lawless, J. F. (2003). Statistical models and methods for lifetime data (2nd ed.). Hoboken, NJ: Wiley.
Lehmann, E. L., & Romano, J. P. (2005). Testing statistical hypotheses (3rd ed.). New York, NY: Springer.
Lilliefors, H. W. (1967). On the Kolmogorov–Smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association, 62(318), 399–402.
Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro–Wilk, Kolmogorov–Smirnov, Lilliefors and Anderson–Darling tests. Journal of Statistical Modeling and Analytics, 2(1), 21–33.
Romão, X., Delgado, R., & Costa, A. (2010). An empirical power comparison of univariate goodness-of-fit tests for normality. Journal of Statistical Computation and Simulation, 80(5), 545–591.
Royston, P. (1982). An extension of Shapiro and Wilk’s W test for normality to large samples. Applied Statistics, 31(2), 115–124.
Serfling, R. J. (1980). Approximation theorems of mathematical statistics. New York, NY: Wiley.
Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3), 591–611.
Stephens, M. A. (1974). EDF statistics for goodness of fit and some comparisons. Journal of the American Statistical Association, 69(347), 730–737.
Stephens, M. A. (1976). Asymptotic results for goodness-of-fit statistics with unknown parameters. The Annals of Statistics, 4(2), 357–369.
Stephens, M. A. (1987). Goodness-of-fit techniques. New York, NY: Marcel Dekker.
Thode, H. C. (2002). Testing for normality. New York, NY: Marcel Dekker.
Wilcox, R. R. (2017). Introduction to robust estimation and hypothesis testing (4th ed.). San Diego, CA: Academic Press.
Yap, B. W., & Sim, C. H. (2011). Comparisons of various types of normality tests. Journal of Statistical Computation and Simulation, 81(12), 2141–2155.
Zhang, J., & Wu, Y. (2005). Likelihood-ratio tests for normality. Computational Statistics & Data Analysis, 49(3), 709–721.

There are 28 citations in total.

Details

Primary Language	English
Subjects	Bioengineering (Other)
Journal Section	Research Article
Authors	Mehmet Tahir Huyut 0000-0002-2564-991X
Submission Date	December 21, 2025
Acceptance Date	January 15, 2026
Publication Date	March 1, 2026
DOI	https://doi.org/10.21597/jist.1846196
IZ	https://izlik.org/JA98DP98BX
Published in Issue	Year 2026 Volume: 16 Issue: 1

Cite

APA	Huyut, M. T. (2026). Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study. Journal of the Institute of Science and Technology, 16(1), 127-140. https://doi.org/10.21597/jist.1846196
AMA	1. Huyut MT. Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study. J. Inst. Sci. and Tech. 2026;16(1):127-140. doi:10.21597/jist.1846196
Chicago	Huyut, Mehmet Tahir. 2026. “Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study”. Journal of the Institute of Science and Technology 16 (1): 127-40. https://doi.org/10.21597/jist.1846196.
EndNote	Huyut MT (March 1, 2026) Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study. Journal of the Institute of Science and Technology 16 1 127–140.
IEEE	[1] M. T. Huyut, “Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study”, J. Inst. Sci. and Tech., vol. 16, no. 1, pp. 127–140, Mar. 2026, doi: 10.21597/jist.1846196.
ISNAD	Huyut, Mehmet Tahir. “Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study”. Journal of the Institute of Science and Technology 16/1 (March 1, 2026): 127-140. https://doi.org/10.21597/jist.1846196.
JAMA	1. Huyut MT. Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study. J. Inst. Sci. and Tech. 2026;16:127–140.
MLA	Huyut, Mehmet Tahir. “Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study”. Journal of the Institute of Science and Technology, vol. 16, no. 1, Mar. 2026, pp. 127-40, doi:10.21597/jist.1846196.
Vancouver	1. Mehmet Tahir Huyut. Empirical and Asymptotic Perspectives on Sample Size Adequacy in Normality Testing: A Monte Carlo Study. J. Inst. Sci. and Tech. 2026 Mar. 1;16(1):127-40. doi:10.21597/jist.1846196
