A Robust Initial Basic Subset Selection Method for Outlier Detection Algorithms in Linear Regression
Abstract
The main motivation of this study is to develop an efficient algorithm for diagnosing and detecting outliers in linear regression up to a reasonable level of contamination. The algorithm first obtains a robust version of the hat matrix at the linear algebra level. The basic subset obtained in this first stage is then improved through concentration steps, as defined in the fast-LTS (Least Trimmed Squares) regression algorithm. The method can be plugged into another outlier detection algorithm as its basic subset selection stage. The algorithm is effective against outliers in both the X and Y directions at contamination rates of up to 25%. Its complexity increases linearly with the number of observations and parameters, and it is quite fast because it does not require iterative calculations. The success of the algorithm against a specific contamination level is demonstrated through simulations.
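The concentration-step refinement mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the robust hat-matrix stage is not described on this page, so the initial subset below comes from a hypothetical stand-in (the observations whose responses lie closest to the median response). The function names, the subset size `h`, and the simulated data (Y-direction outliers only) are all assumptions made for illustration.

```python
import numpy as np

def c_step(X, y, subset, h):
    """One concentration step: fit OLS on `subset`, then keep the h
    observations with the smallest absolute residuals over ALL data."""
    beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
    abs_res = np.abs(y - X @ beta)
    return np.sort(np.argsort(abs_res)[:h])

def concentrate(X, y, initial_subset, h, max_steps=20):
    """Repeat concentration steps until the subset stops changing."""
    subset = np.sort(np.asarray(initial_subset))
    for _ in range(max_steps):
        new_subset = c_step(X, y, subset, h)
        if np.array_equal(new_subset, subset):
            break
        subset = new_subset
    beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
    return subset, beta

# Simulated data (assumption): y = 5 + 2x + noise, with 20% Y-direction outliers.
rng = np.random.default_rng(0)
n, n_out = 100, 20
x = rng.uniform(0, 10, n)
y = 5.0 + 2.0 * x + rng.normal(0, 0.5, n)
y[:n_out] += 100.0                      # contaminate the first 20 responses
X = np.column_stack([np.ones(n), x])    # design matrix with intercept

# Hypothetical stand-in for the paper's robust hat-matrix stage:
# start from the 20 observations whose y is closest to the median of y.
initial = np.argsort(np.abs(y - np.median(y)))[:20]

h = 75                                  # keep 75% of the data (25% trimmed)
subset, beta = concentrate(X, y, initial, h)
flagged = np.setdiff1d(np.arange(n), subset)   # candidates for outliers
print("estimated coefficients:", beta)
print("observations outside the final subset:", flagged)
```

Observations left outside the final subset are only candidates for outliers. A downstream detection algorithm would normally test them formally, which is the role the abstract describes when it says the method can be plugged into another algorithm.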
Details
Primary Language
English
Subjects
Econometric and Statistical Methods
Journal Section
Research Article
Authors
Mehmet Hakan Satman
Early Pub Date
December 24, 2024
Publication Date
December 31, 2024
Submission Date
July 8, 2024
Acceptance Date
December 16, 2024
Published in Issue
Year: 2024, Number: 10
APA
Satman, M. H. (2024). A Robust Initial Basic Subset Selection Method for Outlier Detection Algorithms in Linear Regression. Journal of Statistics and Applied Sciences, 10, 76-85. https://doi.org/10.52693/jsas.1512794