Research Article

A Robust Initial Basic Subset Selection Method for Outlier Detection Algorithms in Linear Regression

Number: 10 December 31, 2024

Abstract

The main motivation of this study is to develop an efficient algorithm for diagnosing and detecting outliers in linear regression up to a reasonable level of contamination. The algorithm first computes a robust version of the hat matrix at the linear algebra level. The basic subset obtained in this first stage is then improved through concentration steps, as defined in the fast-LTS (Least Trimmed Squares) regression algorithm. The method can also be plugged into another algorithm as its basic subset selection stage. The algorithm is effective against outliers in both the X and Y directions at contamination rates of up to 25%. Its complexity increases linearly with the number of observations and parameters. The algorithm is fast because it does not require iterative calculations. Its success at a specific contamination level is demonstrated through simulations.
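
The two stages described above can be illustrated with a minimal sketch. The abstract does not say how the robust hat matrix is built, so this sketch substitutes the diagonal of the classical hat matrix as a stand-in for the initial subset; it is not the article's method. The function names, the synthetic data, and the subset size h are assumptions made only for illustration. The concentration step follows the usual fast-LTS idea: fit least squares on the current subset, then keep the h observations with the smallest squared residuals.

```python
import numpy as np


def hat_diagonal(X):
    """Leverages: diagonal of the classical hat matrix H = X (X'X)^{-1} X'."""
    Q, _ = np.linalg.qr(X)            # thin QR, so H = Q Q'
    return np.sum(Q * Q, axis=1)


def initial_subset(X, h):
    """Stand-in (assumption): the h observations with the smallest leverage."""
    return np.sort(np.argsort(hat_diagonal(X))[:h])


def ols(X, y, subset):
    """Least-squares coefficients fitted on the rows listed in `subset`."""
    beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
    return beta


def concentrate(X, y, subset, h, max_iter=50):
    """C-steps: refit on the subset, keep the h smallest squared residuals."""
    for _ in range(max_iter):
        resid2 = (y - X @ ols(X, y, subset)) ** 2
        new_subset = np.sort(np.argsort(resid2)[:h])
        if np.array_equal(new_subset, subset):
            break                      # subset is stable
        subset = new_subset
    return subset, ols(X, y, subset)


# Synthetic data (assumption): 80 clean points, 20 bad leverage points.
rng = np.random.default_rng(0)
n_clean, n_out = 80, 20
x = np.concatenate([rng.uniform(0, 10, n_clean), rng.normal(50, 1, n_out)])
y = np.concatenate([5 + 5 * x[:n_clean] + rng.normal(0, 1, n_clean),
                    rng.normal(0, 1, n_out)])
X = np.column_stack([np.ones_like(x), x])
outliers = np.arange(n_clean, n_clean + n_out)

h = 75                                 # subset size, i.e. 25% trimmed
subset, beta = concentrate(X, y, initial_subset(X, h), h)
flagged = np.setdiff1d(np.arange(len(y)), subset)
print("estimated coefficients:", beta)
print("all planted outliers excluded:", np.isin(outliers, flagged).all())
```

In this toy setting the planted points are isolated enough that classical leverage already separates them. Classical leverage can be masked when outliers cluster, which is the kind of case a robust hat matrix is meant to handle, so the stand-in should not be read as equivalent to the proposed first stage.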

Details

Primary Language

English

Subjects

Econometric and Statistical Methods

Journal Section

Research Article

Early Pub Date

December 24, 2024

Publication Date

December 31, 2024

Submission Date

July 8, 2024

Acceptance Date

December 16, 2024

Published in Issue

Year 2024 Number: 10

APA
Satman, M. H. (2024). A Robust Initial Basic Subset Selection Method for Outlier Detection Algorithms in Linear Regression. Journal of Statistics and Applied Sciences, 10, 76-85. https://doi.org/10.52693/jsas.1512794