A Comparative Evaluation of the Outlier Detection Methods
Abstract
In data mining, outliers should be identified and excluded from the data set before analysis so that descriptive statistics and other statistical model parameters are calculated correctly. This paper compares the performance of model-based, density-based, clustering-based, angle-based, and isolation-based outlier detection methods used in data mining. ROC curves and AUC values were used to compare the methods. A data set with a standard normal distribution, suitable for fitting a logistic regression model, was simulated. To compare the methods, the data set was modified by randomly adding 30 outliers. The iForest algorithm was found to have higher predictive power than Mahalanobis, LOF, k-means, and ABOD. In addition, outliers in a real data set were detected with the iForest algorithm and removed. The data sets with and without outliers were then compared. The results showed that the model fitted without outliers had higher predictive ability.
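The evaluation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not reproduce their results. It simulates standard normal data, adds 30 outliers, scores every point with the squared Mahalanobis distance (one of the compared methods), and computes the AUC from the known labels. The two-dimensional setting, the 300 inliers, the uniform box used to generate outliers, and the function names `simulate`, `mahalanobis_scores`, and `auc` are all assumptions made for illustration.

```python
import random


def simulate(n_inliers=300, n_outliers=30, seed=0):
    """Return 2-D points and labels (0 = inlier, 1 = outlier)."""
    rng = random.Random(seed)
    # Inliers: independent standard normal coordinates.
    points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_inliers)]
    labels = [0] * n_inliers
    # Outliers: ASSUMPTION, drawn uniformly from the box [-6, 6] x [-6, 6].
    for _ in range(n_outliers):
        points.append((rng.uniform(-6, 6), rng.uniform(-6, 6)))
        labels.append(1)
    return points, labels


def mahalanobis_scores(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    scores = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^T S^{-1} d using the closed-form inverse of a 2x2 matrix.
        scores.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return scores


def auc(scores, labels):
    """AUC as the probability that a random outlier outscores a random inlier."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Ties count as half a win (Mann-Whitney formulation of the AUC).
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


points, labels = simulate()
print("Mahalanobis AUC:", round(auc(mahalanobis_scores(points), labels), 3))
```

Any other detector that returns one outlier score per observation (for example LOF or an isolation forest) could be passed through the same `auc` function, which is what makes an AUC-based comparison across methods possible.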
Keywords
Supporting Institution
Cukurova University
Project Number
FDK-2018 10287
Ethical Statement
Ethical Consideration
Ethics committee approval was not required for this study because it did not involve animals or humans. The authors confirm that they have adhered to the ethical policies of the journal, as noted on the journal's author guidelines page.
Thanks
We gratefully thank Prof. Dr. Zeynel CEBECİ of Cukurova University for his contributions to this study.
We would like to thank Cukurova University Scientific Research Coordinatorship for supporting this study with project number FDK-2018 10287.
This article was produced from the thesis titled “Comparative Examination of Outlier Detection Methods in Binary Logistics Regression Analysis” at Cukurova University (thesis no: 794371). https://tez.yok.gov.tr/UlusalTezMerkezi/tezSorguSonucYeni.jsp.
Details
Primary Language
English
Subjects
Agricultural Engineering (Other)
Journal Section
Research Article
Authors
Early Pub Date
February 1, 2024
Publication Date
March 15, 2024
Submission Date
November 22, 2023
Acceptance Date
January 8, 2024
Published in Issue
Year 2024 Volume: 7 Number: 2
APA
Çelik Güney, M., & Kayaalp, G. T. (2024). A Comparative Evaluation of the Outlier Detection Methods. Black Sea Journal of Engineering and Science, 7(2), 155-159. https://doi.org/10.34248/bsengineering.1387431