A Study on Determination of Outlier Observations by Using Chi-Square Threshold Value
Abstract
Outlier observations are observations that deviate from the general tendency of
the other observations in a data set. They arise in situations such as faulty
measurement or incorrect data entry. Identifying them is important because the
results of statistical analyses, such as multiple regression analysis, can be
quite sensitive to them. Outlier observations are mostly determined by
distance-based, statistical test-based and density-based approaches. In this
study, the distance of each observation vector from the center of the data was
calculated as a Mahalanobis distance using the R program, and the distances
were compared with a chi-square threshold value. For this purpose, hematocrit
(htc), hemoglobin (hgb), mean platelet volume (mpv), platelet distribution
width (pdw), nonbacterial prostatitis (nbp) and pulse pressure values measured
in the blood of 315 heart patients were examined as the data set. As a result,
sixteen observations were identified as outliers. The results of this study are
expected to help researchers who need to identify outlier observations.
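The procedure summarized above can be sketched roughly as follows. This is a minimal illustration, not the authors' R code: the squared Mahalanobis distance d² = (x − m)ᵀ S⁻¹ (x − m) of each observation from the sample mean m is compared with a chi-square quantile whose degrees of freedom equal the number of variables. All function names are hypothetical, and the significance level `alpha` is an assumption, since the abstract does not state the cut-off used. To stay within the Python standard library, the chi-square quantile uses a closed form that is valid only for an even number of variables (the study's six features would satisfy this).

```python
import math
import random


def mean_vector(rows):
    """Column-wise sample mean."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, mean):
    """Sample covariance matrix (divisor n - 1)."""
    n, p = len(rows), len(mean)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j]
    return [[v / (n - 1) for v in r] for r in cov]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def squared_mahalanobis(row, mean, cov):
    """d^2 = (x - m)^T S^-1 (x - m), computed without forming S^-1."""
    d = [x - m for x, m in zip(row, mean)]
    return sum(di * zi for di, zi in zip(d, solve(cov, d)))


def chi2_cdf_even(x, df):
    """Chi-square CDF; this closed form holds only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_quantile_even(p, df):
    """Invert the CDF by bisection (even df only)."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def flag_outliers(rows, alpha=0.05):
    """Return (indices with d^2 above the threshold, threshold).

    alpha is an assumed significance level, not taken from the study.
    """
    mean = mean_vector(rows)
    cov = covariance_matrix(rows, mean)
    threshold = chi2_quantile_even(1.0 - alpha, len(mean))
    d2 = [squared_mahalanobis(r, mean, cov) for r in rows]
    return [i for i, v in enumerate(d2) if v > threshold], threshold


if __name__ == "__main__":
    # Synthetic two-variable data, NOT the study's patient data.
    random.seed(0)
    data = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
    data.append([50.0, -50.0])  # one deliberately extreme observation
    flagged, threshold = flag_outliers(data)
    print("threshold:", round(threshold, 3), "flagged rows:", flagged)
```

Because the mean and covariance are estimated from the same data that contain the outliers, extreme observations can inflate the covariance and partly mask one another; a robust estimate of center and scatter is a common alternative, though the abstract does not say whether one was used.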
Details
Primary Language
English
Subjects
Engineering
Journal Section
Research Article
Authors
Fahrettin Kaya
*
0000-0003-1666-4859
Türkiye
Esra Yavuz
Türkiye
Şeyma Koç
Türkiye
Ömer Faruk Karaokur
Türkiye
Publication Date
January 1, 2019
Submission Date
October 13, 2018
Acceptance Date
November 18, 2018
Published in Issue
Year 2019 Volume: 2 Number: 1