Research Article

Outlier Detection in Multiple Regression Models Using Genetic Algorithms and Bayesian Information Criteria

Volume: 6, Issue: 1, 15 July 2008
  • Özlem Gürünlü Alma *
  • Serdar Kurt
  • Aybars Uğur

Abstract

Statistical models, particularly regression models, are among the most useful tools for extracting and understanding the essential features of datasets. However, most real-world databases contain some abnormal values, generally termed outliers. Accurate identification of outliers plays a significant role in statistical analysis, especially in regression models. When classical statistical models are blindly applied to datasets containing outliers, the results can be misleading at best. Outliers can adversely affect the fit of multiple regression models. The aim of this study is to define an outlier detection method that uses Genetic Algorithms (GA) with the Bayesian Information Criterion (BIC) and to illustrate the algorithm with real and simulated data. The fitness function in this algorithm is based on BIC. A smaller value of the criterion indicates a model that fits the data better; the presence of one or more outliers degrades the regression model and results in larger BIC values.
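The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exact fitness function, GA operators, and parameter values of the paper are not given on this page, so everything below is an assumption. The sketch uses the common Gaussian-regression form BIC = n ln(RSS/n) + k ln(n) together with a mean-shift outlier model, in which every observation flagged as an outlier costs one extra parameter. The names `solve`, `bic`, `detect_outliers`, and the toy data are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def bic(X, y, mask):
    """BIC of a mean-shift outlier model (assumed form, not taken from the paper).

    mask[i] == 1 flags observation i as an outlier. Flagged points are left
    out of the least-squares fit and each one adds a parameter to the penalty.
    """
    n, p = len(y), len(X[0])
    keep = [i for i in range(n) if not mask[i]]
    if len(keep) <= p:  # too few points left to fit the regression
        return math.inf
    xtx = [[sum(X[i][a] * X[i][b] for i in keep) for b in range(p)] for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in keep) for a in range(p)]
    beta = solve(xtx, xty)
    rss = sum((y[i] - sum(bj * xj for bj, xj in zip(beta, X[i]))) ** 2 for i in keep)
    k = p + (n - len(keep))  # regression coefficients + one per flagged point
    return n * math.log(max(rss, 1e-12) / n) + k * math.log(n)

def detect_outliers(X, y, pop_size=30, generations=60, p_mut=0.05, seed=0):
    """Tiny GA over binary outlier masks; lower BIC means fitter chromosome."""
    rng = random.Random(seed)
    n = len(y)
    fit = lambda c: bic(X, y, c)
    # Start from the "no outliers" mask plus sparse random masks.
    pop = [[0] * n] + [[int(rng.random() < 0.1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fit)
        nxt = pop[:2]  # elitism: the two best masks survive unchanged
        while len(nxt) < pop_size:
            # Tournament selection of two parents, one-point crossover, bit-flip mutation.
            a, b = (min(rng.sample(pop, 3), key=fit) for _ in range(2))
            cut = rng.randrange(1, n)
            child = [g ^ (rng.random() < p_mut) for g in a[:cut] + b[cut:]]
            nxt.append(child)
        pop = nxt
    best = min(pop, key=fit)
    return best, fit(best)

# Toy data (hypothetical): a straight line with small noise and one planted outlier.
xs = list(range(12))
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, 0.05, -0.15, 0.1, -0.1, 0.2, -0.2]
y = [2 + 0.5 * x + e for x, e in zip(xs, noise)]
y[5] += 10.0  # planted outlier at index 5
X = [[1.0, float(x)] for x in xs]  # intercept column plus one predictor

best, score = detect_outliers(X, y)
print("flagged indices:", [i for i, g in enumerate(best) if g], "BIC:", round(score, 2))
```

The penalty term is what keeps the search from flagging every point: removing an observation always lowers the residual sum of squares, but each flag adds ln(n) to the criterion, so a flag only pays off when the point is far from the fit of the remaining data.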



Details

Primary Language

English

Subjects

Statistics

Section

Research Article

Authors

Özlem Gürünlü Alma *
Türkiye

Serdar Kurt
Türkiye

Aybars Uğur
Türkiye

Publication Date

15 July 2008

Submission Date

4 January 2008

Acceptance Date

-

Published in Issue

Year 2008 Volume: 6 Issue: 1

Cite

APA
Gürünlü Alma, Ö., Kurt, S., & Uğur, A. (2008). Outlier Detection in Multiple Regression Models Using Genetic Algorithms and Bayesian Information Criteria. İstatistik Araştırma Dergisi, 6(1), 38-51. https://izlik.org/JA82FB63DX