Research Article

Anomaly Detection with Supervised Machine Learning Techniques: The Case of Shuttle Space Data

Year 2025, Issue: Early View, 1 - 1
https://doi.org/10.17134/khosbd.1772308

Abstract

This study comparatively analyzes the performance of five supervised machine learning algorithms on the anomaly detection problem: Logistic Regression, k-Nearest Neighbors, Decision Tree, Random Forest, and XGBoost. The models were evaluated on both real-world data and artificially generated simulation data, using criteria such as F1 score, ROC-AUC, and training and test times. In the analysis on real data, some models achieved high performance even when class imbalance was taken into account. The simulation data, in turn, made it possible to test more objectively how well the models learn the anomaly structure. The findings show that the k-Nearest Neighbors, Decision Tree, and Random Forest models in particular detect anomalies with high accuracy, whereas the XGBoost model in some cases did not adequately separate anomalies when the anomaly rate was low. In this context, parameter optimization is emphasized as playing a critical role in model performance. The results offer guidance for anomaly detection systems applicable in sectors that require high security, such as defense, aviation, and space.
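
To make the choice of metrics concrete, the following is a minimal, dependency-free sketch (not the authors' code) of how F1 score and ROC-AUC behave under class imbalance. The toy labels, the two hypothetical classifiers, and the function names are assumptions made for illustration; none of the numbers come from the study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of points whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the anomaly class (label 1); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random anomaly outscores a random
    normal point, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 95 normal points (0) followed by 5 anomalies (1).
y_true = [0] * 95 + [1] * 5

# Classifier A (hypothetical): always predicts "normal".
pred_a = [0] * 100

# Classifier B (hypothetical): 2 false alarms, 4 of 5 anomalies caught.
pred_b = [0] * 93 + [1] * 2 + [1] * 4 + [0]

for name, pred in (("A", pred_a), ("B", pred_b)):
    # Hard labels double as scores here, purely to keep the example small.
    scores = [float(p) for p in pred]
    print(name,
          f"accuracy={accuracy(y_true, pred):.2f}",
          f"f1={f1_score(y_true, pred):.2f}",
          f"roc_auc={roc_auc(y_true, scores):.2f}")
```

On this toy data, the always-normal classifier reaches 0.95 accuracy but an F1 score of 0 and a ROC-AUC of 0.5, while the second classifier scores an F1 of 8/11 (about 0.73). This is one reason imbalance-aware metrics are commonly preferred over raw accuracy when anomalies are rare.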

Details

Primary Language: Turkish
Subjects: Information Systems (Other)
Section: Articles
Authors

Ünal Dikbaş 0000-0002-7851-535X

Meral Ebegil 0000-0003-4798-3422

Early View Date: October 1, 2025
Publication Date: October 2, 2025
Submission Date: August 26, 2025
Acceptance Date: September 24, 2025
Published in Issue: Year 2025, Issue: Early View

Cite

IEEE: Ü. Dikbaş and M. Ebegil, “Denetimli Makine Öğrenmesi Teknikleri ile Anomali Tespiti: Shuttle Uzay Verisi Örneği”, Savunma Bilimleri Dergisi, no. Early View, pp. 1–1, Oct. 2025, doi: 10.17134/khosbd.1772308.