Destek Vektör Makineleri ile Büyük Ölçekli Verilerde Hassas Anomali Tespiti ve Optimizasyon Teknikleri

Kadir Turgut

doi:10.54047/bibted.1567073

Research Article

Precise Anomaly Detection and Optimization Techniques in Large-Scale Data with Support Vector Machines

Year 2025, Volume: 6 Issue: 1, 37-43, 22.07.2025

Abstract

This article examines the use of Support Vector Machines (SVM) for anomaly detection in large-scale data. Anomaly detection is an important data mining and machine learning problem that aims to identify deviations from normal behavior. SVM is widely used thanks to its strong classification performance and flexible kernel functions, but applying it to large datasets poses several difficulties. The article reviews new approaches and optimization techniques for applying SVM efficiently and effectively to large-scale datasets. Methods such as the kernel trick, parameter optimization, data subsetting, and approximation techniques are described in detail, and fast SVM solvers and libraries such as Pegasos and LIBSVM are discussed. Experiments were conducted on several large-scale datasets to evaluate the effectiveness and efficiency of the proposed methods. The results show that SVM can achieve high accuracy and good generalization in anomaly detection on large datasets. However, challenges such as computational cost, memory usage, and data imbalance must be addressed with optimized methods and new technologies.
In conclusion, the article presents optimization techniques and new approaches for improving the performance of SVM in anomaly detection on large datasets. Future research should aim to improve this performance further through directions such as deep learning techniques, the combination of offline and online learning methods, and distributed computing.
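The abstract names Pegasos as one of the fast solvers considered. As an illustration only, the following is a minimal sketch of a Pegasos-style stochastic sub-gradient update for a linear SVM. It is not the article's implementation: the function names, the toy two-cluster data, the regularization value `lam`, and the supervised two-class setup (label +1 for anomalous points) are all assumptions made for this example, and the optional projection step of the original algorithm is omitted.

```python
import random


def pegasos_train(samples, labels, lam=0.01, n_iters=20000, seed=0):
    """Pegasos-style stochastic sub-gradient training of a linear SVM.

    samples: list of feature lists; labels: list of +1 / -1.
    A constant feature in each sample plays the role of the bias term.
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(samples))      # one random example per step
        x, y = samples[i], labels[i]
        eta = 1.0 / (lam * t)                # step size 1 / (lambda * t)
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        shrink = 1.0 - eta * lam             # shrinkage from the regularizer
        if margin < 1.0:
            # Hinge loss is active: shrink, then move toward y * x.
            w = [shrink * wj + eta * y * xj for wj, xj in zip(w, x)]
        else:
            # Hinge loss is zero: only the regularizer contributes.
            w = [shrink * wj for wj in w]
    return w


def predict(w, x):
    """Return +1 (anomalous) or -1 (normal) from the sign of <w, x>."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


def make_toy_data(n_normal=200, n_anomalous=20, seed=1):
    """Hypothetical imbalanced data: a large normal cluster, a small anomalous one."""
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n_normal):
        samples.append([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), 1.0])
        labels.append(-1)
    for _ in range(n_anomalous):
        samples.append([rng.gauss(6.0, 1.0), rng.gauss(6.0, 1.0), 1.0])
        labels.append(1)
    return samples, labels


samples, labels = make_toy_data()
w = pegasos_train(samples, labels)
accuracy = sum(predict(w, x) == y for x, y in zip(samples, labels)) / len(samples)
print(f"training accuracy: {accuracy:.3f}")
```

Each step touches a single example, so the cost per iteration does not grow with the number of training samples, which is the property that makes this family of solvers attractive for large datasets.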

Keywords

Support Vector Machines (SVM), Anomaly detection, Large-scale datasets, Parameter optimization

References

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D. G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Wicke, M., Yu, Y., & Zhang, X. (2016). TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (pp. 265-283).
Akkuş, Ö., & Demir, E. (2016). İki düzeyli olasılık modellerinde klasik meta sezgisel optimizasyon tekniklerinin performansı üzerine bir çalışma. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi, 15(30), 107-131.
Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(Feb), 281-305.
Bottou, L., & Lin, C. J. (2007). Support vector machine solvers. In Large scale kernel machines (pp. 301-320). MIT Press.
Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Data mining and knowledge discovery, 2(2), 121-167.
Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41(3), 1-58.
Chang, C. C., & Lin, C. J. (2011). LIBSVM: A library for support vector machines. ACM transactions on intelligent systems and technology (TIST), 2(3), 1-27.
Chatterjee, A., & Siarry, P. (2006). Nonlinear inertia weight variation for dynamic adaptation in particle swarm optimization. Computers & Operations Research, 33(3), 859-871.
Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: synthetic minority over-sampling technique. Journal of artificial intelligence research, 16, 321-357.
Çelik, Y., & Alaca, Y. (2021). Log analizinde derin öğrenme ile anomali tespiti. Yapay Zeka Uygulamalarında Güncel Konular ve Araştırmalar, 137-153.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine learning, 20(3), 273-297.
Erfani, S. M., Rajasegarar, S., Karunasekera, S., & Leckie, C. (2016). High-dimensional and large-scale anomaly detection using a linear one-class SVM with deep learning. Pattern Recognition, 58, 121-134.
Feurer, M., Klein, A., Eggensperger, K., Springenberg, J. T., Blum, M., & Hutter, F. (2015). Efficient and robust automated machine learning. Neural information processing systems (pp. 2962-2970).
Gökdemir, A., & Çalhan, A. (2022). Deep learning and machine learning based anomaly detection in IoT. Gazi Üniversitesi Mühendislik-Mimarlık Fakültesi Dergisi, 37(4), 1945-1956.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer Science & Business Media.
Joachims, T. (2006). Training linear SVMs in linear time. Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 217-226).
Köse, U. (2019). Zeki Optimizasyon Tabanlı Destek Vektör Makineleri ile Diyabet Teşhisi. Politeknik Dergisi, 22(3), 557-566.
Laskov, P., Düssel, P., Schäfer, C., & Rieck, K. (2005). Learning intrusion detection: supervised or unsupervised?. In International Conference on Image Analysis and Processing (pp. 50-57). Springer, Berlin, Heidelberg.
OpenAI. (2024). ChatGPT (Version 4o) [Software]. https://openai.com
Smola, A. J., Bartlett, P., Schölkopf, B., & Schuurmans, D. (2000). Generalized support vector machines. In Advances in large margin classifiers (pp. 135-146). MIT Press.
Pan, S. J., & Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on knowledge and data engineering, 22(10), 1345-1359.
Schölkopf, B., & Smola, A. J. (2001). Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT press.
Schölkopf, B., Platt, J. C., Shawe-Taylor, J., Smola, A. J., & Williamson, R. C. (2001). Estimating the support of a high-dimensional distribution. Neural computation, 13(7), 1443-1471.
Shalev-Shwartz, S., Singer, Y., Srebro, N., & Cotter, A. (2011). Pegasos: Primal estimated sub-gradient solver for SVM. Mathematical programming, 127(1), 3-30.
Shawe-Taylor, J., & Cristianini, N. (2004). Kernel methods for pattern analysis. Cambridge university press.
Tax, D. M. J., & Duin, R. P. W. (2004). Support vector data description. Machine learning, 54(1), 45-66.
Tsang, I. W., Kwok, J. T., & Cheung, P. M. (2005). Core vector machines: Fast SVM training on very large data sets. Journal of Machine Learning Research, 6(Apr), 363-392.
Vapnik, V. N. (1998). Statistical learning theory. Wiley.
Weston, J., Perez-Cruz, F., Bousquet, O., Chapelle, O., Elisseeff, A., & Schölkopf, B. (2003). Feature selection and transduction for prediction of molecular bioactivity for drug design. Bioinformatics, 19(6), 764-771.
Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms. IEEE Transactions on neural networks, 16(3), 645-678.
Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M. J., Shenker, S., & Stoica, I. (2012). Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation (pp. 15-28).

There are 30 references in total.

Details

Primary Language	Turkish
Subjects	Machine Learning (Other), Artificial Intelligence (Other)
Section	Research Article
Authors	Kadir Turgut 0000-0002-8577-0500
Submission Date	October 14, 2024
Acceptance Date	July 17, 2025
Early View Date	July 21, 2025
Publication Date	July 22, 2025
Published in Issue	Year 2025 Volume: 6 Issue: 1

Cite

APA	Turgut, K. (2025). Destek Vektör Makineleri ile Büyük Ölçekli Verilerde Hassas Anomali Tespiti ve Optimizasyon Teknikleri. Bilgisayar Bilimleri ve Teknolojileri Dergisi, 6(1), 37-43. https://doi.org/10.54047/bibted.1567073
