Research Article

Comprehensive Analysis of Grid and Randomized Search on Dataset Performance

Year 2024, Volume: 7 Issue: 2, 77 - 83, 31.12.2024
https://doi.org/10.55581/ejeas.1581494

Abstract

This paper presents a comprehensive comparison of grid search and randomized search, the two main hyperparameter search methods used in machine learning. It analyses the performance of the two methods in terms of efficiency, scalability, and applicability across different machine learning models and datasets. The paper emphasizes that grid search is exhaustive, since it evaluates every hyperparameter combination on a regular grid, but that this exhaustiveness comes at a high computational cost. Random search, by contrast, delivers results faster by drawing random samples from the hyperparameter space, at the cost of incomplete coverage. Practical recommendations and decision-making guidelines are also presented for choosing between the two methods in real-world applications. In conclusion, the paper summarizes the situations in which grid search or random search is advantageous, depending on factors such as model complexity, the size of the hyperparameter space, and the available computational resources, and aims to provide a comprehensive guide for practitioners.
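The two strategies contrasted in the abstract can be illustrated with a minimal stdlib-only sketch (not taken from the paper itself; the toy objective, parameter ranges, and budget below are assumptions chosen for illustration). Grid search enumerates every point of a fixed grid, while random search spends the same evaluation budget on random draws from the full space:

```python
import itertools
import random

# Toy objective: stands in for validation accuracy as a function of two
# hypothetical hyperparameters (learning rate and regularization strength).
# Its maximum, 0, is at lr=0.03, reg=0.8.
def score(lr, reg):
    return -((lr - 0.03) ** 2) - ((reg - 0.8) ** 2)

# Grid search: exhaustively evaluate every combination on a regular grid.
lrs = [0.001, 0.01, 0.03, 0.1]
regs = [0.1, 0.5, 0.8, 1.0]
grid_best = max(itertools.product(lrs, regs), key=lambda p: score(*p))

# Random search: draw the same budget of samples (16, matching the 4x4
# grid) from the continuous hyperparameter space instead.
random.seed(0)
samples = [(random.uniform(0.001, 0.1), random.uniform(0.1, 1.0))
           for _ in range(16)]
rand_best = max(samples, key=lambda p: score(*p))

print("grid best:", grid_best)    # always the best point on the grid
print("random best:", rand_best)  # coverage is stochastic, varies by seed
```

The sketch mirrors the trade-off described above: the grid variant is guaranteed to find the best grid point but its cost grows multiplicatively with each added hyperparameter, whereas the random variant scales to larger spaces at a fixed budget without any coverage guarantee.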


Details

Primary Language: English
Subjects: Decision Support and Group Support Systems
Section: Research Articles
Authors

Nadir Subaşı (ORCID: 0000-0002-5657-9002)

Publication Date: 31 December 2024
Submission Date: 8 November 2024
Acceptance Date: 28 November 2024
Published Issue: Year 2024, Volume: 7 Issue: 2