Research Article

Comprehensive Analysis of Grid and Randomized Search on Dataset Performance

Year 2024, Volume: 7 Issue: 2, 77–83, 31.12.2024
https://doi.org/10.55581/ejeas.1581494

Abstract

This paper presents a comprehensive comparison of grid search and randomized search, the two main hyperparameter search methods used in machine learning. It analyses the performance of the two methods in terms of efficiency, scalability, and applicability across different machine learning models and datasets. The paper emphasizes that grid search is exhaustive, since it evaluates every hyperparameter combination on a regular grid, but that this exhaustiveness comes at a high computational cost. Randomized search, by contrast, delivers faster results by drawing random samples from the hyperparameter space, with the disadvantage of incomplete coverage. The paper also offers practical recommendations and decision criteria for choosing between the two methods in real-world applications. In conclusion, it summarizes the situations in which grid search or randomized search is advantageous, depending on factors such as model complexity, the size of the hyperparameter space, and the available computational resources, and aims to provide a comprehensive guide for practitioners.
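Because the abstract contrasts how the two methods traverse the hyperparameter space, a short sketch may help make the trade-off concrete. The snippet below is a minimal illustration under stated assumptions rather than the paper's experimental code: the SVC classifier, the parameter ranges, the 16-candidate budget, and the use of the Iris dataset (which also appears in the reference list) are all illustrative choices. It relies on scikit-learn's GridSearchCV and RandomizedSearchCV, the standard implementations of the two strategies.

    # A minimal sketch of the two strategies using scikit-learn's
    # GridSearchCV and RandomizedSearchCV. The SVC model, parameter
    # ranges, dataset, and sampling budget are illustrative assumptions,
    # not the paper's actual experimental setup.
    from scipy.stats import loguniform
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Grid search: exhaustively cross-validates every combination on the
    # grid (4 x 4 = 16 candidates here); coverage of the grid is
    # guaranteed, but the cost grows exponentially with the number of
    # hyperparameters.
    grid = GridSearchCV(
        SVC(),
        param_grid={"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]},
        cv=5,
    )
    grid.fit(X, y)

    # Randomized search: draws a fixed number of samples from (possibly
    # continuous) distributions, so cost is set by n_iter rather than by
    # the size of the search space.
    rand = RandomizedSearchCV(
        SVC(),
        param_distributions={"C": loguniform(1e-1, 1e2),
                             "gamma": loguniform(1e-3, 1e0)},
        n_iter=16,  # same budget as the full grid, for a fair comparison
        cv=5,
        random_state=0,
    )
    rand.fit(X, y)

    print("grid search      :", grid.best_params_, grid.best_score_)
    print("randomized search:", rand.best_params_, rand.best_score_)

Giving the randomized search the same budget of 16 candidates as the full grid makes the comparison fair: grid search covers every point of its coarse grid, while randomized search spends the same budget sampling continuous log-uniform ranges, which is exactly the coverage-versus-efficiency trade-off the abstract describes.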

References

  • Mekonnen, T. (2019). Random vs. Directed Search for Scarce Resources. International Journal of Advanced Computer Science and Applications, 269–278.
  • Lawrence, J. P., & Steiglitz, K. (1972). Randomized Pattern Search. IEEE Transactions on Computers, C–21(4), 382–385.
  • Vincent, P., & Rubin, I. (2004). Cooperative search versus random search using UAV swarms. IFAC Proceedings Volumes (IFAC-PapersOnline), 37(8), 944–949.
  • Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13, 281–305.
  • Aszemi, N. M., & Dominic, P. D. D. (2019). Hyperparameter Optimization in Convolutional Neural Network using Genetic Algorithms. International Journal of Advanced Computer Science and Applications, 10(6).
  • Sudhakaran, P., & Baitalik, S. (2022). XGBoost Optimized by Adaptive Tree Parzen Estimators for Credit Risk Analysis. 2022 IEEE 2nd Mysore Sub Section International Conference (MysuruCon) (pp. 1–6). IEEE.
  • Japa, L., Serqueira, M., Mendonca, I., Aritsugi, M., Bezerra, E., & Gonzalez, P. H. (2023). A Population-Based Hybrid Approach for Hyperparameter Optimization of Neural Networks. IEEE Access, 11, 50752–50768.
  • Zhao, Z., & Bai, T. (2022). Financial Fraud Detection and Prediction in Listed Companies Using SMOTE and Machine Learning Algorithms. Entropy, 24(8), 1157.
  • LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.
  • Unwin, A., & Kleinman, K. (2021). The Iris Data Set: In Search of the Source of Virginica. Significance, 18(6), 26–29.
  • Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian optimization of machine learning algorithms. Advances in Neural Information Processing Systems, 25.
  • Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2), 99–127.

Details

Primary Language English
Subjects Decision Support and Group Support Systems
Journal Section Research Articles
Authors

Nadir Subaşı (ORCID: 0000-0002-5657-9002)

Publication Date December 31, 2024
Submission Date November 8, 2024
Acceptance Date November 28, 2024
Published in Issue Year 2024 Volume: 7 Issue: 2