Self Adaptive Methods for Learning Rate Parameter of Q-Learning Algorithm
Year 2023, Volume: 6, Issue: 2, 191-198, 23.09.2023
Murat Erhan Çimen, Zeynep Garip, Yaprak Yalçın, Mustafa Kutlu, Ali Fuat Boz
Abstract
Machine learning methods can generally be categorized as supervised, unsupervised, and reinforcement learning. The Q-learning algorithm, a reinforcement learning method, interacts with its environment, learns from it, and produces actions accordingly. In this study, eight different methods are proposed for setting the value of the learning rate parameter of the Q-learning algorithm online, depending on the current situation. To test the performance of the proposed methods, the algorithms are applied to the Frozen Lake and Cart Pole systems, and the results are compared graphically and statistically. When the obtained results are examined, Method 1 performs better for the Frozen Lake problem, which is a discrete system, while Method 7 produces better results for the Cart Pole system, which is a continuous system.
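The abstract does not reproduce the eight adaptation rules themselves, but the general idea can be illustrated with tabular Q-learning, whose update is Q(s,a) ← Q(s,a) + α [r + γ max_a' Q(s',a') − Q(s,a)], where α is the learning rate being adapted. The sketch below is a minimal illustration only, assuming the gymnasium package and its FrozenLake-v1 environment; the count-based schedule α = 1/(1 + N(s,a)) and the γ and ε values are generic choices for demonstration, not necessarily any of the paper's eight proposed methods.

```python
# Minimal tabular Q-learning sketch with a self-adaptive, count-based learning rate.
# Assumes the `gymnasium` package; the schedule alpha = 1 / (1 + visits(s, a)) is a
# generic illustration, not one of the paper's eight methods.
import numpy as np
import gymnasium as gym

env = gym.make("FrozenLake-v1", is_slippery=True)
n_states, n_actions = env.observation_space.n, env.action_space.n

Q = np.zeros((n_states, n_actions))        # action-value table
visits = np.zeros((n_states, n_actions))   # per state-action update counter
gamma, epsilon, episodes = 0.99, 0.1, 5000  # illustrative hyperparameters
rng = np.random.default_rng(0)

for _ in range(episodes):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy action selection
        if rng.random() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))

        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated

        # self-adaptive learning rate: decays as the pair is visited more often
        visits[state, action] += 1
        alpha = 1.0 / (1.0 + visits[state, action])

        # standard Q-learning temporal-difference update
        td_target = reward + gamma * np.max(Q[next_state]) * (not terminated)
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state

print("Greedy policy (4x4 map):", np.argmax(Q, axis=1).reshape(4, 4))
```

Under this kind of schedule the step size shrinks for frequently visited state-action pairs, which is one common way to trade off fast early learning against stable late convergence; the paper compares eight such strategies on a discrete task (Frozen Lake) and a continuous one (Cart Pole).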