Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm
Abstract
Reinforcement learning is a learning method that many living creatures use, often unwittingly, to acquire abilities such as eating and walking. Inspired by it, machine learning researchers have divided reinforcement learning into two subfields: value learning and policy learning. In this study, the noise standard deviation of the deep deterministic policy gradient (DDPG) method, one of the policy learning algorithms, was examined in the context of solving the inverse kinematics of a 2-degree-of-freedom planar robot. For this examination, 8 different functions were defined in terms of the maximum output value of the action (actor) artificial neural network. The artificial neural networks were trained using these functions for 1000 iterations of 200 steps each. After training, the statistical differences between the groups were examined, and no statistical difference was found among the three best groups. For this reason, the three best groups were retrained for 2500 iterations of 200 steps each and then tested on 100 different test scenarios. After testing, the inverse kinematics of the 2-degree-of-freedom planar robot was obtained with minimal error with the help of artificial neural networks. In light of the results, the importance of the choice of the noise standard deviation, and the appropriate range for that choice, is presented for researchers who will work in this field.
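As a rough illustration of the quantity studied here, the following Python sketch generates Ornstein-Uhlenbeck exploration noise and adds it to a deterministic action. It is a minimal sketch under stated assumptions, not the article's implementation: the class and function names, the values of `theta`, `mu` and `dt`, and the choice of `sigma` as 0.2 times the maximum action are all hypothetical, since the abstract does not list the 8 functions that were tested.

```python
import math
import random


class OrnsteinUhlenbeckNoise:
    """Discretized Ornstein-Uhlenbeck process for action exploration.

    theta, mu and dt are illustrative assumptions, not values reported
    in the article.
    """

    def __init__(self, sigma, theta=0.15, mu=0.0, dt=1e-2, seed=None):
        self.sigma = sigma    # noise standard deviation (the studied parameter)
        self.theta = theta    # mean-reversion rate
        self.mu = mu          # long-run mean of the process
        self.dt = dt          # time step of the discretization
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Restart the process at its long-run mean, e.g. at the start of an episode.
        self.x = self.mu

    def sample(self):
        # Euler-Maruyama step:
        # dx = theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        dx = (self.theta * (self.mu - self.x) * self.dt
              + self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0))
        self.x += dx
        return self.x


def noisy_action(action, noise, max_action):
    """Add exploration noise, then clip to the actor's output range."""
    return max(-max_action, min(max_action, action + noise.sample()))


# Hypothetical setup: sigma expressed as a fraction of the actor's maximum output.
max_action = 1.0
noise = OrnsteinUhlenbeckNoise(sigma=0.2 * max_action, seed=0)
actions = [noisy_action(0.5, noise, max_action) for _ in range(200)]
```

Expressing `sigma` as a function of `max_action` mirrors the idea in the abstract of tying the noise standard deviation to the maximum output of the action network. A larger `sigma` widens exploration, while `sigma = 0` leaves the deterministic action unchanged.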
Details
Primary Language
English
Subjects
Engineering
Journal Section
Research Article
Authors
Publication Date
June 27, 2021
Submission Date
February 1, 2021
Acceptance Date
April 30, 2021
Published in Issue
Year 2021 Volume: 9 Number: 2
APA
Bingol, M. C. (2021). Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji, 9(2), 200-210. https://doi.org/10.29109/gujsc.872646
Cited By
Evaluation of DDPG and PPO Algorithms for Bipedal Robot Control
Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi
https://doi.org/10.47495/okufbed.1031976
Evaluation of the Deep Q-Learning Models for Mobile Robot Path Planning Problem
Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji
https://doi.org/10.29109/gujsc.1455778
