Research Article

Investigation of the Standard Deviation of Ornstein-Uhlenbeck Noise in the DDPG Algorithm

Volume: 9 Number: 2 June 27, 2021

Abstract

Reinforcement learning is a learning method that many creatures use, often unwittingly, to gain abilities such as eating and walking. Inspired by this method, machine learning researchers have divided it into subcategories such as value learning and policy learning. In this study, the noise standard deviation of the deep deterministic policy gradient (DDPG) method, one of the policy learning algorithms, was examined in order to solve the inverse kinematics of a 2 degrees-of-freedom planar robot. For this examination, 8 different functions were defined depending on the maximum value of the output of the action artificial neural network. The artificial neural networks created were trained using these functions for 1000 iterations with 200 steps per iteration. After training, the statistical difference between the groups was examined, and no statistically significant difference was found among the three best groups. For this reason, the three best groups were retrained for 2500 iterations with 200 steps per iteration and then tested on 100 different test scenarios. After testing, the inverse kinematics of the 2 degrees-of-freedom planar robot were solved with minimal error with the help of the artificial neural networks. In light of these results, the importance of choosing the noise standard deviation, and the correct range from which to select it, is presented for researchers who will work in this field.
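The quantity examined in the study is the standard deviation of the Ornstein-Uhlenbeck exploration noise that DDPG adds to the actor's output during training. A minimal sketch of such a noise process is shown below; the parameter values (theta = 0.15, sigma = 0.2, dt = 0.01) are common DDPG defaults for illustration, not the eight sigma functions tested in the paper.

```python
import numpy as np

class OrnsteinUhlenbeckNoise:
    """Ornstein-Uhlenbeck process, widely used as temporally correlated
    exploration noise in DDPG. Discretized as:
        x_{t+1} = x_t + theta * (mu - x_t) * dt + sigma * sqrt(dt) * N(0, 1)
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=0):
        self.mu = mu * np.ones(size)   # long-run mean the process reverts to
        self.theta = theta             # mean-reversion rate
        self.sigma = sigma             # noise standard deviation (the studied quantity)
        self.dt = dt                   # integration time step
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        """Reset the internal state to the mean (call at episode start)."""
        self.x = self.mu.copy()

    def sample(self):
        """Advance the process one step and return the new noise value."""
        dx = (self.theta * (self.mu - self.x) * self.dt
              + self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.x.shape))
        self.x = self.x + dx
        return self.x
```

During training, the sampled noise would be added to the deterministic action, e.g. `action = actor(state) + noise.sample()`, so that a larger sigma yields wider exploration around the actor's output.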


Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Publication Date

June 27, 2021

Submission Date

February 1, 2021

Acceptance Date

April 30, 2021

Published in Issue

Year 2021 Volume: 9 Number: 2

APA
Bingol, M. C. (2021). Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji, 9(2), 200-210. https://doi.org/10.29109/gujsc.872646
AMA
1.Bingol MC. Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm. GUJS Part C. 2021;9(2):200-210. doi:10.29109/gujsc.872646
Chicago
Bingol, Mustafa Can. 2021. “Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji 9 (2): 200-210. https://doi.org/10.29109/gujsc.872646.
EndNote
Bingol MC (June 1, 2021) Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji 9 2 200–210.
IEEE
[1]M. C. Bingol, “Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm”, GUJS Part C, vol. 9, no. 2, pp. 200–210, June 2021, doi: 10.29109/gujsc.872646.
ISNAD
Bingol, Mustafa Can. “Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji 9/2 (June 1, 2021): 200-210. https://doi.org/10.29109/gujsc.872646.
JAMA
1. Bingol MC. Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm. GUJS Part C. 2021;9(2):200-210.
MLA
Bingol, Mustafa Can. “Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji, vol. 9, no. 2, June 2021, pp. 200-10, doi:10.29109/gujsc.872646.
Vancouver
1. Mustafa Can Bingol. Investigation of the Standard Deviation of Ornstein - Uhlenbeck Noise in the DDPG Algorithm. GUJS Part C. 2021 Jun. 1;9(2):200-10. doi:10.29109/gujsc.872646


    e-ISSN:2147-9526