Research Article

Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot

Year 2022, Volume: 26 Issue: 1, 128 - 135, 28.02.2022
https://doi.org/10.16984/saufenbilder.911942

Abstract

Path planning is an essential topic in robotics research. Researchers have proposed methods such as particle swarm optimization, A*, and reinforcement learning (RL) to obtain a path. The current study aimed to generate an RL-based safe path plan for a 3R planar robot. For this purpose, the environment was first constructed; then the state, action, reward, and termination functions were defined. Finally, the actor and critic artificial neural networks (ANNs), the basic components of deep deterministic policy gradient (DDPG), were formed to generate a safe path. A further aim of the study was to obtain an optimum actor ANN. ANN structures with 2, 4, and 8 layers and 512, 1024, 2048, and 4096 units were formed to find this optimum. These structures were trained for 5000 episodes of 200 steps each, and the best results were obtained with the 4-layer structures of 1024 and 2048 units. For this reason, four different ANN structures were built using 4 layers and 1024 and 2048 units, and these were trained in turn. Among the four proposed structures, NET-M2U-4L produced the best result and was tested on 1000 different scenarios. In these tests, the rate of generating a safe path was 93.80% and the rate of colliding with an obstacle was 1.70%. As a result, a safe path was planned and an optimum actor ANN was obtained for a 3R planar robot.
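As a point of reference for the DDPG components described above, the sketch below shows how an actor and critic pair of the kind reported here (four hidden layers with 1024/2048 units) might be assembled with TensorFlow/Keras, whose Dense layer the paper cites [14]. This is a minimal sketch, not the paper's exact NET-M2U-4L configuration: the state and action dimensions, layer widths, activations, and function names are all assumptions for illustration.

```python
# Minimal sketch of a DDPG-style actor/critic pair for a 3R planar robot.
# Assumed values: STATE_DIM, ACTION_DIM, layer widths, and activations.
import tensorflow as tf
from tensorflow.keras import layers

STATE_DIM = 8    # assumed: e.g., joint angles plus goal/obstacle coordinates
ACTION_DIM = 3   # one command per revolute joint of the 3R robot

def build_actor(units=(1024, 2048, 2048, 1024)):
    """Actor pi(s): four hidden Dense layers, echoing the 4-layer,
    1024/2048-unit family the abstract reports as best (widths assumed)."""
    state = layers.Input(shape=(STATE_DIM,))
    x = state
    for n in units:
        x = layers.Dense(n, activation="relu")(x)
    # tanh bounds joint commands to [-1, 1]; scale to joint limits outside
    action = layers.Dense(ACTION_DIM, activation="tanh")(x)
    return tf.keras.Model(state, action)

def build_critic(units=(1024, 2048, 2048, 1024)):
    """Critic Q(s, a): state and action are concatenated, as in DDPG [13]."""
    state = layers.Input(shape=(STATE_DIM,))
    action = layers.Input(shape=(ACTION_DIM,))
    x = layers.Concatenate()([state, action])
    for n in units:
        x = layers.Dense(n, activation="relu")(x)
    q_value = layers.Dense(1)(x)
    return tf.keras.Model([state, action], q_value)

actor = build_actor()
critic = build_critic()
```

In full DDPG [13], each network would also have a slowly updated target copy, and the actor is trained by ascending the critic's gradient with respect to the action; those training-loop details are omitted from this sketch.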

References

  • [1] Y. Zheng, Q. Luo, H. Wang, C. Wang, and X. Chen, “Path planning of mobile robot based on adaptive ant colony algorithm,” J. Intell. Fuzzy Syst., vol. 39, no. 4, pp. 5329–5338, 2020.
  • [2] B. Song, Z. Wang, and L. Zou, “An improved PSO algorithm for smooth path planning of mobile robots using continuous high-degree Bezier curve,” Appl. Soft Comput., vol. 100, 2021.
  • [3] M. Zhao, H. Lu, S. Yang, Y. Guo, and F. Guo, “A fast robot path planning algorithm based on bidirectional associative learning,” Comput. Ind. Eng., vol. 155, p. 107173, 2021.
  • [4] L. Larsen and J. Kim, “Path planning of cooperating industrial robots using evolutionary algorithms,” Robot. Comput. Integr. Manuf., vol. 67, 2021.
  • [5] B. Fu et al., “An improved A* algorithm for the industrial robot path planning with high success rate and short length,” Rob. Auton. Syst., vol. 106, pp. 26–37, 2018.
  • [6] A. Krishna Lakshmanan et al., “Complete coverage path planning using reinforcement learning for Tetromino based cleaning and maintenance robot,” Autom. Constr., vol. 112, 2020.
  • [7] H. Xiong, T. Ma, L. Zhang, and X. Diao, “Comparison of end-to-end and hybrid deep reinforcement learning strategies for controlling cable-driven parallel robots,” Neurocomputing, vol. 377, pp. 73–84, 2020.
  • [8] M. Matulis and C. Harvey, “A robot arm digital twin utilising reinforcement learning,” Comput. Graph., vol. 95, pp. 106–114, 2021.
  • [9] J. Wang, S. Elfwing, and E. Uchibe, “Modular deep reinforcement learning from reward and punishment for robot navigation,” Neural Networks, vol. 135, pp. 115–126, 2021.
  • [10] Z. Bing, C. Lemke, L. Cheng, K. Huang, and A. Knoll, “Energy-efficient and damage-recovery slithering gait design for a snake-like robot based on reinforcement learning and inverse reinforcement learning,” Neural Networks, vol. 129, pp. 323–333, 2020.
  • [11] Y. Tsurumine, Y. Cui, E. Uchibe, and T. Matsubara, “Deep reinforcement learning with smooth policy update: Application to robotic cloth manipulation,” Rob. Auton. Syst., vol. 112, pp. 72–83, 2019.
  • [12] I. Carlucho, M. De Paula, and G. G. Acosta, “An adaptive deep reinforcement learning approach for MIMO PID control of mobile robots,” ISA Trans., vol. 102, pp. 280–294, 2020.
  • [13] T. P. Lillicrap et al., “Continuous control with deep reinforcement learning,” in 4th International Conference on Learning Representations (ICLR), 2016.
  • [14] “Dense Layer.” [Online]. Available: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense. [Accessed: 27-Mar-2021].
There are 14 citations in total.

Details

Primary Language English
Subjects Artificial Intelligence
Journal Section Research Articles
Authors

Mustafa Can Bingol (ORCID: 0000-0001-5448-8281)

Publication Date February 28, 2022
Submission Date April 8, 2021
Acceptance Date December 28, 2021
Published in Issue Year 2022

Cite

APA Bingol, M. C. (2022). Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. Sakarya University Journal of Science, 26(1), 128-135. https://doi.org/10.16984/saufenbilder.911942
AMA Bingol MC. Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. SAUJS. February 2022;26(1):128-135. doi:10.16984/saufenbilder.911942
Chicago Bingol, Mustafa Can. “Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot”. Sakarya University Journal of Science 26, no. 1 (February 2022): 128-35. https://doi.org/10.16984/saufenbilder.911942.
EndNote Bingol MC (February 1, 2022) Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. Sakarya University Journal of Science 26 1 128–135.
IEEE M. C. Bingol, “Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot”, SAUJS, vol. 26, no. 1, pp. 128–135, 2022, doi: 10.16984/saufenbilder.911942.
ISNAD Bingol, Mustafa Can. “Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot”. Sakarya University Journal of Science 26/1 (February 2022), 128-135. https://doi.org/10.16984/saufenbilder.911942.
JAMA Bingol MC. Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. SAUJS. 2022;26:128–135.
MLA Bingol, Mustafa Can. “Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot”. Sakarya University Journal of Science, vol. 26, no. 1, 2022, pp. 128-35, doi:10.16984/saufenbilder.911942.
Vancouver Bingol MC. Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. SAUJS. 2022;26(1):128-35.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.