Research Article

Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot

Year 2022, Volume: 26, Issue: 1, 128-135, 28.02.2022
https://doi.org/10.16984/saufenbilder.911942

Abstract

Path planning is an essential topic in robotics research. Researchers have proposed methods such as particle swarm optimization, A*, and reinforcement learning (RL) to obtain a path. The present study aimed to generate an RL-based safe path plan for a 3R planar robot. For this purpose, the environment was first modeled. Then, the state, action, reward, and termination functions were defined. Finally, the actor and critic artificial neural networks (ANNs), which are the basic components of deep deterministic policy gradient (DDPG), were formed in order to generate a safe path. A further aim of the study was to obtain an optimum actor ANN. ANN structures with 2, 4, and 8 layers and 512, 1024, 2048, and 4096 units were formed to find this optimum. These structures were trained for 5000 episodes of 200 steps each, and the best results were obtained with the 4-layer structures using 1024 and 2048 units. For this reason, four different ANN structures were then built using 4 layers and 1024 and 2048 units, and these proposed structures were trained. Among the four proposed structures, the NET-M2U-4L structure produced the best result. The NET-M2U-4L structure was tested on 1000 different scenarios. As a result of the tests, the rate of generating a safe path was calculated as 93.80% and the rate of colliding with the obstacle was computed as 1.70%. In conclusion, a safe path was planned and an optimum actor ANN was obtained for the 3R planar robot.
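The actor–critic setup the abstract describes can be made concrete with a short sketch. Since the paper cites TensorFlow's Dense layer [14], tf.keras is a natural choice; everything else below (the state and action dimensions, activations, layer ordering, and the reading of the "4-layer, 1024- and 2048-unit" description) is an illustrative assumption, not the authors' NET-M2U-4L implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers

STATE_DIM = 7    # assumed: 3 joint angles + end-effector (x, y) + goal (x, y)
ACTION_DIM = 3   # assumed: one command per joint of the 3R planar robot

def build_actor(state_dim=STATE_DIM, action_dim=ACTION_DIM):
    """Hypothetical 4-layer actor ANN with 1024- and 2048-unit hidden layers,
    loosely following the 4-layer / 1024- and 2048-unit structures the
    abstract reports as best. tanh bounds the joint commands to [-1, 1]."""
    state_in = layers.Input(shape=(state_dim,))
    x = layers.Dense(1024, activation="relu")(state_in)
    x = layers.Dense(2048, activation="relu")(x)
    x = layers.Dense(2048, activation="relu")(x)
    x = layers.Dense(1024, activation="relu")(x)
    action_out = layers.Dense(action_dim, activation="tanh")(x)
    return tf.keras.Model(state_in, action_out)

def build_critic(state_dim=STATE_DIM, action_dim=ACTION_DIM):
    """Hypothetical DDPG critic: maps a (state, action) pair to a scalar
    Q-value; widths here are illustrative, not from the paper."""
    state_in = layers.Input(shape=(state_dim,))
    action_in = layers.Input(shape=(action_dim,))
    x = layers.Concatenate()([state_in, action_in])
    x = layers.Dense(1024, activation="relu")(x)
    x = layers.Dense(1024, activation="relu")(x)
    q_value = layers.Dense(1)(x)
    return tf.keras.Model([state_in, action_in], q_value)

actor = build_actor()
critic = build_critic()
actor.summary()
```

In DDPG [13], the actor maps a state directly to a bounded continuous action, while the critic scores (state, action) pairs; the gradient of the critic's Q-value with respect to the action drives the actor update, which is why the critic takes the action as a second input.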

References

  • [1] Y. Zheng, Q. Luo, H. Wang, C. Wang, and X. Chen, “Path planning of mobile robot based on adaptive ant colony algorithm,” J. Intell. Fuzzy Syst., vol. 39, no. 4, pp. 5329–5338, 2020.
  • [2] B. Song, Z. Wang, and L. Zou, “An improved PSO algorithm for smooth path planning of mobile robots using continuous high-degree Bezier curve,” Appl. Soft Comput., vol. 100, 2021.
  • [3] M. Zhao, H. Lu, S. Yang, Y. Guo, and F. Guo, “A fast robot path planning algorithm based on bidirectional associative learning,” Comput. Ind. Eng., vol. 155, p. 107173, 2021.
  • [4] L. Larsen and J. Kim, “Path planning of cooperating industrial robots using evolutionary algorithms,” Robot. Comput. Integr. Manuf., vol. 67, 2021.
  • [5] B. Fu et al., “An improved A* algorithm for the industrial robot path planning with high success rate and short length,” Rob. Auton. Syst., vol. 106, pp. 26–37, 2018.
  • [6] A. Krishna Lakshmanan et al., “Complete coverage path planning using reinforcement learning for Tetromino based cleaning and maintenance robot,” Autom. Constr., vol. 112, 2020.
  • [7] H. Xiong, T. Ma, L. Zhang, and X. Diao, “Comparison of end-to-end and hybrid deep reinforcement learning strategies for controlling cable-driven parallel robots,” Neurocomputing, vol. 377, pp. 73–84, 2020.
  • [8] M. Matulis and C. Harvey, “A robot arm digital twin utilising reinforcement learning,” Comput. Graph., vol. 95, pp. 106–114, 2021.
  • [9] J. Wang, S. Elfwing, and E. Uchibe, “Modular deep reinforcement learning from reward and punishment for robot navigation,” Neural Networks, vol. 135, pp. 115–126, 2021.
  • [10] Z. Bing, C. Lemke, L. Cheng, K. Huang, and A. Knoll, “Energy-efficient and damage-recovery slithering gait design for a snake-like robot based on reinforcement learning and inverse reinforcement learning,” Neural Networks, vol. 129, pp. 323–333, 2020.
  • [11] Y. Tsurumine, Y. Cui, E. Uchibe, and T. Matsubara, “Deep reinforcement learning with smooth policy update: Application to robotic cloth manipulation,” Rob. Auton. Syst., vol. 112, pp. 72–83, 2019.
  • [12] I. Carlucho, M. De Paula, and G. G. Acosta, “An adaptive deep reinforcement learning approach for MIMO PID control of mobile robots,” ISA Trans., vol. 102, pp. 280–294, 2020.
  • [13] T. P. Lillicrap et al., “Continuous control with deep reinforcement learning,” in 4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings, 2016.
  • [14] “Dense Layer.” [Online]. Available: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense. [Accessed: 27-Mar-2021].
There are 14 references in total.

Details

Primary Language: English
Subjects: Artificial Intelligence
Section: Research Article
Authors

Mustafa Can Bingol 0000-0001-5448-8281

Early View Date: February 23, 2022
Publication Date: February 28, 2022
Submission Date: April 8, 2021
Acceptance Date: December 28, 2021
Published in Issue: Year 2022, Volume: 26, Issue: 1

Cite

APA Bingol, M. C. (2022). Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. Sakarya University Journal of Science, 26(1), 128-135. https://doi.org/10.16984/saufenbilder.911942
AMA Bingol MC. Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. SAUJS. February 2022;26(1):128-135. doi:10.16984/saufenbilder.911942
Chicago Bingol, Mustafa Can. “Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot”. Sakarya University Journal of Science 26, no. 1 (February 2022): 128-35. https://doi.org/10.16984/saufenbilder.911942.
EndNote Bingol MC (01 February 2022) Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. Sakarya University Journal of Science 26 1 128–135.
IEEE M. C. Bingol, “Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot”, SAUJS, vol. 26, no. 1, pp. 128–135, 2022, doi: 10.16984/saufenbilder.911942.
ISNAD Bingol, Mustafa Can. “Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot”. Sakarya University Journal of Science 26/1 (February 2022), 128-135. https://doi.org/10.16984/saufenbilder.911942.
JAMA Bingol MC. Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. SAUJS. 2022;26:128–135.
MLA Bingol, Mustafa Can. “Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot”. Sakarya University Journal of Science, vol. 26, no. 1, 2022, pp. 128-35, doi:10.16984/saufenbilder.911942.
Vancouver Bingol MC. Reinforcement Learning-Based Safe Path Planning for a 3R Planar Robot. SAUJS. 2022;26(1):128-35.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.