Research Article

Mobile Robot Navigation Using Reinforcement Learning in Unknown Environments

Year 2019, Volume: 7, Issue: 3, 235 - 244, 30.07.2019
https://doi.org/10.17694/bajece.532746

Abstract

In mobile robotics, navigation is considered one of the primary tasks, and it becomes more challenging during local navigation when the environment is unknown. The robot must therefore explore using its sensory information. Reinforcement learning (RL), a biologically inspired learning paradigm, has attracted considerable attention because of its ability to learn autonomously in an unknown environment. However, the randomized exploration behavior common in RL increases computation time and cost, making it less appealing for real-world scenarios. This paper proposes an informed-biased softmax regression (iBSR) learning process that introduces a heuristic-based cost function to ensure faster convergence. Here, action selection is not treated as a random process; rather, it is based on the maximum probability computed using softmax regression. The strength of the proposed approach is tested through an experimental simulation scenario for navigation, and, for comparison and analysis purposes, the iBSR learning process is evaluated against two benchmark algorithms.
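
As a rough illustration of the softmax-based action selection described in the abstract, the sketch below shows generic Boltzmann (softmax) selection over Q-values biased by a heuristic cost term. This is a minimal sketch under assumptions made here, not the paper's actual iBSR implementation: the heuristic costs, the bias weight `beta`, and the temperature `tau` are illustrative parameters introduced for this example.

```python
import numpy as np

def softmax_action_selection(q_values, heuristic_costs, tau=1.0, beta=1.0):
    """Pick the action with the highest softmax probability.

    q_values        : estimated action values for the current state
    heuristic_costs : heuristic cost per action (e.g. distance to goal);
                      lower cost makes the action more attractive
    tau             : softmax temperature (higher -> flatter distribution)
    beta            : weight of the heuristic bias (assumption, not from the paper)
    """
    q = np.asarray(q_values, dtype=float)
    h = np.asarray(heuristic_costs, dtype=float)

    # Bias the action preferences: high Q-values and low heuristic costs are preferred.
    preferences = (q - beta * h) / tau

    # Numerically stable softmax over the biased preferences.
    preferences -= preferences.max()
    probs = np.exp(preferences)
    probs /= probs.sum()

    # Per the abstract, action selection is not random:
    # take the action with maximum probability instead of sampling.
    return int(np.argmax(probs)), probs


# Example: four candidate actions in a grid-world navigation step (illustrative values).
q_values = [0.2, 0.5, 0.1, 0.4]          # learned action values
heuristic_costs = [3.0, 1.0, 4.0, 2.0]   # e.g. distance to goal after each action
action, probs = softmax_action_selection(q_values, heuristic_costs, tau=0.5, beta=0.3)
print("chosen action:", action, "probabilities:", probs)
```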

References

  • [1] S.-H. Kim, C.-W. Roh, S.-C. Kang and M.-Y. Park, "Outdoor navigation of a mobile robot using differential GPS and curb detection," in Proceedings of IEEE international conference on Robotics and Automation, 2007.
  • [2] L. Moreno, J. M. Armingol, S. Garrido, A. D. L. Escalera and M. A. Salichs, "A genetic algorithm for mobile robot localization using ultrasonic sensors," Journal of Intelligent and Robotic Systems, vol. 34, no. 2, pp. 135-154, 2002.
  • [3] A. Sinha and P. Papadakis, "Mind the gap: Detection and traversability analysis of terrain gaps using LIDAR for safe robot navigation," Robotica, vol. 31, no. 7, pp. 1085-1101, 2013.
  • [4] S. J. Russell and P. Norvig, Artificial intelligence: a modern approach, Pearson Education Limited, 2016.
  • [5] R. E. Korf, Artificial intelligence search algorithms, Chapman & Hall/CRC, 2010.
  • [6] L. E. Kavraki, M. N. Kolountzakis and J.-C. Latombe, "Analysis of probabilistic roadmaps for path planning," in Proceedings international conference on robotics and automation, 1996.
  • [7] N. A. Melchior and R. Simmons, "Particle RRT for path planning with uncertainty," in Proceedings of IEEE international conference on robotics and automation, 2007.
  • [8] S. X. Yang and C. Luo, "A neural network approach to complete coverage path planning," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 34, no. 1, pp. 718-724, 2004.
  • [9] M. Z. Malik, A. Eizad and M. U. Khan, Path planning algorithms for mobile robots: a comprehensive comparative analysis, LAP LAMBERT Academic Publishing, 2014.
  • [10] M. S. Alam, M. U. Rafique and M. U. Khan, "Mobile robot path planning in static environments using particle swarm optimization," International Journal of Computer Science and Electronics Engineering, vol. 3, no. 3, 2015.
  • [11] R. S. Sutton and A. G. Barto, Reinforcement Learning: An introduction, MIT press, 2018.
  • [12] C. J. C. H. Watkins, "Learning from delayed rewards," King's College, Cambridge, Ph.D. thesis, 1989.
  • [13] W. D. Smart and L. P. Kaelbling, "Practical reinforcement learning in continuous spaces," in Proceedings of the seventeenth international conference on machine learning, 2000.
  • [14] W. D. Smart and L. P. Kaelbling, "Effective reinforcement learning for mobile robots," in Proceedings of the international conference on robotics and automation, 2002.
  • [15] D. Aranibar and P. Alsina, "Reinforcement learning-based path planning for autonomous robots," ENRI: Encontro Nacional de Robótica Inteligente, 2004.
  • [16] H. R. Beom and H. S. Cho, "A sensor-based navigation for a mobile robot using fuzzy logic and reinforcement learning," IEEE Transactions on Systems, Man, and Cybernetics, vol. 25, pp. 464-477, 1995.
  • [17] N. Yung and C. Ye, "Self-learning fuzzy navigation of mobile vehicle," in Proceedings of the international conference on signal processing, 1996.
  • [18] G. Yang, E. Chen and C. An, "Mobile robot navigation using neural Q-learning," in IEEE proceedings of the third international conference on machine learning and cybernetics, 2004.
  • [19] K. Macek, I. Petrovic and N. Peric, "A reinforcement learning approach to obstacle avoidance of mobile robots," in 7th international workshop on advanced motion control, 2002.
  • [20] A. Martínez-Tenor, J. A. Fernández-Madrigal, A. Cruz-Martín and J. González-Jiménez, "Towards a common implementation of reinforcement learning for multiple robotic tasks," Expert Systems with Applications, vol. 100, pp. 246-259, 2018.
  • [21] A. Hodges, Alan Turing: The Enigma, Princeton University Press, 2012.
  • [22] R. Bellman, "A Markovian decision process," Journal of Mathematics and Mechanics, pp. 679-684, 1957.
  • [23] A. G. Barto, R. S. Sutton and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems," IEEE Transactions on Systems, Man, and Cybernetics, vol. 13, pp. 835-846, 1983.
  • [24] A. G. Barto, R. S. Sutton and P. S. Brouwer, "Associative search network: A reinforcement learning associative memory," Biological Cybernetics, pp. 201-211, 1981.
  • [25] G. A. Rummery and M. Niranjan, On-line Q-learning using connectionist systems, vol. 37, Cambridge, England: University of Cambridge, Department of Engineering, 1994.
  • [26] H. van Seijen, A. R. Mahmood, P. M. Pilarski, M. C. Machado and R. S. Sutton, "True online temporal-difference learning," The Journal of Machine Learning Research, vol. 17, no. 1, pp. 5057-5096, 2016.
  • [27] R. Abiyev, D. Ibrahim and B. Erin, "Navigation of mobile robots in the presence of obstacles," Advances in Engineering Software, vol. 41, pp. 1179-1186, 2010.
  • [28] A. I. Panov, K. S. Yakovlev and R. Suvorov, "Grid path planning with deep reinforcement learning: preliminary results," Procedia Computer Science, vol. 123, pp. 347-353, 2018.
  • [29] J. del R. Millán, "Reinforcement learning of goal-directed obstacle-avoiding reaction strategies in an autonomous mobile robot," Robotics and Autonomous Systems, vol. 15, pp. 275-299, 1995.
  • [30] R. Bellman, "The theory of dynamic programming," RAND Corporation, Santa Monica, CA, 1954.

Details

Primary Language: English
Subjects: Artificial Intelligence, Electrical Engineering
Section: Research Article
Authors

Muhammad Umer Khan (ORCID: 0000-0002-9195-3477)

Publication Date: 30 July 2019
Published in Issue: Year 2019, Volume: 7, Issue: 3

Cite

APA Khan, M. U. (2019). Mobile Robot Navigation Using Reinforcement Learning in Unknown Environments. Balkan Journal of Electrical and Computer Engineering, 7(3), 235-244. https://doi.org/10.17694/bajece.532746
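
For importing into a reference manager, a BibTeX rendering of the same citation might look like the following; the entry key and field layout are assumptions made here, while the field values are taken directly from the APA citation above.

```bibtex
@article{khan2019mobile,
  author  = {Khan, Muhammad Umer},
  title   = {Mobile Robot Navigation Using Reinforcement Learning in Unknown Environments},
  journal = {Balkan Journal of Electrical and Computer Engineering},
  year    = {2019},
  volume  = {7},
  number  = {3},
  pages   = {235--244},
  doi     = {10.17694/bajece.532746}
}
```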

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit and adapt the work provided the original work and source is appropriately cited.