Research Article

Indoor Visual Navigation Based on Deep Reinforcement Learning with Neural Ordinary Differential Equations

Year 2025, Volume: 27 Issue: 80, 224 - 233, 23.05.2025
https://doi.org/10.21205/deufmd.2025278008

Abstract

Recently, Deep Reinforcement Learning (DRL) has gained attention as a promising approach to the challenging problem of mobile robot navigation. This study proposes a reinforcement learning method built on Neural Ordinary Differential Equations (NODEs), which offer efficient training and a favorable memory footprint, and applies it to a model-free point-to-point navigation task. Through the use of NODEs, we achieved improvements in navigation performance as well as in resource utilization and adaptation. Extensive simulation studies using real-world indoor scenes were conducted to validate our approach. The results demonstrate the effectiveness of the proposed NODE-based methodology in improving navigation performance compared to conventional ResNet and CNN architectures. Furthermore, curriculum learning strategies were integrated into the study so that the agent learns through progressively more complex navigation scenarios. The results indicate that this approach enables faster and more robust reinforcement learning.
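
The paper's architecture and training code are not reproduced on this page, so the sketch below is only a hypothetical illustration of the core idea in the abstract: swapping the discrete ResNet/CNN feature extractor of a DRL agent for a continuous-depth Neural ODE block integrated by an off-the-shelf solver. It assumes PyTorch with the torchdiffeq package; the observation shape, layer sizes, action count, and module names are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical sketch (not the authors' code): a Neural ODE feature
# extractor shared by actor-critic heads, e.g. for PPO-style training.
# Assumes: pip install torch torchdiffeq
import torch
import torch.nn as nn
from torchdiffeq import odeint


class ODEFunc(nn.Module):
    """Dynamics f(t, h) defining dh/dt for the continuous-depth block."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Tanh(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, t, h):
        return self.net(h)


class NODEEncoder(nn.Module):
    """Downsamples the RGB observation, then integrates one ODE block in
    place of stacked residual layers (a ResNet is the discrete analogue)."""

    def __init__(self, feature_dim: int = 256):
        super().__init__()
        self.stem = nn.Sequential(            # assumed 3x84x84 RGB input
            nn.Conv2d(3, 64, 8, stride=4), nn.ReLU(),
            nn.Conv2d(64, 64, 4, stride=2), nn.ReLU(),
        )
        self.odefunc = ODEFunc(64)
        self.register_buffer("t", torch.linspace(0.0, 1.0, 2))  # t=0 -> t=1
        self.head = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(feature_dim), nn.ReLU()
        )

    def forward(self, obs):
        h0 = self.stem(obs)
        # odeint returns the state at every requested time; keep the last.
        hT = odeint(self.odefunc, h0, self.t)[-1]
        return self.head(hT)


encoder = NODEEncoder()
policy_head = nn.Linear(256, 4)  # 4 discrete actions: an assumption
value_head = nn.Linear(256, 1)

obs = torch.rand(8, 3, 84, 84)   # dummy batch of camera observations
features = encoder(obs)
action_logits, state_value = policy_head(features), value_head(features)
```

The curriculum learning the abstract mentions would sit outside this network, in the episode sampler: for example, start training with nearby goals and progressively increase the start-to-goal distance as the agent's success rate improves.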


Details

Primary Language English
Subjects Control Theory and Applications, Autonomous Vehicle Systems
Journal Section Research Article
Authors

Berk Ağın 0000-0002-5431-5337

Güleser Kalaycı Demir 0000-0003-3808-5305

Early Pub Date May 12, 2025
Publication Date May 23, 2025
Submission Date July 2, 2024
Acceptance Date August 17, 2024
Published in Issue Year 2025 Volume: 27 Issue: 80

Cite

APA Ağın, B., & Kalaycı Demir, G. (2025). Indoor Visual Navigation Based on Deep Reinforcement Learning with Neural Ordinary Differential Equations. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi, 27(80), 224-233. https://doi.org/10.21205/deufmd.2025278008
AMA Ağın B, Kalaycı Demir G. Indoor Visual Navigation Based on Deep Reinforcement Learning with Neural Ordinary Differential Equations. DEUFMD. May 2025;27(80):224-233. doi:10.21205/deufmd.2025278008
Chicago Ağın, Berk, and Güleser Kalaycı Demir. “Indoor Visual Navigation Based on Deep Reinforcement Learning With Neural Ordinary Differential Equations”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi 27, no. 80 (May 2025): 224-33. https://doi.org/10.21205/deufmd.2025278008.
EndNote Ağın B, Kalaycı Demir G (May 1, 2025) Indoor Visual Navigation Based on Deep Reinforcement Learning with Neural Ordinary Differential Equations. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 27 80 224–233.
IEEE B. Ağın and G. Kalaycı Demir, “Indoor Visual Navigation Based on Deep Reinforcement Learning with Neural Ordinary Differential Equations”, DEUFMD, vol. 27, no. 80, pp. 224–233, 2025, doi: 10.21205/deufmd.2025278008.
ISNAD Ağın, Berk - Kalaycı Demir, Güleser. “Indoor Visual Navigation Based on Deep Reinforcement Learning With Neural Ordinary Differential Equations”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 27/80 (May 2025), 224-233. https://doi.org/10.21205/deufmd.2025278008.
JAMA Ağın B, Kalaycı Demir G. Indoor Visual Navigation Based on Deep Reinforcement Learning with Neural Ordinary Differential Equations. DEUFMD. 2025;27:224–233.
MLA Ağın, Berk and Güleser Kalaycı Demir. “Indoor Visual Navigation Based on Deep Reinforcement Learning With Neural Ordinary Differential Equations”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi, vol. 27, no. 80, 2025, pp. 224-33, doi:10.21205/deufmd.2025278008.
Vancouver Ağın B, Kalaycı Demir G. Indoor Visual Navigation Based on Deep Reinforcement Learning with Neural Ordinary Differential Equations. DEUFMD. 2025;27(80):224-33.

Dokuz Eylül Üniversitesi, Mühendislik Fakültesi Dekanlığı Tınaztepe Yerleşkesi, Adatepe Mah. Doğuş Cad. No: 207-I / 35390 Buca-İZMİR.