Research Article

Integrating Proximal Policy Optimization with Physically Realistic Simulation for Robust Autonomous Underwater Vehicle Control

Year 2026, Volume: 13, Issue: 1, 165-199, 31.03.2026
https://doi.org/10.54287/gujsa.1813751
https://izlik.org/JA58JP52YD

Abstract

This study presents the design and implementation of a reinforcement learning (RL)-based framework for controlling an autonomous underwater vehicle (AUV) directly within Unreal Engine (UE). A high-fidelity aquatic environment was created using UE’s native Water System to simulate hydrodynamic forces and buoyancy. Unlike studies that assume continuous control, this research addresses the challenge of stabilizing an AUV subject to severe discrete 'bang-bang' hardware constraints. A parallelized Proximal Policy Optimization (PPO) algorithm was employed to synthesize adaptive control policies. Comparative analysis against tuned Proportional-Integral-Derivative (PID) baselines demonstrates that the RL agent outperforms classical methods in three control tasks: (1) in longitudinal navigation, the agent learned an emergent "pulsing" strategy, mimicking Pulse-Width Modulation (PWM), to overcome these discrete actuation constraints, reducing steady-state error compared to the Proportional-Derivative (PD) baseline; (2) in vertical depth control, the agent autonomously learned gravity compensation, settling faster than integral-based controllers while avoiding buoyancy-induced stalling; and (3) in heading control, the agent demonstrated superior dynamic handling, completing stabilization maneuvers faster than the baseline. The key architectural innovation lies in the direct integration of UE’s Learning Agents plugin, which eliminates the need for external middleware. This native integration enables real-time synchronization between simulation physics and learning processes, establishing a high-fidelity platform for developing intelligent underwater control systems.
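
To illustrate why pulsing an on/off thruster can stand in for intermediate thrust levels, the following minimal Python sketch converts a desired thrust fraction into a bang-bang command sequence whose time average approximates that fraction. It is not the learned policy reported in the study; the function name `pulse_train`, the accumulator scheme, and the parameters `duty` and `steps` are assumptions introduced here purely for illustration.

```python
def pulse_train(duty: float, steps: int) -> list[int]:
    """Return an on/off (1/0) command sequence whose mean approximates `duty`.

    Hypothetical illustration: a first-order accumulator fires the thruster
    whenever the accumulated demand reaches one full pulse.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be in [0, 1]")
    acc = 0.0
    commands = []
    for _ in range(steps):
        acc += duty                  # accumulate the requested thrust fraction
        if acc >= 1.0 - 1e-12:       # small tolerance for floating-point drift
            commands.append(1)       # thruster fully on for this step
            acc -= 1.0
        else:
            commands.append(0)       # thruster off for this step
    return commands


if __name__ == "__main__":
    seq = pulse_train(0.25, 8)
    print(seq, sum(seq) / len(seq))
```

A duty of 0.25 yields one "on" step in every four, so a vehicle with sufficient inertia would experience roughly a quarter of full thrust on average. This is the same averaging effect that PWM relies on.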

Details

Primary Language English
Subjects Autonomous Agents and Multiagent Systems
Journal Section Research Article
Authors

Mahmut Mol 0000-0001-9983-861X

Ahmet Karaarslan 0000-0001-6475-4539

Submission Date October 30, 2025
Acceptance Date March 24, 2026
Publication Date March 31, 2026
DOI https://doi.org/10.54287/gujsa.1813751
IZ https://izlik.org/JA58JP52YD
Published in Issue Year 2026 Volume: 13 Issue: 1

Cite

APA Mol, M., & Karaarslan, A. (2026). Integrating Proximal Policy Optimization with Physically Realistic Simulation for Robust Autonomous Underwater Vehicle Control. Gazi University Journal of Science Part A: Engineering and Innovation, 13(1), 165-199. https://doi.org/10.54287/gujsa.1813751