Research Article

Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method

Year 2023, Volume: 27 Issue: 6, 1311 - 1321, 18.12.2023
https://doi.org/10.16984/saufenbilder.1286391

Abstract

Inverted pendulums are among the most popular benchmark systems for control algorithms. Numerous methods have been proposed for controlling this system, the majority of which rely on the availability of a mathematical model. However, deriving a mathematical model from physical parameters or through system identification techniques requires considerable manual effort. Moreover, the resulting controllers may perform poorly if the system parameters change. To mitigate these problems, recent studies have applied Reinforcement Learning (RL) based approaches to the control of inverted pendulum systems. Unfortunately, these methods suffer from slow convergence and can become trapped in local minima. They may also require hyperparameter tuning, which complicates the design process significantly. To alleviate these problems, the present study proposes an LQR-based RL method for adaptive balancing control of an inverted pendulum. As numerical experiments show, the algorithm stabilizes the system very quickly without requiring a mathematical model or extensive hyperparameter tuning. In addition, it can adapt to parametric changes online.
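The abstract leaves the algorithmic details to the paper itself, but the model-free LQR idea it builds on, going back to Bradtke's Q-learning formulation [18], can be sketched briefly: for a linear system with quadratic cost, the Q-function of a linear policy u = -Kx is itself quadratic, so it can be estimated from input-state data by least squares and improved greedily, without ever identifying the system matrices A and B. The sketch below illustrates this on a discretized, linearized cart-pole. The dynamics matrices, discount factor, and noise levels are illustrative assumptions, not values from the paper, and the loop is plain batch policy iteration rather than the authors' exact adaptive scheme.

```python
# Minimal sketch of Q-learning for LQR in the spirit of Bradtke [18].
# All numbers below (cart-pole matrices, discount, noise) are assumptions
# for illustration only; they are not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 1  # state: [cart pos, cart vel, pole angle, pole ang. vel]; input: force

# Discretized linearized cart-pole dynamics. The learner never reads A or B;
# they are used only to generate transition data.
A = np.array([[1.0, 0.02, 0.0,  0.0],
              [0.0, 1.0,  0.01, 0.0],
              [0.0, 0.0,  1.0,  0.02],
              [0.0, 0.0,  0.21, 1.0]])
B = np.array([[0.0], [0.02], [0.0], [0.03]])
Qc, Rc = np.eye(n), np.eye(m)   # quadratic stage cost x'Qc x + u'Rc u
gamma = 0.8                     # discount keeps the evaluation well-posed
                                # before a stabilizing gain has been found

def feat(x, u):
    """Quadratic features of z = [x; u]: all monomials z_i * z_j, i <= j."""
    z = np.concatenate([x, u])
    return np.outer(z, z)[np.triu_indices(n + m)]

K = np.zeros((m, n))            # initial (not necessarily stabilizing) gain
for it in range(15):            # policy iteration
    Phi, y = [], []
    for _ in range(800):        # one batch of exploratory transitions
        x = rng.normal(size=n)                 # fresh state (off-policy data)
        u = -K @ x + rng.normal(size=m)        # probing noise for excitation
        cost = x @ Qc @ x + u @ Rc @ u
        x_next = A @ x + B @ u
        u_next = -K @ x_next                   # successor action under policy K
        # TD identity: theta'(feat(x,u) - gamma*feat(x',u')) = cost
        Phi.append(feat(x, u) - gamma * feat(x_next, u_next))
        y.append(cost)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    # Rebuild the symmetric matrix H of the quadratic Q-function z'Hz
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = theta
    H = (H + H.T) / 2           # off-diagonal entries were double-counted
    # Greedy improvement: u = -H_uu^{-1} H_ux x
    K = np.linalg.solve(H[n:, n:], H[n:, :n])

print("learned gain K =", K)
print("closed-loop spectral radius:",
      max(abs(np.linalg.eigvals(A - B @ K))))
```

In an adaptive setting such as the one the abstract describes, the same least-squares estimate would be updated recursively from closed-loop data so that the gain can track parameter changes online. The final spectral-radius check matters because, with a discount factor, the learned gain is optimal for the discounted problem and its stability of the undiscounted plant should be verified.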

References

  • [1] O. Boubaker, “The Inverted Pendulum Benchmark in Nonlinear Control Theory: A Survey,” International Journal of Advanced Robotic Systems, vol. 10, no. 5, p. 233, 2013.
• [2] A. Jose, C. Augustine, S. M. Malola, K. Chacko, “Performance Study of PID Controller and LQR Technique for Inverted Pendulum,” World Journal of Engineering and Technology, vol. 3, no. 2, 2015.
  • [3] L. B. Prasad, B. Tyagi, H. O. Gupta, “Optimal Control of Nonlinear Inverted Pendulum System Using PID Controller and LQR: Performance Analysis Without and With Disturbance Input,” International Journal of Automation and Computing, vol. 11, no. 6, pp. 661–670, 2014.
• [4] M. K. Habib, S. A. Ayankoso, “Hybrid Control of a Double Linear Inverted Pendulum using LQR-Fuzzy and LQR-PID Controllers,” in 2022 IEEE International Conference on Mechatronics and Automation (ICMA), August 2022, pp. 1784–1789.
  • [5] S. Coşkun, “Non-linear Control of Inverted Pendulum,” Çukurova University Journal of the Faculty of Engineering and Architecture, vol. 35, no. 1, 2020.
  • [6] J. Yi, N. Yubazaki, K. Hirota, “Upswing and stabilization control of inverted pendulum system based on the SIRMs dynamically connected fuzzy inference model,” Fuzzy Sets and Systems, vol. 122, no. 1, pp. 139–152, 2001.
  • [7] A. Mills, A. Wills, B. Ninness, “Nonlinear model predictive control of an inverted pendulum,” in 2009 American Control Conference, June 2009, pp. 2335–2340.
  • [8] B. Liu, J. Hong, L. Wang, “Linear inverted pendulum control based on improved ADRC,” Systems Science & Control Engineering, vol. 7, no. 3, pp. 1–12, 2019.
  • [9] A. Tiga, C. Ghorbel, N. Benhadj Braiek, “Nonlinear/Linear Switched Control of Inverted Pendulum System: Stability Analysis and Real-Time Implementation,” Mathematical Problems in Engineering, vol. 2019, p. e2391587, 2019.
  • [10] N. P. K. Reddy, D. M. S. Kumar, D. S. Rao, “Control of Nonlinear Inverted Pendulum System using PID and Fast Output Sampling Based Discrete Sliding Mode Controller,” International Journal of Engineering Research, vol. 3, no. 10, 2014.
  • [11] A. Bonarini, C. Caccia, A. Lazaric, M. Restelli, “Batch Reinforcement Learning for Controlling a Mobile Wheeled Pendulum Robot,” in Artificial Intelligence in Theory and Practice II, M. Bramer, Ed., in IFIP – The International Federation for Information Processing. Boston, MA: Springer US, 2008, pp. 151–160.
  • [12] S. Nagendra, N. Podila, R. Ugarakhod, K. George, “Comparison of reinforcement learning algorithms applied to the cart-pole problem,” in 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Sep. 2017, pp. 26–32.
• [13] T. Peng, H. Peng, F. Liu, “Guided Deep Reinforcement Learning based on RBF-ARX Pseudo LQR in Single Stage Inverted Pendulum,” in 2022 International Conference on Intelligent Systems and Computational
  • [14] D. Bates, “A Hybrid Approach for Reinforcement Learning Using Virtual Policy Gradient for Balancing an Inverted Pendulum.” arXiv, Feb. 06, 2021. Accessed: Mar. 21, 2023. [Online]. Available: http://arxiv.org/abs/2102.08362
  • [15] A. Surriani, O. Wahyunggoro, A. I. Cahyadi, “Reinforcement Learning for Cart Pole Inverted Pendulum System,” in 2021 IEEE Industrial Electronics and Applications Conference (IEACon), Nov. 2021, pp. 297–301.
  • [16] C. A. Manrique Escobar, C. M. Pappalardo, D. Guida, “A Parametric Study of a Deep Reinforcement Learning Control System Applied to the Swing-Up Problem of the Cart-Pole,” Applied Sciences, vol. 10, no. 24, Art. no. 24, 2020.
  • [17] B. Kiumarsi, K. G. Vamvoudakis, H. Modares, F. L. Lewis, “Optimal and Autonomous Control Using Reinforcement Learning: A Survey,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 6, pp. 2042–2062, 2018.
• [18] S. Bradtke, “Reinforcement Learning Applied to Linear Quadratic Regulation,” in Advances in Neural Information Processing Systems, Morgan-Kaufmann, 1992. Accessed: Mar. 08, 2023. [Online]. Available: https://proceedings.neurips.cc/paper/1992/hash/19bc916108fc6938f52cb96f7e087941-Abstract.html
  • [19] V. G. Lopez, M. Alsalti, M. A. Müller, “Efficient Off-Policy Q-Learning for Data-Based Discrete-Time LQR Problems,” IEEE Transactions on Automatic Control, pp. 1–12, 2023.
  • [20] H. Zhang, N. Li, “Data-driven policy iteration algorithm for continuous-time stochastic linear-quadratic optimal control problems.” arXiv, Sep. 28, 2022. Accessed: Mar. 08, 2023. [Online]. Available: http://arxiv.org/abs/2209.14490
  • [21] Y. Hu, A. Wierman, G. Qu, “On the Sample Complexity of Stabilizing LTI Systems on a Single Trajectory.” arXiv, Feb. 14, 2022. Accessed: Mar. 08, 2023. [Online]. Available: http://arxiv.org/abs/2202.07187
• [22] F. L. Lewis, D. Vrabie, V. L. Syrmos, Optimal Control, 3rd ed. John Wiley & Sons, 2012.
• [23] R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. Cambridge, MA: MIT Press, 2018.
  • [24] C. De Persis, P. Tesi, “Formulas for Data-Driven Control: Stabilization, Optimality, and Robustness,” IEEE Transactions on Automatic Control, vol. 65, no. 3, pp. 909–924, Mar. 2020.

Details

Primary Language English
Subjects Electrical Engineering
Journal Section Research Articles
Authors

Uğur Yıldıran (ORCID: 0000-0002-8220-8723)

Early Pub Date December 1, 2023
Publication Date December 18, 2023
Submission Date April 21, 2023
Acceptance Date September 26, 2023
Published in Issue Year 2023 Volume: 27 Issue: 6

Cite

APA Yıldıran, U. (2023). Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method. Sakarya University Journal of Science, 27(6), 1311-1321. https://doi.org/10.16984/saufenbilder.1286391
AMA Yıldıran U. Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method. SAUJS. December 2023;27(6):1311-1321. doi:10.16984/saufenbilder.1286391
Chicago Yıldıran, Uğur. “Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method”. Sakarya University Journal of Science 27, no. 6 (December 2023): 1311-21. https://doi.org/10.16984/saufenbilder.1286391.
EndNote Yıldıran U (December 1, 2023) Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method. Sakarya University Journal of Science 27 6 1311–1321.
IEEE U. Yıldıran, “Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method”, SAUJS, vol. 27, no. 6, pp. 1311–1321, 2023, doi: 10.16984/saufenbilder.1286391.
ISNAD Yıldıran, Uğur. “Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method”. Sakarya University Journal of Science 27/6 (December 2023), 1311-1321. https://doi.org/10.16984/saufenbilder.1286391.
JAMA Yıldıran U. Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method. SAUJS. 2023;27:1311–1321.
MLA Yıldıran, Uğur. “Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method”. Sakarya University Journal of Science, vol. 27, no. 6, 2023, pp. 1311-21, doi:10.16984/saufenbilder.1286391.
Vancouver Yıldıran U. Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method. SAUJS. 2023;27(6):1311-21.