Real-Time Application of DC Machine Speed Control with Q-Learning Based PID Controller
Year 2023, 669 - 681, 30.04.2023
Bekir Murat Aydın, Burhan Baraklı
Abstract
In this study, the performance of a Q-learning based adaptive PID controller is examined on a real-time system. A DC machine speed control system was selected as the real-time system. The system's state and the reward signal are calculated from the error signal of the DC machine speed control system. Using the state and reward signals, the algorithm adjusts the PID parameters to search for the optimal values. One Q-table is defined for each PID parameter. Controller performance was examined with a simulation study and a real-time application. The controller designed with reinforcement learning was found to perform as successfully as the classical PID structure.
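The tuning loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the error thresholds, the three-way action set (decrease, hold, increase), the step sizes, the reward definition, and the learning constants are all assumed values chosen for the example, and every function name is hypothetical.

```python
import random

# Hypothetical discretisation and step sizes; none of these values come from the paper.
ERROR_BINS = [0.01, 0.05, 0.2, 1.0]   # thresholds on |error|, giving 5 states
ACTIONS = (-1, 0, +1)                 # decrease, hold, or increase a gain
STEP = {"kp": 0.05, "ki": 0.02, "kd": 0.005}
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1


def error_to_state(error):
    """Map the absolute speed error to a discrete state index."""
    magnitude = abs(error)
    for index, threshold in enumerate(ERROR_BINS):
        if magnitude < threshold:
            return index
    return len(ERROR_BINS)


def error_to_reward(error):
    """Assumed reward: a smaller absolute error gives a larger reward."""
    return -abs(error)


def make_q_tables():
    """One Q-table per PID parameter, indexed as table[state][action_index]."""
    n_states = len(ERROR_BINS) + 1
    return {gain: [[0.0] * len(ACTIONS) for _ in range(n_states)] for gain in STEP}


def choose_action(table, state, rng):
    """Epsilon-greedy choice over the three gain adjustments."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    row = table[state]
    return row.index(max(row))


def q_update(table, state, action, reward, next_state):
    """Standard tabular Q-learning update."""
    target = reward + GAMMA * max(table[next_state])
    table[state][action] += ALPHA * (target - table[state][action])


def tune_step(q_tables, gains, state, rng):
    """Pick one adjustment per gain, apply it, and return the chosen action indices."""
    chosen = {}
    for name in gains:
        action = choose_action(q_tables[name], state, rng)
        gains[name] = max(0.0, gains[name] + ACTIONS[action] * STEP[name])
        chosen[name] = action
    return chosen


# One illustrative iteration with made-up error values.
rng = random.Random(0)
q_tables = make_q_tables()
gains = {"kp": 1.0, "ki": 0.5, "kd": 0.0}
state = error_to_state(0.3)
actions = tune_step(q_tables, gains, state, rng)
next_error = 0.1  # in practice: measured speed error after running with the new gains
next_state = error_to_state(next_error)
for name, action in actions.items():
    q_update(q_tables[name], state, action, error_to_reward(next_error), next_state)
```

In a real-time setting the `next_error` value would come from the measured motor speed after the adjusted gains have been applied for one control interval. Keeping a separate table per gain keeps each table small, at the cost of treating the three gains as if they could be tuned independently.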
DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör ile Gerçek-Zamanlı Uygulaması
Abstract
This study examines the performance of a Q-learning based adaptive PID controller on a real-time system. A DC machine speed control system was chosen as the real-time system. The system's state information and the reward signal for the Q-learning method are calculated from the error signal of the DC machine system. Using the state information and the reward signal, the PID coefficients are increased or decreased until the optimal coefficients are reached. One Q-table is defined for each PID coefficient. Controller performance was examined with a simulation study and a real-time application. The controller designed with reinforcement learning was found to be as successful as the classical PID structure.