Research Article

Real-Time Application of DC Machine Speed Control with Q-Learning Based PID Controller

Year 2023, Volume 11, Issue 2, 669-681, 30.04.2023
https://doi.org/10.29130/dubited.1111267

Abstract

In this study, the performance of a Q-learning based adaptive PID controller is examined on a real-time system. A DC machine speed control system was selected as the real-time system. The system's state and the reward signal are computed from the error signal of the DC machine speed control system. Using the state and reward signals, the algorithm adjusts the PID parameters to find the optimal values. One Q-table is defined for each PID parameter. The controller's performance was evaluated in a simulation study and in a real-time application. The results show that the controller designed with reinforcement learning performs as successfully as the classical PID structure.
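The tuning scheme the abstract describes (one Q-table per PID gain, a state derived from the discretized error signal, and a reward computed from the error) can be sketched as follows. This is an illustrative sketch under assumed discretization, action, and reward choices; all names and constants are assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical sketch: one Q-table per PID gain. The state is a discretized
# error bin; each action nudges the corresponding gain down, leaves it, or
# nudges it up. Reward is the reduction in absolute error (an assumption).

N_STATES = 11                              # discretized error bins
ACTIONS = np.array([-0.01, 0.0, 0.01])     # decrease / keep / increase a gain
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1      # illustrative hyperparameters

rng = np.random.default_rng(0)
q_tables = {g: np.zeros((N_STATES, len(ACTIONS))) for g in ("Kp", "Ki", "Kd")}
gains = {"Kp": 1.0, "Ki": 0.1, "Kd": 0.01}

def discretize(error, max_err=1.0):
    """Map a continuous error into one of N_STATES bins."""
    e = np.clip(error, -max_err, max_err)
    return int((e + max_err) / (2 * max_err) * (N_STATES - 1))

def step(error, next_error):
    """One Q-learning update per gain, rewarding error reduction."""
    s, s_next = discretize(error), discretize(next_error)
    reward = abs(error) - abs(next_error)          # positive if error shrank
    for g, q in q_tables.items():
        # epsilon-greedy action selection over this gain's Q-table
        a = (rng.integers(len(ACTIONS)) if rng.random() < EPSILON
             else int(np.argmax(q[s])))
        # standard Q-learning temporal-difference update
        q[s, a] += ALPHA * (reward + GAMMA * q[s_next].max() - q[s, a])
        gains[g] = max(0.0, gains[g] + ACTIONS[a])  # apply the nudge
```

In a closed loop, `step` would be called once per control period with the current and previous speed errors, and the updated gains fed into the PID law.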

References

  • [1] R. S. Sutton and A. G. Barto, "An introduction to reinforcement learning," Decis. Theory Model. Appl. Artif. Intell. Concepts Solut., pp. 63–80, 2011.
  • [2] M. L. Minsky, "Theory of Neural-Analog Reinforcement Systems and Its Application to the Brain-Model Problem," Ph.D. thesis, Princeton University, Princeton, 1954.
  • [3] D. P. Bertsekas, Dynamic Programming and Stochastic Control. New York-London: Academic Press, 1976.
  • [4] R. Bellman, "The Theory of Dynamic Programming," Bull. Am. Math. Soc., vol. 60, no. 6, pp. 503–515, 1954.
  • [5] R. Bellman, "Dynamic programming and stochastic control processes," Inf. Control, vol. 1, no. 3, pp. 228–239, Sep. 1958.
  • [6] C. J. C. H. Watkins and P. Dayan, "Q-learning," Mach. Learn., vol. 8, no. 3, pp. 279–292, May 1992.
  • [7] V. Mnih et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, 2015.
  • [8] Q. Shi, H. K. Lam, B. Xiao, and S. H. Tsai, "Adaptive PID controller based on Q-learning algorithm," CAAI Trans. Intell. Technol., vol. 3, no. 4, pp. 235–244, 2018.
  • [9] F. L. Lewis and D. Vrabie, "Adaptive dynamic programming for feedback control," Proc. 2009 7th Asian Control Conf. ASCC 2009, pp. 1402–1409, 2009.
  • [10] B. P. Amiruddin and R. E. A. Kadir, "Ball and beam control using adaptive PID based on Q-learning," Int. Conf. Electr. Eng. Comput. Sci. Informatics, pp. 203–208, Oct. 2020.
  • [11] M. Ali, A. Mujeeb, H. Ullah, and S. Zeb, "Reactive Power Optimization Using Feed Forward Neural Deep Reinforcement Learning Method (Deep Reinforcement Learning DQN algorithm)," 2020 Asia Energy Electr. Eng. Symp. AEEES 2020, pp. 497–501, May 2020.
  • [12] T. Tan, F. Bao, Y. Deng, A. Jin, Q. Dai, and J. Wang, "Cooperative deep reinforcement learning for large-scale traffic grid signal control," IEEE Trans. Cybern., vol. 50, no. 6, pp. 2687–2700, Jun. 2020.
  • [13] Z. Guan and T. Yamamoto, "Design of a reinforcement learning PID controller," IEEJ Trans. Electr. Electron. Eng., 2021.
  • [14] I. Carlucho, M. De Paula, S. A. Villar, and G. G. Acosta, "Incremental Q-learning strategy for adaptive PID control of mobile robots," Expert Syst. Appl., vol. 80, pp. 183–199, Sep. 2017.
  • [15] I. Carlucho, M. De Paula, and G. G. Acosta, "An adaptive deep reinforcement learning approach for MIMO PID control of mobile robots," ISA Trans., vol. 102, pp. 280–294, Jul. 2020.
  • [16] M. Gheisarnejad and M. H. Khooban, "An Intelligent Non-Integer PID Controller-Based Deep Reinforcement Learning: Implementation and Experimental Results," IEEE Trans. Ind. Electron., vol. 68, no. 4, pp. 3609–3618, Apr. 2021.
  • [17] D. Lee, S. J. Lee, and S. C. Yim, "Reinforcement learning-based adaptive PID controller for DPS," Ocean Eng., vol. 216, Nov. 2020.
  • [18] X. S. Wang, Y. H. Cheng, and W. Sun, "A Proposal of Adaptive PID Controller Based on Reinforcement Learning," J. China Univ. Min. Technol., vol. 17, no. 1, pp. 40–44, Mar. 2007.
  • [19] M. Ağralı, M. U. Soydemir, A. Gökçen, and S. Şahin, "Deep Reinforcement Learning Based Controller Design for Model of The Vertical Take-off and Landing System," Eur. J. Sci. Technol. Spec. Issue, vol. 26, no. 26, pp. 358–363, 2021.
  • [20] A. Younesi and H. Shayeghi, "Q-Learning Based Supervisory PID Controller for Damping Frequency Oscillations in a Hybrid Mini/Micro-Grid," Iran. J. Electr. Electron. Eng., vol. 15, no. 1, pp. 126–141, Mar. 2019.
  • [21] C. Mu, K. Wang, S. Ma, Z. Chong, and Z. Ni, "Adaptive composite frequency control of power systems using reinforcement learning," CAAI Trans. Intell. Technol., May 2022.
  • [22] J. Khalid, M. A. M. Ramli, M. S. Khan, and T. Hidayat, "Efficient Load Frequency Control of Renewable Integrated Power System: A Twin Delayed DDPG-Based Deep Reinforcement Learning Approach," IEEE Access, vol. 10, pp. 51561–51574, 2022.
  • [23] J. C. Hung and J. D. Hewlett, "PID control," Control Mechatronics, vol. 9, pp. 10.1–10.8, 2016.
  • [24] P. Lu, W. Huang, and J. Xiao, "Speed tracking of Brushless DC motor based on deep reinforcement learning and PID," 2021 7th Int. Conf. Cond. Monit. Mach. Non-Stationary Oper., pp. 130–134, Jun. 2021.
  • [25] X. Y. Shang, T. Y. Ji, M. S. Li, P. Z. Wu, and Q. H. Wu, "Parameter optimization of PID controllers by reinforcement learning," 2013 5th Comput. Sci. Electron. Eng. Conf. CEEC 2013 - Conf. Proc., pp. 77–81, 2013.
  • [26] R. Mukhopadhyay, S. Bandyopadhyay, A. Sutradhar, and P. Chattopadhyay, "Performance Analysis of Deep Q Networks and Advantage Actor Critic Algorithms in Designing Reinforcement Learning-based Self-tuning PID Controllers," 2019 IEEE Bombay Sect. Signat. Conf. IBSSC 2019, pp. 1–6, 2019.
  • [27] W. Yu and A. Perrusquía, Human-Robot Interaction Control Using Reinforcement Learning. 2021.
  • [28] C. J. C. H. Watkins, "Learning from delayed rewards," Ph.D. thesis, University of Cambridge, 1989.


There are 28 citations in total.

Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles
Authors

Bekir Murat Aydın 0000-0002-5965-0687

Burhan Baraklı 0000-0002-7947-2312

Publication Date April 30, 2023
Published in Issue Year 2023

Cite

APA Aydın, B. M., & Baraklı, B. (2023). DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör ile Gerçek-Zamanlı Uygulaması. Duzce University Journal of Science and Technology, 11(2), 669-681. https://doi.org/10.29130/dubited.1111267
AMA Aydın BM, Baraklı B. DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör ile Gerçek-Zamanlı Uygulaması. DÜBİTED. April 2023;11(2):669-681. doi:10.29130/dubited.1111267
Chicago Aydın, Bekir Murat, and Burhan Baraklı. “DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör Ile Gerçek-Zamanlı Uygulaması”. Duzce University Journal of Science and Technology 11, no. 2 (April 2023): 669-81. https://doi.org/10.29130/dubited.1111267.
EndNote Aydın BM, Baraklı B (April 1, 2023) DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör ile Gerçek-Zamanlı Uygulaması. Duzce University Journal of Science and Technology 11 2 669–681.
IEEE B. M. Aydın and B. Baraklı, “DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör ile Gerçek-Zamanlı Uygulaması”, DÜBİTED, vol. 11, no. 2, pp. 669–681, 2023, doi: 10.29130/dubited.1111267.
ISNAD Aydın, Bekir Murat - Baraklı, Burhan. “DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör Ile Gerçek-Zamanlı Uygulaması”. Duzce University Journal of Science and Technology 11/2 (April 2023), 669-681. https://doi.org/10.29130/dubited.1111267.
JAMA Aydın BM, Baraklı B. DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör ile Gerçek-Zamanlı Uygulaması. DÜBİTED. 2023;11:669–681.
MLA Aydın, Bekir Murat and Burhan Baraklı. “DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör Ile Gerçek-Zamanlı Uygulaması”. Duzce University Journal of Science and Technology, vol. 11, no. 2, 2023, pp. 669-81, doi:10.29130/dubited.1111267.
Vancouver Aydın BM, Baraklı B. DA Makinesi Hız Kontrolünün Q-Öğrenme Tabanlı PID Kontrolör ile Gerçek-Zamanlı Uygulaması. DÜBİTED. 2023;11(2):669-81.