<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN"
        "https://jats.nlm.nih.gov/publishing/1.4/JATS-journalpublishing1-4.dtd">
<article article-type="research-article" dtd-version="1.4">
    <front>
        <journal-meta>
            <journal-id></journal-id>
            <journal-title-group>
                <journal-title>Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi</journal-title>
            </journal-title-group>
            <issn pub-type="epub">2147-5881</issn>
            <publisher>
                <publisher-name>Pamukkale Üniversitesi</publisher-name>
            </publisher>
        </journal-meta>
        <article-meta>
            <article-id/>
            <article-categories>
                <subj-group xml:lang="en">
                    <subject>Optimization Techniques in Mechanical Engineering</subject>
                </subj-group>
                <subj-group xml:lang="tr">
                    <subject>Makine Mühendisliğinde Optimizasyon Teknikleri</subject>
                </subj-group>
            </article-categories>
            <title-group>
                <article-title>Kısmi gözlemlenebilir çok bileşenli sistemler için bakım politikalarının pekiştirmeli derin öğrenme yöntemleri ile belirlenmesi</article-title>
                <trans-title-group xml:lang="en">
                    <trans-title>Determining maintenance policies for partially observable multicomponent systems with deep reinforcement learning</trans-title>
                </trans-title-group>
            </title-group>
            
            <contrib-group content-type="authors">
                <contrib contrib-type="author">
                    <name>
                        <surname>Karabağ</surname>
                        <given-names>Oktay</given-names>
                    </name>
                    <aff>İzmir Ekonomi Üniversitesi</aff>
                </contrib>
            </contrib-group>
                        
            <pub-date pub-type="pub" iso-8601-date="20250429">
                <day>29</day>
                <month>04</month>
                <year>2025</year>
            </pub-date>
                                        <volume>31</volume>
                                        <issue>2</issue>
                                        <fpage>166</fpage>
                                        <lpage>179</lpage>
                        
                        <history>
                <date date-type="received" iso-8601-date="20240107">
                    <day>07</day>
                    <month>01</month>
                    <year>2024</year>
                </date>
                <date date-type="accepted" iso-8601-date="20240630">
                    <day>30</day>
                    <month>06</month>
                    <year>2024</year>
                </date>
                            </history>
                                        <permissions>
                    <copyright-statement>Copyright © 2013, Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi</copyright-statement>
                    <copyright-year>2013</copyright-year>
                    <copyright-holder>Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi</copyright-holder>
                </permissions>
            
            <abstract><p>Bu çalışmada, kısmi gözlemlenebilir çok bileşenli sistemler için bakım/onarım kararları incelenmiştir. Bu tip sistemler genellikle servis sağlayıcının uzakta olduğu koşullarda işletilmekte ve bileşenlerin aşınma seviyeleri genellikle sensörler yardımı ile tam olarak izlenememektedir. Rüzgâr türbinleri, bu tarz sistemlere birebir uyan bir örnek oluşturmaktadır. İlgili sistemlerde, servis sağlayıcı ne zaman bakım/onarım yapacağına, bakım kararı ile birlikte hangi parçaları bakım noktasına sevk edeceğine ve bakım noktasındaki incelemesinin ardından hangi sistem bileşenlerinin değiştirilmesi gerektiğine karar vermektedir. Çalışmamızda, bahsi geçen bu komplike karar problemi kısmi gözlemlenebilir Markov karar süreci olarak modellenmiş ve ilgili nümerik çözümler aktör kritik pekiştirmeli öğrenme yöntemi kullanılarak elde edilmiştir. Yaptığımız nümerik çalışmalar, pekiştirmeli öğrenme algoritması ile elde edilen çözümlerin pratikte ve literatürde yaygın olarak kullanılan sezgisel bakım/onarım politikalarına kıyasla daha iyi sonuçlar verdiğini göstermiştir. Bazı durumlarda, bu çözümlerin ortalamada %10-%15 düzeyinde bir iyileştirme sağladığı gözlemlenmiştir. Ayrıca, düzeltici bakım maliyeti, acil sipariş maliyeti ve fazla yedek parçayı geri döndürme maliyeti arttıkça, pekiştirmeli öğrenme algoritması ile elde edilen çözümlerin diğer sezgisel politikalara kıyasla daha fazla avantaj sağladığı da belirlenmiştir.</p></abstract>
            <trans-abstract xml:lang="en">
                <p>In this study, maintenance decisions for partially observable multicomponent systems are investigated. Such systems typically operate under conditions where the service provider is remote and the wear levels of the system components cannot be fully monitored with the assistance of sensors. Wind turbines are a good example of such systems. For these systems, besides deciding when the service provider will perform a maintenance intervention, it is also necessary to determine which spare parts will be taken along to the maintenance point and which components will be replaced after the inspection at the maintenance point. In our study, this complex decision problem is modeled as a partially observable Markov decision process, and the related numerical solutions are obtained by employing the actor-critic reinforcement learning method. Our numerical studies demonstrate that the policies obtained with the reinforcement learning algorithm outperform several heuristic maintenance policies that are frequently used in practice and well known in the relevant literature. In some cases, these solutions provided an average cost reduction of 10-15% compared to the heuristic policies. It has also been observed that the solutions obtained with the reinforcement learning algorithm become more advantageous relative to the heuristic policies as the corrective maintenance cost, the emergency order cost, and the return cost of excess spare parts increase.</p>
            </trans-abstract>
                                                            
            
            <kwd-group xml:lang="tr">
                <kwd>Kısmi gözlemlenebilir çok bileşenli sistemler</kwd>
                <kwd>Kısmi gözlemlenebilir Markov karar süreçleri</kwd>
                <kwd>Pekiştirmeli öğrenme metotları</kwd>
                <kwd>Koşula bağlı bakım problemleri</kwd>
            </kwd-group>
                            
            <kwd-group xml:lang="en">
                <kwd>Partially observable multi-component systems</kwd>
                <kwd>Partially observable Markov decision processes</kwd>
                <kwd>Reinforcement learning methods</kwd>
                <kwd>Condition-based maintenance problem</kwd>
            </kwd-group>
                                                                                                                                        </article-meta>
    </front>
    <back>
                            <ref-list>
                                    <ref id="ref1">
                        <label>1</label>
                        <mixed-citation publication-type="journal">[1] Zhang M, Revie M. “Continuous-observation partially observable semi-Markov decision processes for machine maintenance”. IEEE Transactions on Reliability, 66(1), 202-218, 2016.</mixed-citation>
                    </ref>
                                    <ref id="ref2">
                        <label>2</label>
                        <mixed-citation publication-type="journal">[2] Alaswad S, Xiang Y. “A review on condition-based maintenance optimization models for stochastically deteriorating system”. Reliability Engineering &amp; System Safety, 157, 54-63, 2017.</mixed-citation>
                    </ref>
                                    <ref id="ref3">
                        <label>3</label>
                        <mixed-citation publication-type="journal">[3] De Jonge B, Scarf PA. “A review on maintenance optimization”. European Journal of Operational Research, 285(3), 805-824, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref4">
                        <label>4</label>
                        <mixed-citation publication-type="journal">[4] Karabağ O, Bulut Ö, Toy AÖ, Fadıloğlu MF. “An efficient procedure for optimal maintenance intervention in partially observable multi-component systems”. Reliability Engineering &amp; System Safety, 244, 1-11, 2024.</mixed-citation>
                    </ref>
                                    <ref id="ref5">
                        <label>5</label>
                        <mixed-citation publication-type="journal">[5] Karabağ O, Eruguz AS, Basten R. “Integrated optimization of maintenance interventions and spare part selection for a partially observable multi-component system”. Reliability Engineering &amp; System Safety, 200, 1-12, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref6">
                        <label>6</label>
                        <mixed-citation publication-type="confproc">[6] Karabağ O, Bulut Ö, Toy AÖ. “Markovian decision process modeling approach for intervention planning of partially observable systems prone to failures”. International Conference on Intelligent and Fuzzy Systems (INFUS), İzmir, Türkiye, 19-21 July 2022.</mixed-citation>
                    </ref>
                                    <ref id="ref7">
                        <label>7</label>
                        <mixed-citation publication-type="journal">[7] Quatrini E, Costantino F, Di Gravio G, Patriarca R. “Condition-based maintenance: an extensive literature review”. Machines, 8(2), 1-28, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref8">
                        <label>8</label>
                        <mixed-citation publication-type="journal">[8] Gürsoy MÜ, Çolak UC, Gökçe MH, Akkulak C, Ötleş S. “Endüstri için kestirimci bakım”. International Journal of 3D Printing Technologies and Digital Industry, 3(1), 56-66, 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref9">
                        <label>9</label>
                        <mixed-citation publication-type="journal">[9] Van Horenbeek A, Buré J, Cattrysse D, Pintelon L, Vansteenwegen P. “Joint maintenance and inventory optimization systems: A review”. International Journal of Production Economics, 143(2), 499-508, 2013.</mixed-citation>
                    </ref>
                                    <ref id="ref10">
                        <label>10</label>
                        <mixed-citation publication-type="journal">[10] Nguyen KT, Do P, Huynh KT, Bérenguer C, Grall A. “Joint optimization of monitoring quality and replacement decisions in condition-based maintenance”. Reliability Engineering &amp; System Safety, 189(1), 177-195, 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref11">
                        <label>11</label>
                        <mixed-citation publication-type="journal">[11] Liu X, Sun Q, Ye ZS, Yildirim M. “Optimal multi-type inspection policy for systems with imperfect online monitoring”. Reliability Engineering &amp; System Safety, 207(1), 1-11, 2021.</mixed-citation>
                    </ref>
                                    <ref id="ref12">
                        <label>12</label>
                        <mixed-citation publication-type="journal">[12] Zhao Y, Smidts C. “Reinforcement learning for adaptive maintenance policy optimization under imperfect knowledge of the system degradation model and partial observability of system states”. Reliability Engineering &amp; System Safety, 224(1), 1-13, 2022.</mixed-citation>
                    </ref>
                                    <ref id="ref13">
                        <label>13</label>
                        <mixed-citation publication-type="journal">[13] Tseremoglou I, Santos BF. “Condition-based maintenance scheduling of an aircraft fleet under partial observability: A deep reinforcement learning approach”. Reliability Engineering &amp; System Safety, 241(1), 1-20, 2024.</mixed-citation>
                    </ref>
                                    <ref id="ref14">
                        <label>14</label>
                        <mixed-citation publication-type="journal">[14] Andriotis CP, Papakonstantinou KG. “Managing engineering systems with large state and action spaces through deep reinforcement learning”. Reliability Engineering &amp; System Safety, 191(1), 1-17, 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref15">
                        <label>15</label>
                        <mixed-citation publication-type="journal">[15] Andriotis CP, Papakonstantinou KG. “Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints”. Reliability Engineering &amp; System Safety, 212(1), 1-16, 2021.</mixed-citation>
                    </ref>
                                    <ref id="ref16">
                        <label>16</label>
                        <mixed-citation publication-type="journal">[16] Zhang N, Si W. “Deep reinforcement learning for condition-based maintenance planning of multicomponent systems under dependent competing risks”. Reliability Engineering &amp; System Safety, 203(1), 1-10, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref17">
                        <label>17</label>
                        <mixed-citation publication-type="journal">[17] Mohammadi R, He Q. “A deep reinforcement learning approach for rail renewal and maintenance planning”. Reliability Engineering &amp; System Safety, 225(1), 1-12, 2022.</mixed-citation>
                    </ref>
                                    <ref id="ref18">
                        <label>18</label>
                        <mixed-citation publication-type="journal">[18] Lovejoy WS. “Computationally feasible bounds for partially observed Markov decision processes”. Operations Research, 39(1), 162-175, 1991.</mixed-citation>
                    </ref>
                                    <ref id="ref19">
                        <label>19</label>
                        <mixed-citation publication-type="journal">[19] Kıvanç İ, Özgür-Ünlüakın D, Bilgiç T. “Maintenance policy analysis of the regenerative air heater system using factored POMDPs”. Reliability Engineering &amp; System Safety, 219, 1-13, 2022.</mixed-citation>
                    </ref>
                                    <ref id="ref20">
                        <label>20</label>
                        <mixed-citation publication-type="journal">[20] Ceyhan H, Kasapbaşı MC. “Üretim sistemlerinde makine öğrenmesi ile kestirimci bakım uygulaması ve modellemesi”. Avrupa Bilim ve Teknoloji Dergisi, 33, 167-175, 2022.</mixed-citation>
                    </ref>
                                    <ref id="ref21">
                        <label>21</label>
                        <mixed-citation publication-type="journal">[21] Calayır GN, Kabak M. “Bakım için makine öğrenme tekniklerinin analizi ve bir uygulama”. Journal of Turkish Operations Management, 5(1), 662-675, 2021.</mixed-citation>
                    </ref>
                                    <ref id="ref22">
                        <label>22</label>
                        <mixed-citation publication-type="journal">[22] Gençer MA, Yumuşak R, Özcan E, Tamer E. “An artificial neural network model for maintenance planning of metro trains”. Politeknik Dergisi, 24(3), 811-820, 2021.</mixed-citation>
                    </ref>
                                    <ref id="ref23">
                        <label>23</label>
                        <mixed-citation publication-type="journal">[23] Güven Ö, Şahin H. “Predictive maintenance based on machine learning in public transportation vehicles”. Mühendislik Bilimleri ve Araştırmaları Dergisi, 4(1), 89-98, 2022.</mixed-citation>
                    </ref>
                                    <ref id="ref24">
                        <label>24</label>
                        <mixed-citation publication-type="journal">[24] Soylu B, Yiğiter H, Sarıkaya V, Sandıkçı Z, Asena U. “Kestirimci bakım planlama için makine öğrenmesi temelli bir karar destek sistemi ve bir uygulama”. Verimlilik Dergisi, 2209(B), 48-66, 2022.</mixed-citation>
                    </ref>
                                    <ref id="ref25">
                        <label>25</label>
                        <mixed-citation publication-type="journal">[25] Hatipoğlu A, Güneri Y, Yılmaz E. “Makine ve derin öğrenme temelli karşılaştırmalı bir öngörücü bakım uygulaması”. Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, 39(2), 1037-1048, 2023.</mixed-citation>
                    </ref>
                                    <ref id="ref26">
                        <label>26</label>
                        <mixed-citation publication-type="book">[26] Bertsekas DP. Dynamic Programming and Optimal Control, Volume II. 4th ed. Belmont, USA, Athena Scientific, 2012.</mixed-citation>
                    </ref>
                                    <ref id="ref27">
                        <label>27</label>
                        <mixed-citation publication-type="book">[27] Puterman ML. Markov Decision Processes: Discrete Stochastic Dynamic Programming. 1st ed. Hoboken, USA, John Wiley &amp; Sons, 2014.</mixed-citation>
                    </ref>
                                    <ref id="ref28">
                        <label>28</label>
                        <mixed-citation publication-type="book">[28] Sutton RS, Barto AG. Reinforcement Learning: An Introduction. 2nd ed. Cambridge, USA, MIT Press, 2018.</mixed-citation>
                    </ref>
                                    <ref id="ref29">
                        <label>29</label>
                        <mixed-citation publication-type="journal">[29] Estanjini RM, Li K, Paschalidis IC. “A least squares temporal difference actor–critic algorithm with applications to warehouse management”. Naval Research Logistics, 59(3-4), 197-211, 2012.</mixed-citation>
                    </ref>
                            </ref-list>
                    </back>
    </article>
