Research Article

Solving an Order Batching and Sequencing Problem with Reinforcement Learning

Year 2024, Volume: 36 Issue: 3, 235 - 246, 26.09.2024
https://doi.org/10.7240/jeps.1475312

Abstract

The purpose of this research is to determine whether deep reinforcement learning (DRL) is a suitable approach for the order batching and sequencing problem (OBSP) and to compare it with traditional methods. To this end, models trained with the Proximal Policy Optimization (PPO) algorithm were tested in a complex, realistic warehouse environment to measure whether they learned a strategy that reduces the number of late orders. A heuristic method was also applied, and the results were compared in the same environment and on the same data. The results showed that the DRL approach, which combines heuristics with the PPO algorithm, outperforms the heuristics in minimizing the percentage of tardy orders in all tested scenarios.
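The evaluation metric named in the abstract, the tardy order percentage, can be illustrated with a minimal sketch. The `Order` record, its field names, and the sample values below are assumptions made for illustration only; they are not taken from the paper and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical order record; field names are assumptions, not from the paper."""
    order_id: str
    due_time: float         # deadline, in simulation time units
    completion_time: float  # time at which the order was fully picked


def tardy_order_percentage(orders):
    """Return the share of orders completed after their due time, in percent."""
    if not orders:
        return 0.0
    late = sum(1 for o in orders if o.completion_time > o.due_time)
    return 100.0 * late / len(orders)


# Made-up outcomes for the same four orders under two policies (illustrative only).
heuristic_run = [
    Order("A", due_time=10, completion_time=12),
    Order("B", due_time=15, completion_time=14),
    Order("C", due_time=20, completion_time=25),
    Order("D", due_time=30, completion_time=28),
]
drl_run = [
    Order("A", due_time=10, completion_time=9),
    Order("B", due_time=15, completion_time=16),
    Order("C", due_time=20, completion_time=19),
    Order("D", due_time=30, completion_time=28),
]

print(tardy_order_percentage(heuristic_run))  # 50.0
print(tardy_order_percentage(drl_run))        # 25.0
```

Comparing two policies on the same orders, as done here with invented numbers, mirrors the abstract's setup of evaluating the heuristic and the DRL approach on the same environment and data.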

Acknowledgments

This research was prepared as part of a postgraduate thesis at Bahçeşehir University. I would like to express my gratitude to my supervisor, Assoc. Prof. Ayla Gülcü, for her valuable guidance and advice, which made this study possible.

References

  • Cals, B. J. H. C. (2019). The order batching problem: a deep reinforcement learning approach. (Master Thesis, Eindhoven University of Technology, Eindhoven, Holland). Retrieved from https://research.tue.nl/en/studentTheses/the-order-batching-problem
  • Menéndez, B., Bustillo, M., G. Pardo, E., & Duarte, A. (2017). General Variable Neighborhood Search for the Order Batching and Sequencing Problem. European Journal of Operational Research. 263. 10.1016/j.ejor.2017.05.001.
  • Xiaowei, J., Zhou, Y., Zhang, Y., Sun, L., & Hu, X. (2018). Order batching and sequencing problem under the pick-and-sort strategy in online supermarkets. Procedia Computer Science. 126. 1985-1993. 10.1016/j.procs.2018.07.254.
  • Aylak, B. L. (2022). WAREHOUSE LAYOUT OPTIMIZATION USING ASSOCIATION RULES. FRESENIUS ENVIRONMENTAL BULLETIN, 31(3 A), 3828-3840.
  • Beeks, M. S. (2021). Deep reinforcement learning for solving a multi-objective online order batching problem. (Master Thesis, Eindhoven University of Technology, Eindhoven, Holland). Retrieved from https://research.tue.nl/en/studentTheses/deep-reinforcement-learning-for-solving-a-multi-objective-online-
  • Boysen, N., De Koster, R.B.M, & Weidinger, F. (2018). Warehousing in the e-commerce era: A survey. European Journal of Operational Research. 277. 10.1016/j.ejor.2018.08.023.
  • Aylak, B. L., İnce, M., Oral, O., Süer, G., Almasarwah, N., Singh, M., & Salah, B. (2021). Application of machine learning methods for pallet loading problem. Applied Sciences, 11(18), 8304.
  • Yan, Y., Chow, A.H.F., Ho, C.P., Kuo, Y.H., Wu, Q., & Ying, C. (2021). Reinforcement Learning for Logistics and Supply Chain Management: Methodologies, State of the Art, and Future Opportunities. Retrieved from SSRN: https://ssrn.com/abstract=3935816
  • Henn, S. & Schmid, V. (2011). Metaheuristics for Order Batching and Sequencing in Manual Order Picking Systems. Computers and Industrial Engineering. 66. 10.1016/j.cie.2013.07.003.
  • Henn, S. (2012). Order batching and sequencing for the minimization of the total tardiness in picker-to-part warehouses. Flexible Services and Manufacturing Journal. 27. 10.1007/s10696-012-9164-1.
  • Tsai, C.-Y., Liou, J. J. H., & Huang, T.-M. (2008). Using a multiple-GA method to solve the batch picking problem: considering travel distance and order due time. International Journal of Production Research, 46:22, 6533-6555. DOI: 10.1080/00207540701441947
  • Valle, C.A., Beasley, J.E., & Cunha, A.S. (2017). Optimally solving the joint order batching and picker routing problem. European Journal of Operational Research. 10.1016/j.ejor.2017.03.069.
  • Cals, B., Zhang, Y., Dijkman, R. M., & van Dorst, C. (2021). Solving the Online Batching Problem using Deep Reinforcement Learning. Computers & Industrial Engineering, 156, [107221]. https://doi.org/10.1016/j.cie.2021.107221
  • Hildebrand, M., Frendrup, J., & Sarivan, M. (2019). Batching using reinforcement learning. The 7th Student Symposium on Mechanical and Manufacturing Engineering. Department of Materials and Production, Aalborg University.
  • Beeks, M., Refaei Afshar, R., Zhang, Y., Dijkman, R., Dorst, C. & Looijer, S. (2022). Deep Reinforcement Learning for a Multi-Objective Online Order Batching Problem. In Proceedings of the International Conference on Automated Planning and Scheduling. 32. 435-443. 10.1609/icaps.v32i1.19829.
  • Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J. & Zaremba, W. (2016). OpenAI Gym. https://doi.org/10.48550/arXiv.1606.01540
  • Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction. A Bradford Book, Cambridge, MA, USA.
  • Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://doi.org/10.48550/arXiv.1707.06347
  • Lopes, G.C., Ferreira, M., da Silva Simões, A., & Colombini, E. L. (2018) Intelligent Control of a Quadrotor with Proximal Policy Optimization Reinforcement Learning. 2018 Latin American Robotic Symposium, 2018 Brazilian Symposium on Robotics (SBR) and 2018 Workshop on Robotics in Education (WRE), João Pessoa, Brazil, 2018, pp. 503-508, doi: 10.1109/LARS/SBR/WRE.2018.00094.
  • Funika, W., Koperek, P., & Kitowski, J. (2020). Automatic Management of Cloud Applications with Use of Proximal Policy Optimization. In: Krzhizhanovskaya, V., et al. Computational Science – ICCS 2020. ICCS 2020. Lecture Notes in Computer Science, vol 12137. Springer, Cham. https://doi.org/10.1007/978-3-030-50371-0_6
  • Kechinov, M. (2020). eCommerce purchase history from electronics store [Data file]. Retrieved from https://www.kaggle.com/datasets/mkechinov/ecommerce-purchase-history-from-electronics-store
  • Raffin, A., Hill, A., Gleave, A., Kanervisto, A., Ernestus, M., & Dormann, N. (2021). Stable-Baselines3: Reliable Reinforcement Learning Implementations. Journal of Machine Learning Research 22 (2021) 1-8
  • Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., ... & Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems, 32.
  • Wang, Y., He, H. & Tan, X. (2020). Truly Proximal Policy Optimization. Proceedings of the 35th Uncertainty in Artificial Intelligence Conference, in Proceedings of Machine Learning Research 115:113-122 Available from https://proceedings.mlr.press/v115/wang20b.html.
  • Cobbe, K. W., Hilton, J., Klimov, O., & Schulman, J. (2021). Phasic policy gradient. In International Conference on Machine Learning (pp. 2020-2027). PMLR.

Turkish title: Pekiştirmeli Öğrenme ile Sipariş Yığınlama ve Sıralama Probleminin Çözülmesi


Details

Primary Language: English
Subjects: Software Engineering (Other)
Section: Research Articles
Authors

Begüm Canaslan (ORCID: 0009-0004-3662-7291)

Ayla Gülcü (ORCID: 0000-0003-3258-8681)

Early Pub Date: September 19, 2024
Publication Date: September 26, 2024
Submission Date: April 29, 2024
Acceptance Date: June 27, 2024
Published in Issue: Year 2024, Volume: 36 Issue: 3

Cite

APA Canaslan, B., & Gülcü, A. (2024). Solving an Order Batching and Sequencing Problem with Reinforcement Learning. International Journal of Advances in Engineering and Pure Sciences, 36(3), 235-246. https://doi.org/10.7240/jeps.1475312
AMA Canaslan B, Gülcü A. Solving an Order Batching and Sequencing Problem with Reinforcement Learning. JEPS. September 2024;36(3):235-246. doi:10.7240/jeps.1475312
Chicago Canaslan, Begüm, and Ayla Gülcü. “Solving an Order Batching and Sequencing Problem With Reinforcement Learning”. International Journal of Advances in Engineering and Pure Sciences 36, no. 3 (September 2024): 235-46. https://doi.org/10.7240/jeps.1475312.
EndNote Canaslan B, Gülcü A (01 September 2024) Solving an Order Batching and Sequencing Problem with Reinforcement Learning. International Journal of Advances in Engineering and Pure Sciences 36 3 235–246.
IEEE B. Canaslan and A. Gülcü, “Solving an Order Batching and Sequencing Problem with Reinforcement Learning”, JEPS, vol. 36, no. 3, pp. 235–246, 2024, doi: 10.7240/jeps.1475312.
ISNAD Canaslan, Begüm - Gülcü, Ayla. “Solving an Order Batching and Sequencing Problem With Reinforcement Learning”. International Journal of Advances in Engineering and Pure Sciences 36/3 (September 2024), 235-246. https://doi.org/10.7240/jeps.1475312.
JAMA Canaslan B, Gülcü A. Solving an Order Batching and Sequencing Problem with Reinforcement Learning. JEPS. 2024;36:235–246.
MLA Canaslan, Begüm and Ayla Gülcü. “Solving an Order Batching and Sequencing Problem With Reinforcement Learning”. International Journal of Advances in Engineering and Pure Sciences, vol. 36, no. 3, 2024, pp. 235-46, doi:10.7240/jeps.1475312.
Vancouver Canaslan B, Gülcü A. Solving an Order Batching and Sequencing Problem with Reinforcement Learning. JEPS. 2024;36(3):235-46.