Dynamic Portfolio Optimization with Deep Reinforcement Learning: Evidence from Borsa Istanbul

Hidayet Beyhan; Erhan Ergin; Binali Selman Eren

doi:10.30784/epfad.1811319

Abstract

This study applies reinforcement learning (RL), a branch of artificial intelligence, to portfolio optimization. The data cover the constituents of the BIST30 index, the blue-chip index of Borsa Istanbul. The performance of Deep Deterministic Policy Gradient (DDPG), a deep reinforcement learning algorithm, is tested against Markowitz mean-variance and equal-weighted portfolios as benchmark models; the BIST30 index itself is also used as a benchmark portfolio. The study contributes to the literature both through its focus on Türkiye as an example of a developing country and through the method employed, and it demonstrates the potential of increasingly widespread RL approaches for portfolio optimization. The results show that the portfolio formed with the DDPG approach achieves a higher Sharpe ratio than the portfolios obtained with the classical approaches. These findings highlight the practical potential of RL approaches and suggest that they offer fund managers an alternative, especially in volatile market environments.
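To make the comparison metric concrete, the following is a minimal sketch of how a Sharpe ratio might be computed for an equal-weighted benchmark and for a simple variance-minimizing benchmark. It is not the authors' implementation: the return figures are made-up toy numbers rather than BIST30 data, the function names are hypothetical, the risk-free rate of zero and the 252-period annualization are assumptions, and the two-asset closed-form minimum-variance portfolio is used only as a simplified stand-in for a full Markowitz mean-variance optimization.

```python
import math
from statistics import mean, stdev


def portfolio_returns(asset_returns, weights):
    """Per-period portfolio return: weighted sum of the asset returns in each period."""
    return [sum(w * r for w, r in zip(weights, period)) for period in asset_returns]


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def equal_weights(n_assets):
    """Equal-weighted benchmark: every asset receives the same share."""
    return [1.0 / n_assets] * n_assets


def sample_cov(x, y):
    """Sample covariance of two equally long return series."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def min_variance_weights_two_assets(returns_a, returns_b):
    """Closed-form minimum-variance weights for two assets (weights may be negative)."""
    var_a = sample_cov(returns_a, returns_a)
    var_b = sample_cov(returns_b, returns_b)
    cov_ab = sample_cov(returns_a, returns_b)
    w_a = (var_b - cov_ab) / (var_a + var_b - 2.0 * cov_ab)
    return [w_a, 1.0 - w_a]


if __name__ == "__main__":
    # Hypothetical daily returns for two assets (rows are periods); not real market data.
    toy = [[0.010, -0.004], [-0.006, 0.008], [0.012, 0.002], [-0.003, 0.005], [0.007, -0.001]]
    col_a = [row[0] for row in toy]
    col_b = [row[1] for row in toy]

    for name, w in [("equal-weighted", equal_weights(2)),
                    ("minimum-variance", min_variance_weights_two_assets(col_a, col_b))]:
        print(name, w, sharpe_ratio(portfolio_returns(toy, w)))
```

In a setting like the one the abstract describes, the same `sharpe_ratio` function would be applied to the return series produced by each strategy (the RL agent, the classical benchmarks, and the index) over a common out-of-sample period, so that the ratios are directly comparable.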

Details

Primary Language

English

Subjects

Investment and Portfolio Management

Journal Section

Research Article

Authors

Hidayet Beyhan *
0000-0002-0219-7076
Türkiye

Erhan Ergin
0000-0001-6281-3654
Türkiye

Binali Selman Eren
0000-0001-5136-6406
Türkiye

Publication Date

March 31, 2026

Submission Date

October 27, 2025

Acceptance Date

March 17, 2026

Published in Issue

Year 2026 Volume: 11 Number: 1

DOI

https://doi.org/10.30784/epfad.1811319

IZ

https://izlik.org/JA43NB89ZH

Cite

APA

Beyhan, H., Ergin, E., & Eren, B. S. (2026). Dynamic Portfolio Optimization with Deep Reinforcement Learning: Evidence from Borsa Istanbul. Ekonomi Politika Ve Finans Araştırmaları Dergisi, 11(1), 106-119. https://doi.org/10.30784/epfad.1811319
