Multi-Agent Deep Reinforcement Learning with Dynamic Portfolio Weighting: A Novel Approach to Algorithmic Trading
Abstract
This paper presents an ensemble reinforcement learning framework for multi-asset portfolio management, the Confidence-Weighted Dynamic Ensemble (CWDE). The model integrates five actor–critic algorithms (PPO, A2C, DDPG, TD3, and SAC) under a dynamic aggregation mechanism that adjusts each agent’s weight according to its entropy-derived confidence and its historical performance. Using a diversified dataset spanning equities, bonds, commodities, and real estate ETFs, CWDE is benchmarked against its five constituent DRL agents. In these experiments CWDE outperforms every baseline, achieving the highest risk-adjusted returns and the lowest drawdowns. Statistical analysis supports the ensemble’s robustness and its adaptability to market volatility. The findings suggest that CWDE can serve as a scalable and interpretable framework for trading intelligence. The study concludes with a discussion of computational and practical limitations and outlines future directions: integrating explainability, adding macroeconomic features, and moving to real-time deployment.
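The abstract does not give the aggregation formula, so the following is only a minimal sketch of how a confidence-weighted aggregation of this kind might look, not the paper's actual method. It assumes that each agent proposes a long-only allocation summing to 1, that confidence is one minus the normalized Shannon entropy of that allocation, that historical performance is the mean of recent returns, and that the two are blended by a hypothetical mixing parameter `alpha` before a softmax. The names `entropy_confidence`, `ensemble_weights`, and `aggregate` are illustrative.

```python
import numpy as np

def entropy_confidence(allocation):
    """Assumed confidence proxy: 1 - normalized Shannon entropy, in [0, 1]."""
    p = np.asarray(allocation, dtype=float)
    p = p / p.sum()
    nonzero = p[p > 0]                      # treat 0 * log(0) as 0
    entropy = -np.sum(nonzero * np.log(nonzero))
    max_entropy = np.log(p.size)            # entropy of the uniform allocation
    return 1.0 - entropy / max_entropy if max_entropy > 0 else 1.0

def ensemble_weights(allocations, recent_returns, alpha=0.5, temperature=1.0):
    """Softmax over a blend of confidence and mean recent return (both assumptions)."""
    conf = np.array([entropy_confidence(a) for a in allocations])
    perf = np.array([np.mean(r) for r in recent_returns])
    score = (alpha * conf + (1.0 - alpha) * perf) / temperature
    score = score - score.max()             # shift for numerical stability
    weights = np.exp(score)
    return weights / weights.sum()

def aggregate(allocations, weights):
    """Weighted average of the agents' allocations, renormalized to sum to 1."""
    combined = np.asarray(weights) @ np.asarray(allocations, dtype=float)
    return combined / combined.sum()
```

With these assumptions, an agent that concentrates its allocation receives a larger ensemble weight than one that spreads capital uniformly, all else being equal. Whether CWDE defines confidence over allocations, action distributions, or another quantity is not stated in the abstract.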
Details
Primary Language: English
Subjects: Artificial Intelligence (Other)
Journal Section: Research Article
Authors: Cemal Öztürk* (ORCID: 0000-0003-3850-7416), Türkiye
Early Pub Date: June 15, 2026
Publication Date: June 17, 2026
Submission Date: November 17, 2025
Acceptance Date: February 2, 2026
Published in Issue: Year 2026, Volume 9, Number 2
