Evaluating Deep Reinforcement Learning Models in Automated Trading Systems

Rama Rao Gose, A Sandeep, Shaik Munnisa Begum

This research presents a comprehensive evaluation of Deep Reinforcement Learning (DRL) models, specifically Deep Q-Network (DQN), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO), in the context of automated trading systems. The study compares these models across critical performance metrics, including cumulative return, Sharpe ratio, maximum drawdown, and the number of profitable trades, to assess their effectiveness in dynamic and complex financial markets. Our findings indicate that PPO outperforms DQN and DDPG in both profitability and risk management, achieving the highest cumulative return and the best risk-adjusted performance. DDPG also demonstrates strong potential, particularly in handling continuous action spaces, while DQN is effective in simpler, discrete decision-making environments. These results underscore the capability of DRL models to enhance automated trading strategies by adapting to evolving market conditions and optimizing long-term returns.
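To make the four evaluation metrics concrete, the following is a minimal sketch of how they are commonly computed from a strategy's per-period return series. The function name `evaluate`, the `periods_per_year` annualization factor of 252, and the zero risk-free rate are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def evaluate(returns: np.ndarray, periods_per_year: int = 252) -> dict:
    """Compute the four metrics used to compare the DRL agents."""
    # Cumulative return: total compounded growth over the test window.
    equity = np.cumprod(1.0 + returns)
    cumulative_return = equity[-1] - 1.0

    # Sharpe ratio: annualized mean return per unit of volatility
    # (risk-free rate assumed zero for simplicity).
    sharpe = np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)

    # Maximum drawdown: worst peak-to-trough decline of the equity curve.
    running_peak = np.maximum.accumulate(equity)
    max_drawdown = ((equity - running_peak) / running_peak).min()

    # Profitable trades: count of periods/trades with a positive return.
    profitable_trades = int((returns > 0).sum())

    return {
        "cumulative_return": cumulative_return,
        "sharpe_ratio": sharpe,
        "max_drawdown": max_drawdown,
        "profitable_trades": profitable_trades,
    }

# Example usage on a synthetic daily return series.
rng = np.random.default_rng(0)
print(evaluate(rng.normal(0.0005, 0.01, size=252)))
```

Under these definitions, a higher Sharpe ratio and a shallower (less negative) maximum drawdown together indicate the stronger risk-adjusted performance attributed to PPO in the abstract.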