Evaluating Deep Reinforcement Learning Models in Automated Trading Systems
Rama Rao Gose, A. Sandeep, Shaik Munnisa Begum
This research presents a comprehensive evaluation of Deep Reinforcement Learning (DRL) models—
specifically, Deep Q-Network (DQN), Deep Deterministic Policy Gradient (DDPG), and Proximal Policy
Optimization (PPO)—in the context of automated trading systems. The study compares these models
on critical performance metrics, including cumulative return, Sharpe ratio, maximum drawdown,
and number of profitable trades, to assess their effectiveness in dynamic, complex financial
markets. Our findings indicate that PPO outperforms DQN and DDPG in both profitability and
risk management, achieving the highest cumulative return and the best risk-adjusted performance.
DDPG also demonstrates strong potential, particularly in handling continuous action spaces, while
DQN remains effective in simpler, discrete decision-making environments. These results underscore
the capability of DRL models to enhance automated trading strategies by adapting to evolving market
conditions and optimizing long-term returns.
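For concreteness, the sketch below illustrates how the four reported evaluation metrics can be computed from a series of per-period strategy returns. It is not taken from the paper: the helper name `evaluate_strategy`, the 252-periods-per-year annualization, the zero risk-free rate, and the use of positive-return periods as a proxy for profitable trades are all our assumptions.

```python
import numpy as np

def evaluate_strategy(returns, periods_per_year=252, risk_free_rate=0.0):
    """Compute the four metrics named in the abstract from per-period
    fractional returns. Annualization factor and zero risk-free rate
    are assumptions, not values taken from the study."""
    returns = np.asarray(returns, dtype=float)

    # Cumulative return: compounded growth over the whole backtest.
    equity_curve = np.cumprod(1.0 + returns)
    cumulative_return = equity_curve[-1] - 1.0

    # Sharpe ratio: annualized mean excess return over its volatility.
    excess = returns - risk_free_rate / periods_per_year
    sharpe = np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

    # Maximum drawdown: largest peak-to-trough decline in the equity
    # curve, reported as a negative fraction.
    running_peak = np.maximum.accumulate(equity_curve)
    max_drawdown = ((equity_curve - running_peak) / running_peak).min()

    # Profitable trades: approximated here as periods with positive
    # return; a full system would count closed round-trip trades.
    profitable_trades = int((returns > 0).sum())

    return {
        "cumulative_return": cumulative_return,
        "sharpe_ratio": sharpe,
        "max_drawdown": max_drawdown,
        "profitable_trades": profitable_trades,
    }

# Example: evaluate a toy sequence of daily returns.
print(evaluate_strategy([0.01, -0.005, 0.02, -0.01, 0.003]))
```

Under this framing, any of the three agents (DQN, DDPG, or PPO) can be compared on identical backtest data by feeding its realized return series to the same evaluation function.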