English title of the paper:
Continuous control with Stacked Deep Dynamic Recurrent Reinforcement Learning for portfolio optimization
Persian translation of the paper title:
کنترل مداوم با یادگیری تقویتی مجدد پویا عمیق انباشته برای بهینه سازی نمونه کارها
ScienceDirect - Elsevier - Expert Systems With Applications, 140 (2020) 112891. doi:10.1016/j.eswa.2019.112891
Amine Mohamed Aboussalah, Chi-Guhn Lee
Recurrent reinforcement learning (RRL) techniques have been used to optimize asset trading systems and have achieved outstanding results. However, the majority of the previous work has been dedicated to systems with discrete action spaces. To address the challenge of continuous action and multi-dimensional state spaces, we propose the so-called Stacked Deep Dynamic Recurrent Reinforcement Learning (SDDRRL) architecture to construct a real-time optimal portfolio. The algorithm captures the up-to-date market conditions and rebalances the portfolio accordingly. Under this general vision, the Sharpe ratio, which is one of the most widely accepted measures of risk-adjusted returns, has been used as a performance metric. Additionally, the performance of most machine learning algorithms highly depends on their hyperparameter settings. Therefore, we equipped SDDRRL with the ability to find the best possible architecture topology using an automated Gaussian Process (GP) with Expected Improvement (EI) as an acquisition function. This allows us to select the best architectures that maximize the total return while respecting the cardinality constraints. Finally, our system was trained and tested in an online manner for 20 successive rounds with data for ten selected stocks from different sectors of the S&P 500 from January 1st, 2013 to July 31st, 2017. The experiments reveal that the proposed SDDRRL achieves superior performance compared to three benchmarks: the rolling horizon Mean-Variance Optimization (MVO) model, the rolling horizon risk parity model, and the uniform buy-and-hold (UBAH) index.
Keywords: Reinforcement learning | Policy gradient | Deep learning | Sequential model-based optimization | Financial time series | Portfolio management | Trading systems
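The abstract names two quantities that are easy to state concretely: the Sharpe ratio used as the performance metric, and the Expected Improvement (EI) acquisition function used with the Gaussian Process. The sketch below shows the textbook forms of both, not the paper's implementation. The function names `sharpe_ratio` and `expected_improvement`, the per-period (non-annualized) Sharpe definition, the maximization form of EI, and the exploration parameter `xi` are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample std dev.

    Assumption: no annualization and a constant per-period risk-free rate.
    """
    excess = [r - risk_free_rate for r in returns]
    spread = stdev(excess)
    if spread == 0.0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return mean(excess) / spread


def expected_improvement(mu, sigma, best_so_far, xi=0.0):
    """Textbook EI for maximization at one candidate point.

    mu, sigma: posterior mean and std dev of the surrogate at the candidate.
    best_so_far: best objective value observed so far.
    xi: hypothetical exploration margin (0.0 gives plain EI).
    """
    gain = mu - best_so_far - xi
    if sigma <= 0.0:
        # No posterior uncertainty: improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gain * cdf + sigma * pdf
```

In a hyperparameter search of the kind the abstract describes, a surrogate model would supply `mu` and `sigma` for each candidate architecture, and the candidate with the largest EI would be evaluated next. Which objective is fed to the search (for example total return) is up to the user of this sketch.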