English title of the article:
Large data sets and machine learning: Applications to statistical arbitrage
Persian translation of the article title (rendered back into English):
Large data sets and machine learning: Practical applications for statistical arbitrage
ScienceDirect - Elsevier - European Journal of Operational Research, 278 (2019) 330-342: doi:10.1016/j.ejor.2019.04.013
Machine learning algorithms and big data are transforming all industries, including the finance and portfolio management sectors. While these techniques, such as Deep Belief Networks or Random Forests, are becoming increasingly popular in the market, the academic literature is relatively sparse. Through a series of applications involving hundreds of variables/predictors and stocks, this article presents some of the state-of-the-art techniques and shows how they can be implemented to manage a long-short portfolio. Numerous practical and empirical issues are addressed. One of the main questions behind big data use is the value of information. Does an increase in the number of predictors improve portfolio performance? Which features are the most important? A large number of predictors means, potentially, a high level of noise. How do the algorithms manage this? This article develops an application using a 22-year trading period, up to 300 U.S. large caps and around 600 predictors. The empirical results underline the ability of these techniques to generate useful trading signals for portfolios with high turnover and short holding periods (one or five days). Positive excess returns are reported between 1993 and 2008. They are strongly reduced after accounting for transaction costs and traditional risk factors. In the most recent period, once these machine learning tools had become readily available in the market, excess returns turned negative. Results also show that adding features is far from a guarantee of boosting the portfolio's alpha.
Keywords: Finance | Big data | Machine learning | Statistical arbitrage
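The abstract refers to long-short portfolios built from model signals and to returns that shrink once transaction costs are included. As a rough illustration of that general idea only, and not the paper's actual procedure, the Python sketch below ranks stocks by a predicted score, goes long the top k and short the bottom k with equal weights, and subtracts a cost proportional to turnover. The function names, the equal-weighting scheme, and the cost figure are all assumptions made for this example.

```python
import numpy as np


def long_short_weights(scores, k):
    """Equal-weight long the k highest-scored stocks and short the k lowest.

    Hypothetical helper: `scores` stands in for any model output
    (for example, a predicted probability of outperforming).
    """
    scores = np.asarray(scores, dtype=float)
    if 2 * k > scores.size:
        raise ValueError("k is too large for the number of stocks")
    order = np.argsort(scores)        # indices sorted by ascending score
    weights = np.zeros_like(scores)
    weights[order[-k:]] = 1.0 / k     # long leg sums to +1
    weights[order[:k]] = -1.0 / k     # short leg sums to -1
    return weights


def net_returns(weights, returns, cost_per_unit_turnover):
    """Portfolio return per period, net of proportional trading costs.

    weights, returns: arrays of shape (T, N); row t holds the weights
    held during period t and the stock returns realized over it.
    The portfolio is assumed to start from cash (all-zero weights).
    """
    weights = np.asarray(weights, dtype=float)
    returns = np.asarray(returns, dtype=float)
    gross = (weights * returns).sum(axis=1)
    previous = np.vstack([np.zeros((1, weights.shape[1])), weights[:-1]])
    turnover = np.abs(weights - previous).sum(axis=1)
    return gross - cost_per_unit_turnover * turnover


# Toy usage with made-up numbers (not data from the paper).
w = long_short_weights([0.1, 0.9, 0.5, 0.3], k=1)
held = np.vstack([w, w])              # same positions for two periods
realized = np.array([[0.01, 0.02, 0.0, 0.0],
                     [0.00, -0.01, 0.0, 0.0]])
net = net_returns(held, realized, cost_per_unit_turnover=0.001)
```

With short holding periods the positions change often, so the turnover term is charged frequently; this is one simple way to see why gross and net performance can diverge for high-turnover strategies.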