Algorithmic Alchemy: Turning Data into Stock Market Gold
1 : Guangzhou City University of Technology
2 : University of Macau
3 : University of Hradec Králové
This paper investigates the predictive accuracy of 14 models in forecasting monthly stock returns: 13 machine learning techniques and one simple linear model. Using a dataset that spans over 53 years and covers 5,175 stocks, for a total of 758,649 observations, we evaluate the models on out-of-sample predictive R² and identify Random Forest and two Neural Networks (NN3 and NN5) as the best-performing methods. Our analysis covers six time periods, showing how the performance of these models varies and underscoring their adaptability, especially during significant market upheavals such as the pandemic. The paper also identifies key indicators that drive stock returns, including Valuation Ratio, Liquidity, Price Trend, and the Chicago Fed National Financial Conditions Index (NFCI). We further show that prediction accuracy depends primarily on the data rather than on the choice of model. Lastly, we demonstrate that in real-world markets, model-driven portfolios consistently outperform our benchmark, the S&P 500 index return. Together, these results enrich our understanding of machine learning's role in empirical asset pricing and have practical implications for both scholars and practitioners.
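For readers unfamiliar with the evaluation metric, the following is a minimal sketch of an out-of-sample predictive R². The abstract does not state which definition is used, so the sketch assumes one convention that is common in empirical asset pricing, where the benchmark is a forecast of zero and the denominator is therefore not demeaned. The function name `oos_r2` and the sample numbers are hypothetical and purely illustrative.

```python
def oos_r2(actual, predicted):
    """Out-of-sample R^2 against a zero-forecast benchmark (assumed convention).

    R^2_oos = 1 - sum((r - r_hat)^2) / sum(r^2)
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Sum of squared forecast errors on the test sample.
    sse = sum((r - r_hat) ** 2 for r, r_hat in zip(actual, predicted))
    # Sum of squared realized returns (no demeaning: the benchmark forecast is zero).
    sst = sum(r ** 2 for r in actual)
    if sst == 0:
        raise ValueError("realized returns are all zero; R^2 is undefined")
    return 1.0 - sse / sst


# Illustrative monthly returns and forecasts (made-up numbers, not from the paper).
realized = [0.02, -0.01, 0.03]
forecast = [0.01, 0.00, 0.02]
print(round(oos_r2(realized, forecast), 4))
```

Under this convention, a model that always predicts zero scores exactly 0, so a positive value means the forecasts improve on that naive benchmark, while a negative value means they do worse.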