Optimized feature selection based on machine learning models for robust stock market prediction
Abstract
This study aims to predict US stock market indices using machine learning (ML) methods combined with optimized feature selection algorithms. Two prediction models are compared: random forest (RF) and support vector regression (SVR). Seventeen variables are used to explain the movement of the S&P 500, NASDAQ, and DJIA indices. These variables are grouped into five categories: basic features, stock market variables, currencies, commodities, and technical indicators. A variable selection technique is first applied to identify the most relevant variables, and the resulting optimal subset is then used for forecasting. The results obtained with SVR and RF after variable selection are compared with those obtained before selection. The comparison of these two artificial intelligence (AI) methods favors regression performed after variable selection. The findings show that feature selection has a large and significant positive effect on prediction accuracy for the studied financial markets.
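The variable selection step described above can be illustrated with a minimal sketch. The abstract does not name the selection technique, so the example below assumes a simple filter that ranks candidate variables by the absolute Pearson correlation with the target and keeps the top k. The function names (`pearson`, `rank_features`, `select_top_k`), the synthetic data, and the column names are all hypothetical and are not taken from the study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, target):
    """Return feature names sorted by |correlation with target|, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


def select_top_k(columns, target, k):
    """Keep the k features most correlated with the target (assumed filter rule)."""
    return rank_features(columns, target)[:k]


# Synthetic stand-in data: two informative variables and two pure-noise variables.
# These are NOT the seventeen variables used in the study.
random.seed(0)
n = 200
columns = {
    "signal_a": [random.gauss(0, 1) for _ in range(n)],
    "signal_b": [random.gauss(0, 1) for _ in range(n)],
    "noise_1": [random.gauss(0, 1) for _ in range(n)],
    "noise_2": [random.gauss(0, 1) for _ in range(n)],
}
target = [
    2.0 * a - 1.5 * b + random.gauss(0, 0.1)
    for a, b in zip(columns["signal_a"], columns["signal_b"])
]

selected = select_top_k(columns, target, k=2)
print(selected)
```

In a workflow like the one the abstract outlines, the selected subset would then be passed to the RF and SVR models, and their forecast errors compared with those obtained using all candidate variables.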
Authors

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.