https://stdjelm.scienceandtechnology.com.vn/index.php/stdjelm/issue/feed VNUHCM Journal of Economics - Law and Management 2026-01-09T14:01:17+07:00 STDJ ECONOMICS - LAW & MANAGEMENT pbthang@inomar.edu.vn Open Journal Systems https://stdjelm.scienceandtechnology.com.vn/index.php/stdjelm/article/view/1687 Sentiment-driven forecasting of the stock index in Vietnam: A machine learning perspective 2026-01-09T14:01:17+07:00 Nguyen Anh Phong phongna@uel.edu.vn Tam Phan Huy tamph@uel.edu.vn Thanh Ngo Phu thanhnp@uel.edu.vn <p>The influence of news sentiment on financial markets has drawn significant attention in recent years, yet its predictive potential in emerging markets remains underexplored. This study investigates whether multi-dimensional sentiment signals, derived from a large corpus of Vietnamnet news articles, can improve the prediction of daily VN-Index directional movements when combined with technical indicators. Drawing on behavioral finance, information asymmetry, and media framing theories, the research posits that sentiment-laden narratives and price-based signals provide complementary insights into investor behavior. The dataset comprises 6,480 news articles (2019–2025) and daily VN-Index historical data. Sentiment features (polarity, subjectivity, compound scores, and sentiment proportions) are extracted at the title, excerpt, and content levels. Technical indicators such as moving averages, RSI, MACD, Bollinger Bands, and volatility are also constructed.</p> <p>The prediction task is framed as binary classification (“Up” vs. “Unchanged/Down”) and evaluated using multiple machine learning algorithms: Naive Bayes, Logistic Regression, Support Vector Machine, Random Forest, Gradient Boosting, AdaBoost, and CatBoost. Performance is assessed with ten-fold cross-validation on metrics such as accuracy, precision, recall, F1-score, and ROC-AUC.
Results show that ensemble models, particularly CatBoost, Gradient Boosting, and Random Forest, consistently outperform the linear and probabilistic baselines on accuracy, recall, and F1-score. Logistic Regression and AdaBoost also show competitive ROC-AUC values, while Naive Bayes underperforms in distinguishing market movements.</p> <p>These findings underscore the incremental predictive power of sentiment features in an emerging market setting, challenging the semi-strong form of the Efficient Market Hypothesis and reinforcing behavioral finance perspectives on bounded rationality and sentiment-driven trading. Practically, the study offers implications for portfolio managers, policymakers, and media organizations by demonstrating that hybrid sentiment-technical models can improve market forecasting, regulatory monitoring, and responsible financial reporting. Limitations regarding data sources and frequency are acknowledged, and avenues for future research include multi-source sentiment integration, deep learning approaches, and real-time deployment.</p> 2026-01-07T00:00:00+07:00
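<p>As a rough illustration of the setup described in the abstract, the Python sketch below builds the binary "Up" vs. "Unchanged/Down" label, one technical indicator (a simple moving average), and index splits for ten-fold cross-validation. It is not the authors' code: the function names, the choice to label a day by comparing its close with the previous close, the contiguous unshuffled folds, and the sample prices are all assumptions made for illustration.</p>

```python
from typing import List, Optional, Tuple


def direction_labels(closes: List[float]) -> List[int]:
    """Label each day after the first: 1 = "Up", 0 = "Unchanged/Down".

    Assumption: a day counts as "Up" when its close exceeds the previous close.
    """
    return [1 if today > prev else 0 for prev, today in zip(closes, closes[1:])]


def simple_moving_average(closes: List[float], window: int) -> List[Optional[float]]:
    """Trailing mean over `window` days; None until enough history exists."""
    out: List[Optional[float]] = []
    for i in range(len(closes)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(closes[i + 1 - window : i + 1]) / window)
    return out


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds; return (train, test) index lists.

    Assumption: folds are contiguous and unshuffled; the abstract does not
    say how the folds were drawn.
    """
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    base, extra = divmod(n, k)
    folds = []
    start = 0
    for f in range(k):
        size = base + (1 if f < extra else 0)  # spread the remainder over early folds
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    closes = [100.0, 101.5, 101.5, 99.0, 102.0]  # made-up values, not VN-Index data
    print(direction_labels(closes))
    print(simple_moving_average(closes, 3))
    print(len(k_fold_indices(25, 10)))
```

<p>With the made-up series above, `direction_labels` returns `[1, 0, 0, 1]`: the unchanged day falls in the "Unchanged/Down" class, matching the two-class framing in the abstract. In practice each fold's training indices would feed a classifier and the test indices would be scored on the metrics the study reports.</p>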