A two-stage sentiment analysis approach on multilingual restaurant reviews in Almaty
Abstract
Sentiment classification is one of the most widely studied tasks in text classification. This study reports extensive two-stage experiments on multilingual restaurant reviews from Almaty, Kazakhstan. Stage 1 evaluates seven state-of-the-art sentiment analyzers individually (TextBlob, VADER, AFINN, Stanza, Nlptown, Sentistrength, and Flair); Stage 2 evaluates ensembles of these analyzers. The reviews, either originally written in English or translated from Russian, are analyzed across review sections, including HEAD, TEXT, and their combination (HEAD+TEXT). The Stage 2 results show that carefully selected ensembles have clear advantages over individual sentiment analyzers. For English reviews, the ensemble TextBlob+Stanza+Nlptown+Sentistrength achieved both the highest Micro-F1 score (0.733) and the highest Macro-F1 score (0.684), in each case in the TEXT section. For Russian reviews, the same ensemble achieved the highest Micro-F1 score (0.703) in the HEAD+TEXT combination and the highest Macro-F1 score (0.642) in the TEXT section. These findings indicate that the performance of sentiment analyzers depends on both the original language of a review and the review section analyzed.
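To make the Stage 2 idea and the two reported metrics concrete, the following is a minimal sketch, not the procedure used in this study. It assumes three sentiment classes, a simple majority vote as the combination rule, and a neutral fallback on ties; the abstract does not specify any of these. The analyzer outputs and gold labels are invented toy data, and the function names are hypothetical.

```python
from collections import Counter

# Assumption: three sentiment classes; the abstract does not state the label set.
LABELS = ("negative", "neutral", "positive")


def majority_vote(votes, fallback="neutral"):
    """Return the most frequent label; ties (or no votes) give the fallback.

    Assumption: majority voting with a neutral tie-break is only one possible
    way to combine analyzers, chosen here for illustration.
    """
    counts = Counter(votes).most_common()
    if not counts:
        return fallback
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback
    return counts[0][0]


def f1_scores(y_true, y_pred, labels=LABELS):
    """Return (micro_f1, macro_f1) for single-label multiclass predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro-F1 pools the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro-F1 averages the per-class F1 scores with equal class weight.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy per-review labels from four analyzers (invented, not study data).
outputs = {
    "TextBlob":      ["positive", "negative", "neutral",  "positive"],
    "Stanza":        ["positive", "negative", "positive", "negative"],
    "Nlptown":       ["positive", "neutral",  "positive", "negative"],
    "Sentistrength": ["negative", "negative", "neutral",  "neutral"],
}
gold = ["positive", "negative", "positive", "negative"]

# Combine the analyzers review by review, then score the ensemble.
ensemble = [majority_vote(votes) for votes in zip(*outputs.values())]
micro_f1, macro_f1 = f1_scores(gold, ensemble)
print(ensemble)
print(round(micro_f1, 3), round(macro_f1, 3))
```

In a single-label setting such as this sketch, Micro-F1 equals accuracy, whereas Macro-F1 weights every class equally and is therefore pulled down by poor performance on infrequent classes. This is one reason the two scores can differ, and can peak in different review sections.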

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.