The application of IndoBERT in tourist sentiment analysis: a comparative evaluation with SVM and LSTM

Authors

  • Yoannes Romando Sipayung, Department of Informatics Engineering, Faculty of Computer Science and Education, Universitas Ngudi Waluyo, Indonesia
  • Sri Mujiyono, Department of Informatics Engineering, Faculty of Computer Science and Education, Universitas Ngudi Waluyo, Indonesia
  • Anni Malihatul Hawa, Department of Informatics Engineering, Faculty of Computer Science and Education, Universitas Ngudi Waluyo, Indonesia

DOI:

https://doi.org/10.52465/joscex.v7i2.42

Keywords:

IndoBERT, SVM, LSTM, Tourism, YouTube, Sentiment analysis

Abstract

YouTube comments are a valuable source of public opinion about tourist destinations, but their informal, unstructured language makes sentiment analysis difficult, so an automatic sentiment classification approach is needed to support tourism evaluation and promotion strategies. This study analyzes tourist sentiment toward tourism in the Bangka Belitung Islands using YouTube comments and compares three models: IndoBERT, SVM, and LSTM. The dataset consisted of 1,000 YouTube comments, of which 913 remained valid after preprocessing (data cleaning, case folding, normalization, tokenization, and stopword removal). The valid comments comprised 434 neutral, 333 positive, and 146 negative comments, an imbalanced class distribution. Model performance was evaluated with accuracy, precision, recall, and F1-score derived from a confusion matrix. IndoBERT performed best, with an accuracy of 0.71 and the highest F1-score of the three models. SVM was fairly stable with an accuracy of 0.69, while LSTM reached an accuracy of 0.68 and performed worse on the minority class. These results indicate that the transformer-based model captures linguistic context more effectively than the classical machine learning (SVM) and recurrent deep learning (LSTM) models. The study is expected to contribute to the development of social-media-based sentiment analysis in the tourism sector.
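
As an illustration of the evaluation described in the abstract, the following is a minimal Python sketch of how class shares and confusion-matrix metrics can be computed. It is not the authors' code. The class counts are the ones reported above; the 3x3 matrix `TOY_CM` and all function names are hypothetical and chosen only to show the arithmetic, and they do not represent any result from the study.

```python
from typing import Dict, List

# Class counts reported in the abstract (913 valid comments after preprocessing).
CLASS_COUNTS: Dict[str, int] = {"neutral": 434, "positive": 333, "negative": 146}
LABELS: List[str] = ["neutral", "positive", "negative"]

# Hypothetical confusion matrix, invented for illustration only.
# TOY_CM[i][j] = number of samples whose true class is i and predicted class is j.
TOY_CM: List[List[int]] = [
    [5, 1, 0],
    [1, 3, 1],
    [0, 1, 2],
]


def class_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each class's fraction of the total number of samples."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Precision, recall, and F1-score for each class, one-vs-rest."""
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(len(labels))) - tp  # column total minus TP
        fn = sum(cm[k]) - tp                                  # row total minus TP
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def macro_f1(cm: List[List[int]], labels: List[str]) -> float:
    """Unweighted mean of per-class F1, so the minority class counts equally."""
    scores = per_class_metrics(cm, labels)
    return sum(m["f1"] for m in scores.values()) / len(labels)


if __name__ == "__main__":
    print(class_shares(CLASS_COUNTS))
    print(accuracy(TOY_CM), macro_f1(TOY_CM, LABELS))
```

With the reported counts, the neutral class makes up 434/913, or roughly 48%, of the valid comments, so a classifier that always predicted "neutral" would already reach about that accuracy on the full dataset. This is why per-class precision, recall, and F1-score are worth reading alongside accuracy when the classes are imbalanced.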

Published

02-05-2026

Section

Articles

How to Cite

The application of IndoBERT in tourist sentiment analysis: a comparative evaluation with SVM and LSTM. (2026). Journal of Soft Computing Exploration, 7(2), 217-228. https://doi.org/10.52465/joscex.v7i2.42