Comparison of Chi-Square and Information Gain Feature Selection Methods for Support Vector Machine-Based Sentiment Analysis
Case Study: Vidio Application Reviews on Google Play Store
DOI:
https://doi.org/10.37278/sisinfo.v8i1.1352
Keywords:
Sentiment Analysis, Support Vector Machine, Chi-Square, Information Gain, Vidio
Abstract
Vidio is a local streaming platform that dominates the Indonesian market, but it still faces challenges in improving user satisfaction, as reflected by its 3.5 rating. Improving the application requires insight into user experience, which can be obtained through sentiment analysis. This study analyzes the sentiment of Vidio application user reviews and compares the performance of the Support Vector Machine model under Chi-Square and Information Gain feature selection. The dataset comprises 4,670 reviews collected from July 1 to November 30, 2024. Models are evaluated with the Balanced Accuracy metric, optimized through hyperparameter tuning, to ensure a fair assessment on imbalanced data. The experimental results show that Chi-Square feature selection yields the best performance, with a peak Balanced Accuracy of 94.78%. Notably, this result was attained with a computationally efficient linear kernel. In contrast, the Information Gain method yielded a lower Balanced Accuracy of 94.20% despite using a more complex polynomial kernel. These findings indicate that Chi-Square provides a better trade-off between classification accuracy and model complexity, making it the more robust choice for sentiment analysis.
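The three quantities named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the two-class (positive/negative) setup, and all counts and labels below are hypothetical and chosen only for illustration. The sketch scores a single term with the Chi-Square statistic and with Information Gain from a 2x2 term/class contingency table, and shows why Balanced Accuracy (the mean of per-class recall) is preferred over plain accuracy on imbalanced data.

```python
import math


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: positive reviews containing the term
    n10: negative reviews containing the term
    n01: positive reviews without the term
    n00: negative reviews without the term
    """
    n = n11 + n10 + n01 + n00
    num = n * (n11 * n00 - n10 * n01) ** 2
    den = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return num / den if den else 0.0


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(n11, n10, n01, n00):
    """Information gain of a term: H(class) - H(class | term present/absent)."""
    n = n11 + n10 + n01 + n00
    h_class = entropy([n11 + n01, n10 + n00])
    h_cond = 0.0
    for pos, neg in ((n11, n10), (n01, n00)):
        if pos + neg:
            h_cond += (pos + neg) / n * entropy([pos, neg])
    return h_class - h_cond


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical term that appears mostly in positive reviews.
print(chi_square(40, 10, 10, 40))        # 36.0
print(information_gain(40, 10, 10, 40))  # about 0.278 bits

# Hypothetical imbalanced test set: 9 negative reviews, 1 positive.
y_true = ["neg"] * 9 + ["pos"]
y_pred = ["neg"] * 10  # a classifier that always predicts the majority class
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

In this toy case the majority-class predictor reaches 90% plain accuracy but only 50% Balanced Accuracy, which is the kind of distortion the metric is meant to avoid. In a feature selection step, each vocabulary term would be scored this way and only the highest-scoring terms kept before training the classifier.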
License
Copyright (c) 2026 Vitta Margaret Sinambela, Herlina Napitupulu, Nurul Gusriani

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish articles in SisInfo : Jurnal Sistem Informasi dan Informatika agree to the following terms:
- Authors retain copyright of the article and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution-ShareAlike (CC BY-SA) License.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
