A Comparative Performance Analysis of Classification Algorithms for Hypertension Diagnosis
DOI:
https://doi.org/10.37278/sisinfo.v8i1.1491
Keywords:
Naïve Bayes, SVM, Random Forest, XGBoost, Hypertension
Abstract
Hypertension is a leading cause of cardiovascular disease, stroke, and kidney failure, and early diagnosis is critical for prevention. Traditional diagnostic methods often suffer from human error and inconsistent measurements. Machine learning (ML) has been explored as a potential solution, but previous studies have focused mainly on accuracy and often neglected other important metrics such as precision, recall, and F1-score, which matter especially on imbalanced datasets. This research addresses that gap by comprehensively comparing four machine learning algorithms, namely Naive Bayes, Support Vector Machine (SVM), Random Forest (RF), and XGBoost, to provide practical insights for hypertension screening. The dataset consists of 1,985 records with 10 predictor features, both categorical and continuous, and a binary target variable (Has_Hypertension: Yes/No) with a class distribution of 1,032 Yes and 953 No. Preprocessing includes categorical encoding and, for SVM, feature scaling. Models are evaluated using a balanced set of metrics: accuracy, precision, recall, and F1-score. The results show that RF and XGBoost perform best, achieving the highest F1-score and accuracy, while SVM and Naive Bayes serve as competitive alternatives.
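As an illustration of the four evaluation metrics named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. The function name `binary_metrics`, the `BinaryMetrics` container, and the toy label lists are hypothetical and exist only for this example; the toy labels are invented and are not results from the study. The only figures taken from the abstract are the class counts (1,032 Yes, 953 No), which are used to compute the accuracy a trivial always-"Yes" classifier would reach on the full dataset.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive="Yes"):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(accuracy, precision, recall, f1)


# Class distribution reported in the abstract.
N_YES, N_NO = 1032, 953

# Accuracy of always predicting the majority class on the full dataset;
# a useful floor when reading any reported accuracy.
majority_baseline_accuracy = max(N_YES, N_NO) / (N_YES + N_NO)

# Invented toy labels (NOT data from the study), for demonstration only.
toy_true = ["Yes", "Yes", "Yes", "No", "No", "No", "Yes", "No"]
toy_pred = ["Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No"]
toy = binary_metrics(toy_true, toy_pred)

if __name__ == "__main__":
    print(f"Majority-class baseline accuracy: {majority_baseline_accuracy:.4f}")
    print(toy)
```

With 1,032 of 1,985 records labeled Yes, the majority-class baseline sits at roughly 52% accuracy, which is why precision, recall, and F1-score add information that accuracy alone does not.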
License
Copyright (c) 2026 Imannudin Akbar, Titan Parama Yoga, Acep Hendra, Arnold Ropen Sinaga

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish articles in SisInfo : Jurnal Sistem Informasi dan Informatika agree to the following terms:
- Authors retain copyright of the article and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution-ShareAlike (CC BY-SA) License.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
