A Soft Voting Ensemble Classifier to Improve Survival Rate Predictions of Cardiovascular Heart Failure Patients


Arif Munandar(1); Wiga Maulana Baihaqi(2*); Ade Nurhopipah(3);

(1) Universitas Amikom Purwokerto
(2) Universitas Amikom Purwokerto
(3) Universitas Amikom Purwokerto
(*) Corresponding Author

  

Abstract


Cardiovascular disease is one of the deadliest diseases, claiming around 17 million lives worldwide each year. According to the World Health Organization (WHO), more than four out of five deaths from cardiovascular disease are caused by heart attacks and strokes, and one-third of these deaths occur prematurely in people under the age of 70. Machine learning approaches can be used to detect the disease. This research aims to improve a model for predicting the survival of cardiovascular heart failure patients using the C4.5, K-Nearest Neighbors (KNN), and Logistic Regression algorithms together with the Voting Classifier ensemble learning method. In testing, each model showed a significant increase in accuracy at the 70:30 train-test ratio: Logistic Regression and C4.5 both reached 89.47%, KNN reached 91.23%, and the Voting Classifier improved considerably, reaching 94.74%. Across the 90:10, 80:20, and 70:30 ratios, KNN achieved high accuracy but overfit substantially, with a difference of 7-9% between training and testing accuracy at the 90:10 and 80:20 ratios. The Voting Classifier, by contrast, performed stably at the 70:30 ratio, with a difference between training and testing accuracy below 1%. We conclude that the Voting Classifier improves on Logistic Regression, KNN, and C4.5 when classifying the survival expectancy of cardiovascular heart failure patients as 'Survived' or 'Deceased'.
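
The soft voting rule named in the title can be illustrated with a minimal sketch. It assumes each base model outputs a probability for each of the two classes; the function names and the probability values below are hypothetical and chosen only for illustration, and they are not taken from the study. Soft voting averages the class probabilities across models and picks the class with the highest average, whereas hard voting counts each model's single predicted label.

```python
from collections import Counter
from typing import Dict, List, Optional

Proba = Dict[str, float]  # class label -> predicted probability


def soft_vote(probas: List[Proba], weights: Optional[List[float]] = None) -> str:
    """Return the label with the highest (weighted) average probability."""
    if not probas:
        raise ValueError("at least one model output is required")
    if weights is None:
        weights = [1.0] * len(probas)  # equal weights by default
    if len(weights) != len(probas):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    averaged = {
        label: sum(w * p[label] for w, p in zip(weights, probas)) / total
        for label in probas[0]
    }
    return max(averaged, key=averaged.get)


def hard_vote(probas: List[Proba]) -> str:
    """Return the label predicted by the majority of models."""
    votes = [max(p, key=p.get) for p in probas]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical outputs for one patient (illustrative values, not study data).
example = [
    {"Survived": 0.55, "Deceased": 0.45},  # e.g. Logistic Regression
    {"Survived": 0.60, "Deceased": 0.40},  # e.g. KNN
    {"Survived": 0.10, "Deceased": 0.90},  # e.g. decision tree
]

print(hard_vote(example))  # Survived (two of three models lean that way)
print(soft_vote(example))  # Deceased (average probability 0.583 vs 0.417)
```

In this example two models weakly favor 'Survived' while one strongly favors 'Deceased', so the two rules disagree; soft voting lets a confident model outweigh two uncertain ones. scikit-learn provides the same rule through `VotingClassifier(voting="soft")`; whether the study used that implementation is an assumption here.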


Keywords


Cardiovascular; C4.5; Ensemble Learning; K-Nearest Neighbors; Logistic Regression; Machine Learning; Voting Classifier

  
  


Digital Object Identifier

doi  https://doi.org/10.33096/ilkom.v15i2.1632.344-352
  



Copyright (c) 2023 Arif Munandar, Wiga Maulana Baihaqi, Ade Nurhopipah

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.