Decision Tree-Based Diabetes Prediction Using the Pima Indians Diabetes Dataset
DOI:
https://doi.org/10.54259/jdmis.v4i1.7107
Keywords:
Diabetes, Decision Tree, Data Mining, Machine Learning, Classification, Interpretability
Abstract
Diabetes mellitus is a chronic disease characterized by elevated blood glucose levels, and it can lead to serious complications if not treated early. This research aims to predict diabetes using the Decision Tree algorithm on the Pima Indians Diabetes dataset. The research stages are data processing, building a Decision Tree model with the entropy criterion, and evaluating model performance. The model achieved an accuracy of 76.62%. Its confusion matrix contains 83 true negatives, 35 true positives, 16 false positives, and 20 false negatives, so 118 of the 154 test samples were classified correctly. Glucose was the most influential attribute in the diagnosis, followed by BMI and Age. The resulting model produces clear, easy-to-understand decision rules, so it can serve as a decision support system for the early diagnosis of diabetes.
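The reported accuracy can be checked directly against the four confusion-matrix counts given in the abstract. The following is a minimal sketch in plain Python; the `ConfusionMatrix` class and its method names are hypothetical helpers introduced here for illustration, and the precision, recall, specificity, and F1 values it derives are not reported in the abstract itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion-matrix counts (hypothetical helper, not from the paper)."""
    tn: int  # true negatives
    fp: int  # false positives
    fn: int  # false negatives
    tp: int  # true positives

    @property
    def total(self) -> int:
        return self.tn + self.fp + self.fn + self.tp

    def accuracy(self) -> float:
        # Share of all samples that were classified correctly.
        return (self.tp + self.tn) / self.total

    def precision(self) -> float:
        # Share of predicted positives that are actually positive.
        return self.tp / (self.tp + self.fp)

    def recall(self) -> float:
        # Share of actual positives that were detected (sensitivity).
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that were correctly rejected.
        return self.tn / (self.tn + self.fp)

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# Counts as stated in the abstract.
cm = ConfusionMatrix(tn=83, fp=16, fn=20, tp=35)

if __name__ == "__main__":
    print(f"samples:     {cm.total}")
    print(f"accuracy:    {cm.accuracy():.2%}")
    print(f"precision:   {cm.precision():.2%}")
    print(f"recall:      {cm.recall():.2%}")
    print(f"specificity: {cm.specificity():.2%}")
    print(f"F1 score:    {cm.f1():.4f}")
```

The accuracy computed this way is 118/154, which rounds to the 76.62% stated above. The 20 false negatives are the diabetic cases the model missed, so recall is arguably the figure that matters most if the model is used for early screening.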
License
Copyright (c) 2026 Yustri Insani, Marcel Filemon Naibaho, Sardo Pardingotan Sipayung

This work is licensed under a Creative Commons Attribution 4.0 International License.