Application of Linear Sampling and Information Gain to the Decision Tree Algorithm for Diabetes Diagnosis

Gunawan Gunawan
Ami Rahmawati
Satia Suhada
Taufik Hidayatulloh
Dede Wintana

Abstract

Diabetes, which ranks among the top 10 causes of death, has become more prevalent over the last 10 years. The increase has been observed mainly in developing countries, among people of lower-to-middle socioeconomic status. In Indonesia, diabetes is among the 10 diseases with the largest number of sufferers. In addition, diabetes is a comorbidity that causes complications in COVID-19 patients. To detect diabetes more quickly and accurately, research is needed that achieves a higher level of detection accuracy. This study uses a public dataset of 520 records taken from the UCI repository and collected at Sylhet Diabetes Hospital, Bangladesh. Classification is carried out using the Decision Tree algorithm, optimized with Linear Sampling and Information Gain. The resulting accuracy is 99.04%, compared with 97.04% in previous research that used only Random Forest.
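The two optimization steps named in the abstract can be illustrated with a minimal sketch. It assumes that "linear sampling" means an order-preserving split of the records into consecutive partitions (as in some data mining tools), which the abstract does not spell out, and that information gain is the usual entropy-based measure. The function names, the toy records, and the 75/25 ratio are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def information_gain(rows, labels, feature_index):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def linear_split(items, train_ratio=0.75):
    """Order-preserving split: the first part trains, the rest tests (no shuffling)."""
    cut = int(len(items) * train_ratio)
    return items[:cut], items[cut:]


# Hypothetical toy records: (symptom_a, symptom_b), not the real dataset.
rows = [("Yes", "Yes"), ("Yes", "No"), ("No", "Yes"), ("No", "No")]
labels = ["Positive", "Positive", "Negative", "Negative"]

# Rank features by information gain; a higher gain means a more useful split.
gains = [information_gain(rows, labels, i) for i in range(2)]
print(gains)  # [1.0, 0.0]

train_rows, test_rows = linear_split(rows)
print(len(train_rows), len(test_rows))  # 3 1
```

In this toy case the first feature separates the classes perfectly (gain 1.0) while the second carries no information (gain 0.0), which is the kind of ranking a feature-selection step would use before the tree is trained.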

Author Biographies

Gunawan Gunawan, Universitas Bina Sarana Informatika

Faculty of Engineering & Informatics
Computer Science Study Program

Ami Rahmawati, Sekolah Tinggi Manajemen Informatika dan Komputer Nusa Mandiri

Information Systems Study Program

Satia Suhada, Sekolah Tinggi Manajemen Informatika dan Komputer Nusa Mandiri

Information Systems Study Program

Taufik Hidayatulloh, Universitas Bina Sarana Informatika

Faculty of Engineering & Informatics
Information Systems Study Program

Dede Wintana, Universitas Bina Sarana Informatika

Faculty of Engineering & Informatics
Computer Science Study Program

How to Cite
Gunawan, G., Rahmawati, A., Suhada, S., Hidayatulloh, T., & Wintana, D. (2022). Penerapan Linear Sampling dan Information Gain pada Algoritma Decision Tree untuk Diagnosis Penyakit Diabetes. MULTINETICS, 7(2), 124–131. https://doi.org/10.32722/multinetics.v7i2.3796

References

  1. WHO, "The Top 10 Causes of Death," World Health Organization, 2020.
  2. WHO, "Diabetes," World Health Organization, 2020.
  3. KEMENKES RI, "INFODATIN Pusat Data dan Informasi Kementerian Kesehatan RI," Kementerian Kesehatan Republik Indonesia, Jakarta Selatan, 2020.
  4. S. S. Pangastuti, "Perbandingan Metode Ensemble Random Forest Dengan Smote-Boosting Dan Smote-Bagging Pada Klasifikasi Data Mining Untuk Kelas Imbalance (Studi Kasus: Data Beasiswa Bidikmisi Tahun 2017 di Jawa Timur) - A Comparison Of The Ensemble Random Forest Methods With Smote-Boosting And Smote-Bagging On Data Mining Classification For Imbalance Class," Doctoral dissertation, Institut Teknologi Sepuluh Nopember, 2018.
  5. N. Nurdiana and A. Algifari, "Studi Komparasi Algoritma ID3 dan Algoritma Naive Bayes Untuk Klasifikasi Penyakit Diabetes Mellitus," INFOTECH Journal, pp. 18-23, 2020.
  6. M. F. Salim and S. , "Analisis Rekam Medis Pasien Diabetes Mellitus Melalui Implementasi Teknik Data Mining di RSUP Dr. Sardjito Yogyakarta," JKesV - Jurnal Kesehatan Vokasional, pp. 167-174, 2017.
  7. F. M. Hana, "Klasifikasi Penderita Penyakit Diabetes Menggunakan Algoritma Decision Tree C4.5," Jurnal Sistem Komputer dan Kecerdasan Buatan, pp. 32-39, 2020.
  8. M. M. F. Islam, R. Ferdousi, S. Rahman and H. Y. Bushra, "Likelihood Prediction of Diabetes at Early Stage Using Data Mining Techniques," in Computer Vision and Machine Intelligence in Medical Image Analysis, 2019.
  9. I. M. P. Dwipayana and I. M. S. Wirawan, Tanya Jawab Seputar Kencing Manis (Diabetes Mellitus) dan Sakit Maag (Gastritis), Ponorogo: Uwais Inspirasi Indonesia, 2018.
  10. H. Tandra, Segala Sesuatu yang harus Anda Ketahui Tentang Diabetes Panduan Lengkap Mengenal dan Mengatasi Diabetes dengan Cepat dan Mudah Edisi Kedua dan Paling Komplit, Jakarta: PT Gramedia Pustaka Utama, 2017.
  11. I. H. Witten, E. Frank, M. A. Hall and C. J. Pal, Data Mining - Practical Machine Learning Tools and Techniques - Fourth Edition, Chennai: Elsevier, 2017.
  12. D. T. Larose, Discovering Knowledge in Data, New Jersey: Wiley-Interscience, 2005.
  13. D. P. Utomo and M. , "Analisis Komparasi Metode Klasifikasi Data Mining dan Reduksi Atribut Pada Dataset Penyakit Jantung," JURNAL MEDIA INFORMATIKA BUDIDARMA, vol. 4, no. 2, pp. 437-444, 2020.
  14. A. P. Ayudhitama and U. Pujianto, "Analisa 4 Algoritma dalam Klasifikasi Penyakit Liver Menggunakan Rapid Miner," JIP (Jurnal Informatika Polinema), vol. 6, no. 2, pp. 1-9, 2020.
  15. L. Rokach and O. Maimon, Data Mining With Decision Trees: Theory and Applications, 2nd Edition, Singapore: World Scientific Publishing, 2015.
  16. H. Fujita and A. Selamat, Advancing Technology Industrialization Through Intelligent Software Methodologies, Tools and Techniques, Netherlands: IOS Press BV, 2019.
  17. B. Makhabel, Learning Data Mining with R, Birmingham, UK: Packt Publishing, 2015.
  18. X. Li and C. Claramunt, "A Spatial Entropy-Based Decision Tree for Classification of Geographical Information," Transactions in GIS, vol. 10, no. 3, pp. 451-467, 2006.
  19. E. Buulolo, Data Mining Untuk Perguruan Tinggi, Yogyakarta: Deepublish, 2020.
  20. S. Tangirala, "Evaluating the Impact of GINI Index and Information Gain on Classification using Decision Tree Classifier Algorithm," (IJACSA) International Journal of Advanced Computer Science and Applications, vol. 11, no. 2, pp. 612-619, 2020.
  21. S. Bahri, A. Wibowo, R. Wajhillah and S. Suhada, Data Mining; Algoritma Klasifikasi dan Penerapannya Dalam Aplikasi, Yogyakarta: Graha Ilmu, 2019.