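Pratiwi, Ananda and Desiani, Anita and Amran, Ali (2024) KOMBINASI METODE MICE DAN ADASYN UNTUK PENANGANAN DATA HILANG DAN KETIDAKSEIMBANGAN DATA PADA KLASIFIKASI PENYAKIT JANTUNG [Combination of the MICE and ADASYN Methods for Handling Missing Data and Data Imbalance in Heart Disease Classification]. Undergraduate thesis, Sriwijaya University.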
Files (all of type Text, available under License Creative Commons Public Domain Dedication):
File | Version | Access | Size |
---|---|---|---|
RAMA_44201_08011282025028.pdf | Accepted Version | Restricted to Repository staff only | 10MB |
RAMA_44201_08011282025028_TURNITIN.pdf | Accepted Version | Restricted to Repository staff only | 5MB |
RAMA_44201_08011282025028_0011127702_0013126603_01_front_ref.pdf | Accepted Version | Not restricted | 3MB |
RAMA_44201_08011282025028_0011127702_0013126603_02.pdf | Accepted Version | Restricted to Repository staff only | 1MB |
RAMA_44201_08011282025028_0011127702_0013126603_03.pdf | Accepted Version | Restricted to Repository staff only | 547kB |
RAMA_44201_08011282025028_0011127702_0013126603_04.pdf | Accepted Version | Restricted to Repository staff only | 5MB |
RAMA_44201_08011282025028_0011127702_0013126603_05.pdf | Accepted Version | Restricted to Repository staff only | 139kB |
RAMA_44201_08011282025028_0011127702_0013126603_06_ref.pdf | Bibliography | Restricted to Repository staff only | 873kB |
Abstract
Data quality depends on several factors, among them data completeness and data balance. The University of California Irvine (UCI) heart disease dataset suffers from both missing data and data imbalance, which, if left unaddressed, can reduce model prediction accuracy and lead to errors in data interpretation. Missing data can be handled in several ways, one of which is imputation. When at most 5% of the data are missing, nominal attributes are imputed with the mode; when more than 5% are missing, Multiple Imputation by Chained Equations (MICE) is used. Data imbalance can likewise be addressed in several ways, one of which is oversampling with the Adaptive Synthetic Sampling Approach (ADASYN). To assess how handling missing data and data imbalance affects heart disease classification, performance was tested with the Random Forest, Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP) classification methods. Classification performance improved after both issues were handled: accuracy increased by 18.48%, precision by 18.5%, and recall by 18.4%. These results indicate that the MICE and ADASYN methods can improve classification performance on the UCI heart disease dataset.
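The two rules named in the abstract can be illustrated with a minimal standard-library Python sketch. It is not the authors' code, and several details are assumptions: that the 5% threshold is applied per attribute, that missing cells are stored as `None`, and that the helper names (`plan_imputation`, `impute_mode`, `adasyn_counts`) are hypothetical. MICE itself is not implemented here; columns above the threshold are only flagged for it. The ADASYN part follows the commonly described allocation step, in which minority samples surrounded by more majority-class neighbours receive more synthetic samples, and it stops before generating the samples.

```python
from collections import Counter
import math

MISSING = None      # assumption: missing cells are represented as None
THRESHOLD = 0.05    # the 5% cut-off stated in the abstract


def missing_rate(column):
    """Fraction of cells in a column that are missing."""
    return sum(v is MISSING for v in column) / len(column)


def impute_mode(column):
    """Replace missing cells with the most frequent observed value."""
    observed = [v for v in column if v is not MISSING]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is MISSING else v for v in column]


def plan_imputation(table):
    """Map each column name to 'none', 'mode' or 'mice' by its missing rate."""
    plan = {}
    for name, column in table.items():
        rate = missing_rate(column)
        if rate == 0:
            plan[name] = "none"
        elif rate <= THRESHOLD:
            plan[name] = "mode"
        else:
            plan[name] = "mice"  # would be handed to a MICE implementation
    return plan


def adasyn_counts(X, y, minority, k=5, beta=1.0):
    """Synthetic samples ADASYN would request for each minority point.

    X is a list of numeric tuples, y the matching labels. Only the
    allocation step is shown; no synthetic points are generated.
    """
    n_min = sum(label == minority for label in y)
    n_maj = len(y) - n_min
    total = (n_maj - n_min) * beta  # overall number of samples to create
    ratios = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority:
            continue
        # k nearest neighbours of xi among all other points (Euclidean)
        others = [(math.dist(xi, xj), yj)
                  for j, (xj, yj) in enumerate(zip(X, y)) if j != i]
        others.sort(key=lambda pair: pair[0])
        majority_neighbours = sum(label != minority for _, label in others[:k])
        ratios.append(majority_neighbours / k)
    norm = sum(ratios)
    if norm == 0:
        return [0] * len(ratios)
    return [round(r / norm * total) for r in ratios]
```

With a table such as `{"sex": [...], "thal": [...]}`, `plan_imputation` returns the method per column, and `impute_mode` can then be applied to the columns marked `"mode"`. In practice a library implementation of chained-equation imputation and of ADASYN would normally be used instead of hand-written code.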
Item Type: | Thesis (Undergraduate) |
---|---|
Subjects: | Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation. |
Divisions: | 08-Faculty of Mathematics and Natural Science > 44201-Mathematics (S1) |
Depositing User: | Ananda Pratiwi |
Date Deposited: | 19 Aug 2024 03:58 |
Last Modified: | 19 Aug 2024 03:58 |
URI: | http://repository.unsri.ac.id/id/eprint/154373 |