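FIDELA, ALIFAH and Stiawan, Deris and Septian, Tri Wanda (2022) KLASIFIKASI PDF MALWARE PADA GARBA RUJUKAN DIGITAL (GARUDA) KEMDIKBUD DIKTI DENGAN METODE RANDOM FOREST [Classification of PDF Malware in Garba Rujukan Digital (GARUDA) Kemdikbud Dikti Using the Random Forest Method]. Undergraduate thesis, Sriwijaya University.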
Text
RAMA_56201_09011281823039.pdf - Accepted Version Restricted to Repository staff only Available under License Creative Commons Public Domain Dedication. Download (1MB) | Request a copy |
Text
RAMA_56201_09011281823039_TURNITIN.pdf - Accepted Version Restricted to Repository staff only Available under License Creative Commons Public Domain Dedication. Download (3MB) | Request a copy |
Text
RAMA_56201_09011281823039_0003047905_0028098902_01_front_ref.pdf - Accepted Version Available under License Creative Commons Public Domain Dedication. Download (752kB) | Preview |
Text
RAMA_56201_09011281823039_0003047905_0028098902_02.pdf - Accepted Version Restricted to Repository staff only Available under License Creative Commons Public Domain Dedication. Download (1MB) | Request a copy |
Text
RAMA_56201_09011281823039_0003047905_0028098902_03.pdf - Accepted Version Restricted to Repository staff only Available under License Creative Commons Public Domain Dedication. Download (615kB) | Request a copy |
Text
RAMA_56201_09011281823039_0003047905_0028098902_04.pdf - Accepted Version Restricted to Repository staff only Available under License Creative Commons Public Domain Dedication. Download (261kB) | Request a copy |
Text
RAMA_56201_09011281823039_0003047905_0028098902_05.pdf - Accepted Version Restricted to Repository staff only Available under License Creative Commons Public Domain Dedication. Download (48kB) | Request a copy |
Text
RAMA_56201_09011281823039_0003047905_0028098902_06_ref.pdf - Bibliography Restricted to Repository staff only Available under License Creative Commons Public Domain Dedication. Download (139kB) | Request a copy |
Text
RAMA_56201_09011281823039_0003047905_0028098902_07_lamp.pdf - Accepted Version Restricted to Repository staff only Available under License Creative Commons Public Domain Dedication. Download (996kB) | Request a copy |
Abstract
The Portable Document Format (PDF) is one of the most commonly used document formats, and its object structure is flexible and easy to use. Attackers therefore use PDF files to deliver attacks. The dataset for this research comes from Garba Rujukan Digital (GARUDA) and consists of a collection of PDF files. Features are extracted from the PDF files with the PDFiD tool and used in a multiclass classification process. Because the dataset is imbalanced, it is resampled using oversampling with SMOTE and undersampling with NearMiss. Classification with the Random Forest method yields an accuracy of 99.94%, a precision of 99.95%, a recall of 99.94%, an F1-score of 99.94%, and an out-of-bag (OOB) error of 0.06%. The model's accuracy was then validated with stratified k-fold cross-validation; the highest average accuracy, 99.74%, was obtained with 7 folds.
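The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is interpolated between an existing minority sample and one of its nearest minority neighbours. This is not the thesis's implementation, which is not shown in this record. The function name `smote_oversample`, its parameters, and the toy feature vectors are assumptions made for illustration only.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper: each synthetic point lies on the segment between
    a randomly chosen minority sample and one of its k nearest minority
    neighbours (Euclidean distance).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Assumed toy data: two numeric features per file, standing in for
# keyword counts of the kind PDFiD reports. Not taken from the thesis.
majority = [(0, 1), (1, 0), (0, 0), (1, 1), (2, 1), (1, 2)]
minority = [(8, 9), (9, 8), (10, 10)]

# Generate enough synthetic samples to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice a maintained library implementation would normally be preferred over a hand-written one. The sketch only shows why the synthetic samples stay inside the region already occupied by the minority class.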
Item Type: | Thesis (Undergraduate) |
---|---|
Uncontrolled Keywords: | Random Forest, PDF Malware, Multiclass, PDFiD. |
Subjects: | Q Science > Q Science (General) > Q300-390 Cybernetics > Q325.5 Machine learning Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation. Q Science > QA Mathematics > QA75-76.95 Calculating machines > QA76.9.A25 Computer security. Systems and Data Security. T Technology > T Technology (General) > T1-995 Technology (General) |
Divisions: | 09-Faculty of Computer Science > 56201-Computer Systems (S1) |
Depositing User: | Alifah Fidela |
Date Deposited: | 16 Jan 2023 06:40 |
Last Modified: | 16 Jan 2023 06:40 |
URI: | http://repository.unsri.ac.id/id/eprint/86331 |