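RESTI, INDAH CAHYA and Stiawan, Deris and Septian, Tri Wanda (2022) VISUALISASI PDF MALWARE MENGGUNAKAN CLUSTERING K-MEANS PADA LAYANAN GARUDA KEMDIKBUD DIKTI SEBAGAI AGREGATOR NASIONAL [PDF Malware Visualization Using K-Means Clustering on the GARUDA Kemdikbud Dikti Service as a National Aggregator]. Undergraduate thesis, Sriwijaya University.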
RAMA_56201_09011281823046.pdf - Accepted Version (5MB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
RAMA_56201_09011281823046_0003047905_0028098902_01_front_ref.pdf - Accepted Version (1MB). Available under License Creative Commons Public Domain Dedication.
RAMA_56201_09011281823046_0003047905_0028098902_02.pdf - Accepted Version (843kB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
RAMA_56201_09011281823046_0003047905_0028098902_03.pdf - Accepted Version (1MB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
RAMA_56201_09011281823046_0003047905_0028098902_04.pdf - Accepted Version (1MB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
RAMA_56201_09011281823046_0003047905_0028098902_05.pdf - Accepted Version (209kB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
RAMA_56201_09011281823046_0003047905_0028098902_06_ref.pdf - Bibliography (337kB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
RAMA_56201_09011281823046_0003047905_0028098902_07_lamp.pdf - Accepted Version (1MB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
RAMA_56201_09011281823046_TURNITIN.pdf - Accepted Version (6MB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Abstract
K-Means clustering is a method for grouping data by feature similarity and detecting hidden patterns in a dataset. The dataset comes from the GARUDA repository and contains raw PDF files. Features were extracted by static analysis using PDFiD, which produced twenty-one features. Because the GARUDA dataset is multi-class and imbalanced, a SMOTE step was required. K-Means grouped the data into 3 clusters with a silhouette score of 0.71311. The best validation result was obtained using the K-Means labels with a Logistic Regression model under 5-fold cross-validation. The accuracy with the K-Means labels was 94.66%, so K-Means labeling outperformed GARUDA labeling, which reached only 87.16%.
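As a rough illustration of the clustering and scoring steps named in the abstract, the sketch below runs K-Means with 3 clusters on toy data and computes a silhouette score. It is not the thesis code: the 2-D synthetic points stand in for the twenty-one PDFiD features, the farthest-first initialization is a simplification chosen to keep the run deterministic, and the names `kmeans`, `silhouette_score`, and `make_blobs` are hypothetical helpers written for this example. SMOTE and the Logistic Regression validation are not shown.

```python
import math
import random


def make_blobs(centers, n_per_center, sigma=0.5, seed=0):
    """Toy stand-in data: Gaussian blobs around the given 2-D centers (assumption)."""
    rng = random.Random(seed)
    return [(cx + rng.gauss(0, sigma), cy + rng.gauss(0, sigma))
            for cx, cy in centers for _ in range(n_per_center)]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points; singleton clusters score 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)


points = make_blobs([(0, 0), (10, 10), (20, 0)], n_per_center=20)
labels, centroids = kmeans(points, k=3)
score = silhouette_score(points, labels)
print(f"clusters: {len(set(labels))}, silhouette score: {score:.5f}")
```

A silhouette score close to 1 indicates compact, well-separated clusters, which is how a value such as 0.71311 is usually read. On real feature tables, library implementations such as scikit-learn's `KMeans` and `silhouette_score` would normally replace these hand-written helpers.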
Item Type: | Thesis (Undergraduate) |
---|---|
Uncontrolled Keywords: | PDF Malware, Static Analysis, SMOTE, Clustering, K-Means, Silhouette Score, Stratified K-Fold |
Subjects: | Q Science > Q Science (General) > Q300-390 Cybernetics > Q325.5 Machine learning; Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation.; Q Science > QA Mathematics > QA8.9-QA10.3 Computer science. Artificial intelligence. Computational complexity. Data structures (Computer scienc. Mathematical Logic and Formal Languages; T Technology > T Technology (General) > T1-995 Technology (General) |
Divisions: | 09-Faculty of Computer Science > 56201-Computer Systems (S1) |
Depositing User: | Indah Cahya Resti |
Date Deposited: | 17 Jan 2023 04:31 |
Last Modified: | 17 Jan 2023 04:31 |
URI: | http://repository.unsri.ac.id/id/eprint/86535 |