AISYAH, SITI and Stiawan, Deris and Ubaya, Huda (2020) KLASIFIKASI SPAM PADA EMAIL MENGGUNAKAN METODE SUPPORT VECTOR MACHINE DAN DETEKSI ANOMALY [Spam Classification in Email Using the Support Vector Machine Method and Anomaly Detection]. Undergraduate thesis, Sriwijaya University.
Abstract
E-mail is a written communication tool commonly used in everyday life, and one of its main problems is spam. This study applies a machine learning approach, the Support Vector Machine (SVM), to spam classification on e-mail. Two datasets are used: one that has not been vectorized and one that has already been vectorized. For the data that has not been vectorized, the first step is text processing, which converts the data to numeric form. Once both datasets are in vectorized form, anomalies are detected using Isolation Forest so that outliers can be removed. The data is then resampled using SMOTE so that the classes become balanced. The last step is classification using the Support Vector Machine method, splitting the data with K-Fold Cross Validation and normalizing it with Min Max Scaler. The best validation results for the Emails dataset were an average accuracy of 96.80%, recall of 98.70%, precision of 95.12%, F1 score of 96.88%, FPR of 5.11%, AUC of 96.79%, and error of 3.19%. The best validation results for the Spambase dataset were an average accuracy of 94.08%, recall of 92.55%, precision of 95.31%, F1 score of 93.91%, FPR of 4.42%, AUC of 94.06%, and error of 5.91%. Based on these results, the method used is suitable for spam classification on e-mail.
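The metrics listed in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the thesis's own code: the function name `classification_metrics` and the example counts are hypothetical, spam is assumed to be the positive class, and every denominator is assumed to be non-zero. The abstract does not say how AUC was computed; the sketch shows the value AUC takes when it is computed from hard class labels rather than scores, which is the mean of recall and specificity.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumption of this sketch: spam is the positive class, so
    tp = spam flagged as spam, fp = legitimate mail flagged as spam,
    tn = legitimate mail passed through, fn = spam passed through.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)        # true positive rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)           # false positive rate
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "fpr": fpr,
        # AUC of a single operating point (hard labels, no scores):
        # the mean of recall and specificity (1 - FPR).
        "auc_hard_labels": (recall + (1 - fpr)) / 2,
        "error": 1 - accuracy,
    }


# Hypothetical counts for illustration only; they are not taken from the thesis.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Because error is defined as one minus accuracy, each reported accuracy and error pair should sum to roughly 100%, up to rounding.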
| Item Type: | Thesis (Undergraduate) |
|---|---|
| Uncontrolled Keywords: | Spam Email, Support Vector Machine, Isolation Forest, SMOTE, Classification |
| Subjects: | Q Science > Q Science (General) > Q300-390 Cybernetics > Q325.5 Machine learning |
| Divisions: | 09-Faculty of Computer Science > 56201-Computer Systems (S1) |
| Date Deposited: | 03 Aug 2020 07:34 |
| Last Modified: | 03 Aug 2020 07:34 |
| URI: | http://repository.unsri.ac.id/id/eprint/32024 |