SPAM CLASSIFICATION IN EMAIL USING THE SUPPORT VECTOR MACHINE METHOD AND ANOMALY DETECTION

Aisyah, Siti and Stiawan, Deris and Ubaya, Huda (2020) KLASIFIKASI SPAM PADA EMAIL MENGGUNAKAN METODE SUPPORT VECTOR MACHINE DAN DETEKSI ANOMALY. Undergraduate thesis, Sriwijaya University.

Abstract

Email is a written communication tool commonly used in everyday life, and spam is one of its main problems. This study applies a machine learning approach, the Support Vector Machine (SVM), to spam classification in email. Two datasets are used: one that has not been vectorized and one that has already been vectorized. For the data that has not been vectorized, the first step is text processing that converts the data into numeric form. Once both datasets are in vectorized form, anomalies are detected with Isolation Forest so that outliers can be removed from the data. The data are then resampled with SMOTE so that the classes become balanced. The last step is classification with the Support Vector Machine, splitting the data with K-Fold Cross Validation and normalizing it with the Min-Max Scaler. The best validation result for the Emails dataset was an average accuracy of 96.80%, Recall 98.70%, Precision 95.12%, F1 Score 96.88%, FPR 5.11%, AUC 96.79%, and Error 3.19%. The best validation result for the Spambase dataset was an average accuracy of 94.08%, Recall 92.55%, Precision 95.31%, F1 Score 93.91%, FPR 4.42%, AUC 94.06%, and Error 5.91%. These results suggest that the proposed method is well suited to spam classification in email.
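
As an illustration of how the metrics named in the abstract relate to one another, the following minimal Python sketch derives them from a binary confusion matrix with spam as the positive class and averages them over folds. It is not taken from the thesis: the names `Confusion`, `metrics`, and `mean_over_folds` and all counts are hypothetical. The AUC line assumes hard 0/1 predictions, in which case the ROC curve has a single interior point and its area reduces to (Recall + 1 - FPR) / 2; the abstract does not state how the thesis computed its AUC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion matrix, spam = positive class (hypothetical helper)."""
    tp: int  # spam predicted as spam
    fp: int  # legitimate mail predicted as spam
    tn: int  # legitimate mail predicted as legitimate
    fn: int  # spam predicted as legitimate


def ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def metrics(c):
    """Return the seven metrics named in the abstract as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = ratio(c.tp + c.tn, total)
    recall = ratio(c.tp, c.tp + c.fn)        # true positive rate
    precision = ratio(c.tp, c.tp + c.fp)
    fpr = ratio(c.fp, c.fp + c.tn)           # false positive rate
    f1 = ratio(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "fpr": fpr,
        # Assumption: AUC of a single-threshold (hard-label) classifier.
        "auc": (recall + (1.0 - fpr)) / 2.0,
        "error": 1.0 - accuracy,
    }


def mean_over_folds(folds):
    """Average each metric across the confusion matrices of K folds."""
    per_fold = [metrics(c) for c in folds]
    return {name: sum(m[name] for m in per_fold) / len(per_fold)
            for name in per_fold[0]}


if __name__ == "__main__":
    # Hypothetical per-fold counts, for illustration only.
    folds = [
        Confusion(tp=90, fp=5, tn=95, fn=10),
        Confusion(tp=100, fp=0, tn=100, fn=0),
    ]
    for name, value in mean_over_folds(folds).items():
        print(f"{name}: {value:.2%}")
```

Calling `metrics` on one fold's counts gives that fold's values, and `mean_over_folds` gives the kind of per-metric average the abstract reports. Because Error is defined here as 1 - accuracy, the two always sum to 100% up to rounding.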

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Spam Email, Support Vector Machine, Isolation Forest, SMOTE, Classification
Subjects: Q Science > Q Science (General) > Q300-390 Cybernetics > Q325.5 Machine learning
Divisions: 09-Faculty of Computer Science > 56201-Computer Systems (S1)
Depositing User: Siti Aisyah
Date Deposited: 03 Aug 2020 07:34
Last Modified: 03 Aug 2020 07:34
URI: http://repository.unsri.ac.id/id/eprint/32024
