Comparative Analysis of the C4.5 and Naive Bayes Algorithms for News Text Classification

AHMAD, FARIS HARUN and Sazaki, Yoppy and Saputra, Danny Matthew (2019) ANALISA PERBANDINGAN ALGORITMA C4.5 DAN NAIVE BAYES DALAM MELAKUKAN KLASIFIKASI TEKS BERITA. Undergraduate thesis, Sriwijaya University.

Files: RAMA_55201_09021281320012.pdf (full text), RAMA_55201_09021281320012_TURNITIN.pdf, and part files RAMA_55201_09021281320012_0006067406_ 0010058507_01_front_ref.pdf through _07.pdf. All are Accepted Version except part 06 (Bibliography).
Access: only the 01_front_ref file is publicly available; all other files are restricted to repository staff only.
Available under License Creative Commons Public Domain Dedication.

Abstract

Classification is a data mining technique used to predict the group membership of data instances. Text classification is a branch of classification that automatically assigns a set of documents to categories. The C4.5 and Naive Bayes algorithms are often compared in classification tasks because both achieve high accuracy, but generally only when applied to numeric datasets. In this study, the C4.5 and Naive Bayes algorithms use word weighting and pre-processing techniques to predict the classes, and their performance is then compared to see whether they still perform well on text. In the C4.5 algorithm, the threshold, entropy, info, and gain values play an important role in building the decision tree and determine the prediction for each document; variations in the gain value and the frequency of occurrence of each word in the dataset are the key to forming the tuples in the decision tree. In the Naive Bayes algorithm, predictions depend on the posterior value, which is obtained by multiplying all the word weights for each document and comparing them with the training set. With a total of 500 training text documents, the Naive Bayes algorithm achieved a high accuracy of 97.4% and an efficient computing time of 98.45 seconds.
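
The two quantities the abstract refers to, the gain value used by C4.5 and the posterior used by Naive Bayes, can be illustrated with a minimal Python sketch. This is not the thesis implementation: the toy documents, the function names, the word presence/absence split, and the use of raw word counts with add-one smoothing (instead of the thesis's word weighting scheme, which the abstract does not specify) are all assumptions made for illustration. The posterior is computed as a sum of logarithms, which is equivalent to multiplying the per-word probabilities but avoids numeric underflow.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, word):
    """Gain from splitting documents on presence/absence of `word`
    (an assumed binary split; the thesis may split differently)."""
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_word, without_word) if part)
    return entropy(labels) - remainder


def train_naive_bayes(docs, labels):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    vocab = {w for d in docs for w in d}
    return class_counts, word_counts, vocab


def predict_naive_bayes(model, doc):
    """Return the class with the highest log-posterior for `doc`."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n in class_counts.items():
        score = math.log(n / total_docs)  # log prior
        denom = sum(word_counts[c].values()) + len(vocab)
        for w in doc:
            if w in vocab:  # ignore words never seen in training
                # add-one (Laplace) smoothing, an assumption here
                score += math.log((word_counts[c][w] + 1) / denom)
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Hypothetical toy corpus: each document is a list of pre-processed tokens.
docs = [["goal", "match", "team"], ["election", "vote", "party"],
        ["team", "coach", "goal"], ["party", "minister", "vote"]]
labels = ["sport", "politics", "sport", "politics"]

model = train_naive_bayes(docs, labels)
print(information_gain(docs, labels, "goal"))
print(predict_naive_bayes(model, ["vote", "minister"]))
```

In this sketch, a word that separates the classes cleanly yields a higher gain and would be chosen earlier as a decision tree node, while Naive Bayes simply picks the class whose accumulated word evidence is largest.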

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: News classifications, Income Value, Posterior, C4.5 Algorithm, Naïve Bayes Algorithm
Subjects: P Language and Literature > P Philology. Linguistics > P98-98.5 Computational linguistics. Natural language processing
Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation.
Divisions: 09-Faculty of Computer Science > 55201-Informatics (S1)
Date Deposited: 23 Sep 2019 07:43
Last Modified: 23 Sep 2019 07:43
URI: http://repository.unsri.ac.id/id/eprint/8594
