SENTIMENT ANALYSIS OF MOVIE REVIEWS ON IMDB USING THE INFORMATION GAIN FEATURE SELECTION METHOD AND THE SUPPORT VECTOR MACHINE (SVM) ALGORITHM

Hafizh, Muhammad and Utami, Alvi Syahrini and Rizqie, M. Qurhanul (2023) ANALISIS SENTIMEN REVIEW MOVIE PADA IMDB MENGGUNAKAN METODE SELEKSI FITUR INFORMATION GAIN DAN ALGORITMA SUPPORT VECTOR MACHINE (SVM). Undergraduate thesis, Sriwijaya University.

Abstract

IMDb is a website that provides information about movies, including user-generated movie reviews. These reviews are expressed as textual data in the form of comment text. However, the large number of features in review text makes the data ambiguous, which complicates sentiment analysis. To address this challenge, this research applies the Information Gain feature selection method to reduce the high feature dimensionality in sentiment analysis of IMDb movie reviews. The test results indicate that combining Information Gain feature selection with a linear-kernel SVM at a parameter value of C = 1 yields the highest performance, with accuracy, precision, recall, and F-measure of 0.88, 0.88, 0.87, and 0.87, respectively. Feature selection also reduces the number of features from 21,989 to 5,869 and brings the computation time down to 0.12 seconds. In contrast, the SVM algorithm without feature selection, using all 21,989 features, performed worse, with an accuracy of 0.83, precision of 0.84, recall of 0.84, F-measure of 0.83, and a computation time of 2.25 seconds. These outcomes indicate that appropriate parameter selection and the application of Information Gain feature selection can improve the efficiency, effectiveness, and accuracy of sentiment analysis. This study seeks to improve methods for sentiment analysis on text data with a large number of features.
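
The abstract does not show how Information Gain scores are computed, so the following is only a minimal sketch of the general technique, not the thesis's implementation. It assumes binary term presence as the feature representation and uses a four-review toy corpus invented for illustration; the function names `entropy`, `information_gain`, and `select_top_terms` are hypothetical. Training the SVM on the selected features is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """IG(term) = H(class) - H(class | term present or absent).

    Assumption: each document is a set of tokens, so a term is
    either present or absent (binary feature).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


def select_top_terms(docs, labels, k):
    """Return the k terms with the highest Information Gain.

    Ties keep alphabetical order because the vocabulary is sorted
    first and Python's sort is stable.
    """
    vocab = sorted(set().union(*docs))
    ranked = sorted(
        vocab,
        key=lambda t: information_gain(docs, labels, t),
        reverse=True,
    )
    return ranked[:k]


# Toy corpus, invented for illustration only (not the IMDb data).
reviews = [
    ("great movie loved it", "pos"),
    ("great acting and great story", "pos"),
    ("boring movie hated it", "neg"),
    ("boring plot and weak story", "neg"),
]
docs = [set(text.split()) for text, _ in reviews]
labels = [label for _, label in reviews]

# Keep only the two most informative terms.
print(select_top_terms(docs, labels, 2))
```

In this toy corpus, "great" and "boring" each separate the two classes perfectly and so receive the maximum score, while "movie" appears equally in both classes and scores zero. Dropping low-scoring terms of this kind is the sort of reduction the abstract reports when the feature count falls from 21,989 to 5,869.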

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Sentiment Analysis, Support Vector Machine (SVM), Information Gain
Subjects: T Technology > T Technology (General) > T1-995 Technology (General) > T15 General works
Divisions: 09-Faculty of Computer Science > 55201-Informatics (S1)
Depositing User: Muhammad Hafizh
Date Deposited: 22 Nov 2023 08:30
Last Modified: 22 Nov 2023 08:30
URI: http://repository.unsri.ac.id/id/eprint/130885
