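KHAIRULLAH, BARIQ and Abdiansah, Abdiansah (2025) IMPLEMENTASI MULTI-LABEL TEXT CLASSIFICATION MENGGUNAKAN INDOBERT UNTUK KLASIFIKASI GENRE FILM BERDASARKAN SINOPSIS BERBAHASA INDONESIA [Implementation of Multi-Label Text Classification Using IndoBERT for Film Genre Classification Based on Indonesian-Language Synopses]. Undergraduate thesis, Sriwijaya University.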
Image: RAMA_55201_09021282126094_cover.jpg - Cover Image (114kB)
Text: RAMA_55201_09021282126094.pdf - Accepted Version, restricted to repository staff only (7MB)
Text: RAMA_55201_09021282126094_TURNITIN.pdf - Accepted Version, restricted to repository staff only (15MB)
Text: RAMA_55201_09021282126094_0001108401_01_front_ref.pdf - Accepted Version (612kB)
Text: RAMA_55201_09021282126094_0001108401_02.pdf - Accepted Version, restricted to repository staff only (4MB)
Text: RAMA_55201_09021282126094_0001108401_03.pdf - Accepted Version, restricted to repository staff only (591kB)
Text: RAMA_55201_09021282126094_0001108401_04.pdf - Accepted Version, restricted to repository staff only (1MB)
Text: RAMA_55201_09021282126094_0001108401_05.pdf - Accepted Version, restricted to repository staff only (1MB)
Text: RAMA_55201_09021282126094_0001108401_06.pdf - Accepted Version, restricted to repository staff only (196kB)
Text: RAMA_55201_09021282126094_0001108401_07_ref.pdf - Accepted Version, restricted to repository staff only (260kB)
Text: RAMA_55201_09021282126094_0001108401_08_lamp.pdf - Accepted Version, restricted to repository staff only (189kB)
All files are available under License Creative Commons Public Domain Dedication.
Abstract
This research is motivated by the rapid growth of the Indonesian film industry and the growing need for an accurate film genre classification system that helps viewers understand film content more precisely. The synopsis is often the main reference for identifying a film's genres, but the complexity and diversity of language make classification challenging. To address this, the study uses IndoBERT as the main method for multi-label classification of film genres from Indonesian-language synopses. IndoBERT was chosen for its ability to capture the context and structure of the Indonesian language in depth. The dataset consists of 1,738 films across five main genres: Drama, Comedy, Horror, Action, and Romance. In addition, the study applies two non-core optimization techniques, Dynamic Thresholding and Per-class Performance Tracking, to address data imbalance between genres and improve prediction accuracy. The IndoBERT model achieved an accuracy of 79.23% and a Macro F1-score of 57.15%, an increase of 6.21 percentage points over the initial baseline of 73.03%. Horror recorded the best performance of the five genres, with an accuracy of 88.89% and an F1-score of 77.17%, supported by distinctive and consistent linguistic features. The model was integrated into a Streamlit-based web application with an average response time under 0.25 seconds, indicating that IndoBERT is effective and efficient for film genre classification in real-world scenarios.
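The abstract does not spell out how Dynamic Thresholding or Per-class Performance Tracking are implemented, so the following is only a minimal sketch of one common reading: each genre gets its own decision threshold, chosen on validation data to maximize that genre's F1, and per-genre F1 scores are then averaged into a Macro F1. The logits and labels are made-up toy values, the candidate thresholds are arbitrary, and all function names (`tune_thresholds`, `predict`, `macro_f1`) are hypothetical rather than taken from the thesis.

```python
import math

GENRES = ["Drama", "Comedy", "Horror", "Action", "Romance"]

def sigmoid(x):
    """Map a raw model score (logit) to an independent per-genre probability."""
    return 1.0 / (1.0 + math.exp(-x))

def f1_score(y_true, y_pred):
    """F1 for one binary label; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def tune_thresholds(probs, labels, candidates=(0.2, 0.3, 0.4, 0.5, 0.6, 0.7)):
    """For each genre, pick the candidate threshold with the best validation F1."""
    thresholds = []
    for j in range(len(GENRES)):
        col_true = [row[j] for row in labels]
        col_prob = [row[j] for row in probs]
        best = max(
            candidates,
            key=lambda t: f1_score(col_true, [int(p >= t) for p in col_prob]),
        )
        thresholds.append(best)
    return thresholds

def predict(probs, thresholds):
    """Multi-label decision: a genre is assigned when its probability meets its threshold."""
    return [[int(p >= t) for p, t in zip(row, thresholds)] for row in probs]

def macro_f1(labels, preds):
    """Return (unweighted mean of per-genre F1, list of per-genre F1)."""
    per_class = [
        f1_score([r[j] for r in labels], [r[j] for r in preds])
        for j in range(len(GENRES))
    ]
    return sum(per_class) / len(per_class), per_class

# Toy validation set (assumed values): one row per synopsis, one column per genre.
val_logits = [
    [1.5, -0.6, -3.0, -2.2, -0.2],
    [-0.4, 0.8, -3.5, -1.4, -0.8],
    [-0.8, -2.2, 2.4, -0.4, -3.0],
    [0.2, -1.4, -0.6, 1.1, -2.2],
]
val_labels = [
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0],
]
val_probs = [[sigmoid(z) for z in row] for row in val_logits]

thresholds = tune_thresholds(val_probs, val_labels)
tuned_macro, tuned_per_class = macro_f1(val_labels, predict(val_probs, thresholds))
fixed_macro, _ = macro_f1(val_labels, predict(val_probs, [0.5] * len(GENRES)))

for genre, t, f1 in zip(GENRES, thresholds, tuned_per_class):
    print(f"{genre}: threshold={t}, F1={f1:.2f}")
print(f"Macro F1 tuned={tuned_macro:.2f} vs fixed 0.5={fixed_macro:.2f}")
```

Because the fixed threshold 0.5 is among the candidates, per-genre tuning can never score worse than it on the data used for tuning; in practice the thresholds would be chosen on a validation split and reported on a separate test split to avoid an optimistic estimate.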
Item Type: | Thesis (Undergraduate) |
---|---|
Uncontrolled Keywords: | IndoBERT, Multi-label Classification, Film Genre, Natural Language Processing, Film Synopsis |
Subjects: | Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation. |
Divisions: | 09-Faculty of Computer Science > 55201-Informatics (S1) |
Depositing User: | Bariq Khairullah |
Date Deposited: | 16 May 2025 01:42 |
Last Modified: | 16 May 2025 01:42 |
URI: | http://repository.unsri.ac.id/id/eprint/172178 |