FINE-TUNING INDOBERT UNTUK KLASIFIKASI KATEGORI BERITA BERBAHASA INDONESIA

Fitriyani, Kiagus Muhammad Efan and Yusliani, Novi and Marieska, Mastura Diana (2024) FINE-TUNING INDOBERT UNTUK KLASIFIKASI KATEGORI BERITA BERBAHASA INDONESIA. Undergraduate thesis, Sriwijaya University.

Files (all available under License Creative Commons Public Domain Dedication):

RAMA_55201_09021282126039.pdf - Accepted Version (3MB), restricted to repository staff only
RAMA_55201_09021282126039_TURNITIN.pdf - Accepted Version (6MB), restricted to repository staff only
RAMA_55201_09021282126039_0008118205_0021038607_01_front_ref.pdf - Accepted Version (6MB)
RAMA_55201_09021282126039_0008118205_0021038607_02.pdf - Accepted Version (4MB), restricted to repository staff only
RAMA_55201_09021282126039_0008118205_0021038607_03.pdf - Accepted Version (3MB), restricted to repository staff only
RAMA_55201_09021282126039_0008118205_0021038607_04.pdf - Accepted Version (7MB), restricted to repository staff only
RAMA_55201_09021282126039_0008118205_0021038607_05.pdf - Accepted Version (5MB), restricted to repository staff only
RAMA_55201_09021282126039_0008118205_0021038607_06.pdf - Accepted Version (304kB), restricted to repository staff only
RAMA_55201_09021282126039_0008118205_0021038607_07_ref.pdf - Bibliography (1MB), restricted to repository staff only
RAMA_55201_09021282126039_0008118205_0021038607_08_lamp.pdf - Accepted Version (1MB), restricted to repository staff only

Abstract

The number of Indonesian news articles available on the internet has grown substantially, making it harder to identify and categorize news accurately. To address this, the study develops a classification system for Indonesian news article categories by fine-tuning the pre-trained IndoBERT model. The dataset consists of 31,993 articles divided into five news categories: education, health, technology, sports, and automotive. The articles were collected by web scraping from kompas.com and detik.com, two of the largest and most trusted online news portals. The fine-tuning process was divided into 8 scenarios, each a combination of dataset type configuration, learning rate, and batch size. In testing, the highest accuracy, 98.37%, was obtained in scenario 2, in which the model was trained with a learning rate of 2e-5 and a batch size of 32.
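As a rough illustration of how such a scenario grid and the accuracy metric might be laid out, the sketch below enumerates 8 combinations and scores a set of predictions. The abstract does not list the full grid, so the 2 x 2 x 2 split, the dataset configuration names, the learning rate 5e-5, and the batch size 16 are assumptions made for this example; only the learning rate 2e-5, the batch size 32, and the five category names come from the abstract. The scenario numbering produced here is not claimed to match the numbering used in the thesis.

```python
from itertools import product

# Category names as listed in the abstract.
CATEGORIES = ["education", "health", "technology", "sports", "automotive"]

# Hypothetical grid: only 2e-5 and 32 are reported in the abstract.
DATASET_TYPES = ["dataset_config_a", "dataset_config_b"]  # assumed names
LEARNING_RATES = [2e-5, 5e-5]                             # 5e-5 is assumed
BATCH_SIZES = [16, 32]                                    # 16 is assumed


def build_scenarios():
    """Return one dict per combination of dataset type, learning rate, batch size."""
    grid = product(DATASET_TYPES, LEARNING_RATES, BATCH_SIZES)
    return [
        {"id": i, "dataset": d, "learning_rate": lr, "batch_size": bs}
        for i, (d, lr, bs) in enumerate(grid, start=1)
    ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


scenarios = build_scenarios()

# Toy labels for demonstration only; these are not results from the thesis.
toy_true = ["sports", "health", "technology", "education"]
toy_pred = ["sports", "health", "automotive", "education"]
toy_accuracy = accuracy(toy_true, toy_pred)
```

In an actual run, each scenario dict would be passed to a training routine, and `accuracy` would be applied to the model's predictions on the held-out test split.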

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: News Classification, Pre-trained Model, IndoBERT, Fine-tuning, Web Scraping, Accuracy
Subjects: Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation.
Divisions: 09-Faculty of Computer Science > 55201-Informatics (S1)
Depositing User: Kiagus Muhammad Efan Fitriyan
Date Deposited: 07 Jan 2025 01:56
Last Modified: 07 Jan 2025 01:56
URI: http://repository.unsri.ac.id/id/eprint/162736
