FITRIYANI, KIAGUS MUHAMMAD EFAN and Yusliani, Novi and Marieska, Mastura Diana (2024) FINE-TUNING INDOBERT UNTUK KLASIFIKASI KATEGORI BERITA BERBAHASA INDONESIA. Undergraduate thesis, Sriwijaya University.
RAMA_55201_09021282126039.pdf - Accepted Version (3MB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_TURNITIN.pdf - Accepted Version (6MB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_0008118205_0021038607_01_front_ref.pdf - Accepted Version (6MB). License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_0008118205_0021038607_02.pdf - Accepted Version (4MB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_0008118205_0021038607_03.pdf - Accepted Version (3MB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_0008118205_0021038607_04.pdf - Accepted Version (7MB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_0008118205_0021038607_05.pdf - Accepted Version (5MB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_0008118205_0021038607_06.pdf - Accepted Version (304kB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_0008118205_0021038607_07_ref.pdf - Bibliography (1MB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
RAMA_55201_09021282126039_0008118205_0021038607_08_lamp.pdf - Accepted Version (1MB). Restricted to Repository staff only. License: Creative Commons Public Domain Dedication.
Abstract
The number of Indonesian news articles available on the internet has grown substantially, making it harder to identify and categorize news accurately. This research addresses the problem by building a classification system for Indonesian news article categories, fine-tuning the pre-trained IndoBERT model. The dataset consists of 31,993 articles in five news categories: education, health, technology, sports, and automotive. The articles were collected by web scraping from kompas.com and detik.com, two of the largest and most trusted online news portals. Fine-tuning was run in 8 scenarios, each a combination of dataset type, learning rate, and batch size. In testing, scenario 2 achieved the highest accuracy, 98.37%, with a learning rate of 2e-5 and a batch size of 32.
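The scenario design described in the abstract can be illustrated with a minimal sketch. It assumes the 8 scenarios form a 2 x 2 x 2 grid over dataset type, learning rate, and batch size; the abstract does not say so explicitly. Only the learning rate 2e-5 and the batch size 32 come from the abstract. The dataset type labels, the second learning rate, the second batch size, and the scenario numbering are hypothetical placeholders, and the `Scenario`, `build_scenarios`, and `accuracy` names are invented for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    number: int
    dataset_type: str
    learning_rate: float
    batch_size: int


# Hypothetical values: the abstract names only lr=2e-5 and batch_size=32.
DATASET_TYPES = ("dataset_type_a", "dataset_type_b")  # placeholder labels
LEARNING_RATES = (2e-5, 5e-5)                         # 5e-5 is an assumption
BATCH_SIZES = (16, 32)                                # 16 is an assumption


def build_scenarios():
    """Enumerate every combination of the three settings, numbered from 1."""
    grid = product(DATASET_TYPES, LEARNING_RATES, BATCH_SIZES)
    return [
        Scenario(number, dataset_type, learning_rate, batch_size)
        for number, (dataset_type, learning_rate, batch_size)
        in enumerate(grid, start=1)
    ]


def accuracy(predicted, actual):
    """Fraction of predictions that equal the true category labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


scenarios = build_scenarios()
for scenario in scenarios:
    print(scenario)
```

Each scenario would then be fine-tuned and scored on the held-out test set with a metric such as `accuracy`, and the scenario with the highest score reported. The numbering produced here follows the iteration order of `itertools.product` and is not claimed to match the numbering used in the thesis.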
Item Type: | Thesis (Undergraduate) |
---|---|
Uncontrolled Keywords: | News Classification, Pre-trained Model, IndoBERT, Fine-tuning, Web Scraping, Accuracy |
Subjects: | Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation. |
Divisions: | 09-Faculty of Computer Science > 55201-Informatics (S1) |
Depositing User: | Kiagus Muhammad Efan Fitriyan |
Date Deposited: | 07 Jan 2025 01:56 |
Last Modified: | 07 Jan 2025 01:56 |
URI: | http://repository.unsri.ac.id/id/eprint/162736 |