Muwafa, Fadhil Zahran and Yusliani, Novi and Rachmatullah, Muhammad Naufal (2024) Pemodelan Topik Menggunakan Pre-trained Language Model RoBERTa dan Variational Autoencoder [Topic Modeling Using the Pre-trained Language Model RoBERTa and a Variational Autoencoder]. Undergraduate thesis, Sriwijaya University.
Abstract
The rapid and widespread flow of information makes efficient text data management essential: as more news is published online, organizing and classifying the information in that text becomes increasingly important. Topic modeling helps cluster news texts from the ever-growing volume of online news according to the topic of each text. One topic modeling approach combines a Variational Autoencoder with a pre-trained language model, RoBERTa. This research aims to build a topic modeling system using the pre-trained language model RoBERTa and a Variational Autoencoder. The dataset consists of 5,000 news articles covering 10 different topics, taken from cnnindonesia, kompas, and detik.com. The topic model is evaluated using the coherence score (C_v), homogeneity score, and V-measure. The system achieves a C_v coherence score of 77.3%, a homogeneity score of 6.5%, and a V-measure of 7.1%.
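The two clustering metrics named in the abstract can be illustrated with a minimal sketch. It follows the standard entropy-based definitions of homogeneity and V-measure (with equal weighting of homogeneity and completeness), not necessarily the implementation used in the thesis. The function names and the toy labels are assumptions made for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """H(targets | given), computed from co-occurrence counts."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def homogeneity(true_topics, clusters):
    """1 - H(true | cluster) / H(true): each cluster holds a single topic."""
    h_true = entropy(true_topics)
    if h_true == 0:
        return 1.0
    return 1.0 - conditional_entropy(true_topics, clusters) / h_true


def v_measure(true_topics, clusters):
    """Harmonic mean of homogeneity and completeness."""
    h = homogeneity(true_topics, clusters)
    # Completeness is homogeneity with the two labelings swapped.
    c = homogeneity(clusters, true_topics)
    return 0.0 if h + c == 0 else 2 * h * c / (h + c)


if __name__ == "__main__":
    # Hypothetical labels: six articles, three true topics, three predicted clusters.
    true_topics = ["sport", "sport", "tech", "tech", "health", "health"]
    clusters = [0, 0, 1, 2, 2, 2]
    print("homogeneity:", homogeneity(true_topics, clusters))
    print("v-measure:", v_measure(true_topics, clusters))
```

Both scores range from 0 to 1; the abstract reports them as percentages. A low homogeneity score means the predicted clusters mix articles from several ground-truth topics, which is a separate question from how coherent the top words of each topic are.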
Item Type: | Thesis (Undergraduate) |
---|---|
Uncontrolled Keywords: | Topic Modeling (Pemodelan Topik), Variational Autoencoder, Pre-trained Language Model, RoBERTa, Coherence Score cv, Homogeneity Score, V-Measure |
Subjects: | T Technology > T Technology (General) > T1-995 Technology (General); T Technology > T Technology (General) > T58.5-58.64 Information technology > T58.6.E9 Management information systems -- Congresses. |
Divisions: | 09-Faculty of Computer Science > 55201-Informatics (S1) |
Depositing User: | Fadhil Zahran Muwafa |
Date Deposited: | 25 Apr 2024 08:30 |
Last Modified: | 01 Jul 2024 04:38 |
URI: | http://repository.unsri.ac.id/id/eprint/143437 |