PEMODELAN TOPIK MENGGUNAKAN PRE-TRAINED LANGUAGE MODEL ROBERTA DAN VARIATIONAL AUTOENCODER [Topic Modeling Using the Pre-trained Language Model RoBERTa and Variational Autoencoder]

MUWAFA, FADHIL ZAHRAN and Yusliani, Novi and Rachmatullah, Muhammad Naufal (2024) PEMODELAN TOPIK MENGGUNAKAN PRE-TRAINED LANGUAGE MODEL ROBERTA DAN VARIATIONAL AUTOENCODER. Undergraduate thesis, Sriwijaya University.

Files:
RAMA_55201_09021282025077.pdf - Accepted Version (3MB), restricted to Repository staff only
RAMA_55201_09021282025077_TURNITIN.pdf - Accepted Version (6MB), restricted to Repository staff only
RAMA_55201_09021282025077_0008118205_0001129204_01_front_ref.pdf - Accepted Version (9MB)
RAMA_55201_09021282025077_0008118205_0001129204_02.pdf - Accepted Version (8MB), restricted to Repository staff only
RAMA_55201_09021282025077_0008118205_0001129204_03.pdf - Accepted Version (2MB), restricted to Repository staff only
RAMA_55201_09021282025077_0008118205_0001129204_04.pdf - Accepted Version (11MB), restricted to Repository staff only
RAMA_55201_09021282025077_0008118205_0001129204_05.pdf - Accepted Version (6MB), restricted to Repository staff only
RAMA_55201_09021282025077_0008118205_0001129204_06.pdf - Accepted Version (721kB), restricted to Repository staff only
RAMA_55201_09021282025077_0008118205_0001129204_07_ref.pdf - Bibliography (1MB), restricted to Repository staff only
RAMA_55201_09021282025077_0008118205_0001129204_08_lamp.pdf - Accepted Version (3MB), restricted to Repository staff only
All files are available under License Creative Commons Public Domain Dedication.

Abstract

The rapid and widespread flow of information makes efficient management of text data important: as more news is published online, that text must be organized and classified. Topic modeling helps by clustering online news texts according to the topic of each document. One approach to topic modeling combines a Variational Autoencoder with a pre-trained language model, RoBERTa. This research aims to build a topic modeling system using the pre-trained language model RoBERTa and a Variational Autoencoder. The dataset consists of 5000 news articles covering 10 different topics, taken from cnnindonesia, kompas, and detik.com. The topic model is evaluated using the coherence score cv, the homogeneity score, and the V-measure. The system achieves a coherence score cv of 77.3%, a homogeneity score of 6.5%, and a V-measure of 7.1%.
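
To make two of the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how the homogeneity score and V-measure can be computed from the standard entropy-based definitions. It is not the thesis's own implementation, which this record does not show. The function names (`entropy`, `conditional_entropy`, `homogeneity_completeness_v`) and the toy labels are assumptions made for illustration only, and the V-measure here weights homogeneity and completeness equally.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels), in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given), in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def homogeneity_completeness_v(true_topics, clusters):
    """Return (homogeneity, completeness, V-measure) for one clustering.

    Homogeneity is high when each cluster contains documents from a
    single true topic; completeness is high when each true topic lands
    in a single cluster. V-measure is their harmonic mean.
    """
    h_topics = entropy(true_topics)
    h_clusters = entropy(clusters)
    homogeneity = (
        1.0 if h_topics == 0
        else 1.0 - conditional_entropy(true_topics, clusters) / h_topics
    )
    completeness = (
        1.0 if h_clusters == 0
        else 1.0 - conditional_entropy(clusters, true_topics) / h_clusters
    )
    total = homogeneity + completeness
    v_measure = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v_measure


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the thesis.
    true_topics = ["sports", "sports", "politics", "politics", "tech", "tech"]
    clusters = [0, 0, 1, 2, 2, 2]
    h, c, v = homogeneity_completeness_v(true_topics, clusters)
    print(f"homogeneity={h:.3f} completeness={c:.3f} v_measure={v:.3f}")
```

Under these definitions, a low homogeneity score such as the 6.5% reported above means that knowing a document's cluster removes little of the uncertainty about its true topic, independently of how coherent the top words of each topic are.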

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Pemodelan Topik, Variational Autoencoder, Pre-trained Language Model, RoBERTa, Coherence Score cv, Homogeneity Score, V-Measure
Subjects: T Technology > T Technology (General) > T1-995 Technology (General)
T Technology > T Technology (General) > T58.5-58.64 Information technology > T58.6.E9 Management information systems -- Congresses.
Divisions: 09-Faculty of Computer Science > 55201-Informatics (S1)
Depositing User: Fadhil Zahran Muwafa
Date Deposited: 25 Apr 2024 08:30
Last Modified: 25 Apr 2024 08:30
URI: http://repository.unsri.ac.id/id/eprint/143437
