PEMODELAN TOPIK MENGGUNAKAN BERTOPIC DENGAN KEYBERT UNTUK EKSTRAKSI KATA KUNCI SEBAGAI TOPIC REPRESENTATION TUNING [Topic Modeling Using BERTopic with KeyBERT for Keyword Extraction as Topic Representation Tuning]

PURBA, ERLANGGA NICHOLAS and Yusliani, Novi and Primanita, Anggina (2024) PEMODELAN TOPIK MENGGUNAKAN BERTOPIC DENGAN KEYBERT UNTUK EKSTRAKSI KATA KUNCI SEBAGAI TOPIC REPRESENTATION TUNING. Undergraduate thesis, Sriwijaya University.

Files (all available under License Creative Commons Public Domain Dedication):
RAMA_55201_09021282025049.pdf - Accepted Version, 3MB, restricted to Repository staff only
RAMA_55201_09021282025049_TURNITIN.pdf - Accepted Version, 4MB, restricted to Repository staff only
RAMA_55201_09021282025049_0008118205_0206088901_01_front_ref.pdf - Accepted Version, 1MB
RAMA_55201_09021282025049_0008118205_0206088901_02.pdf - Accepted Version, 961kB, restricted to Repository staff only
RAMA_55201_09021282025049_0008118205_0206088901_03.pdf - Accepted Version, 701kB, restricted to Repository staff only
RAMA_55201_09021282025049_0008118205_0206088901_04.pdf - Accepted Version, 1MB, restricted to Repository staff only
RAMA_55201_09021282025049_0008118205_0206088901_05.pdf - Accepted Version, 493kB, restricted to Repository staff only
RAMA_55201_09021282025049_0008118205_0206088901_06.pdf - Accepted Version, 458kB, restricted to Repository staff only
RAMA_55201_09021282025049_0008118205_0206088901_07_ref.pdf - Bibliography, 326kB, restricted to Repository staff only
RAMA_55201_09021282025049_0008118205_0206088901_08_lamp.pdf - Accepted Version, 417kB, restricted to Repository staff only

Abstract

Social media platforms such as Twitter have become a common medium for online interaction, and users around the world generate a vast number of tweets every day. Reading all of them to determine which topics are trending would take far too long. Topic modeling, a method for discovering topics in a collection of texts, offers an efficient way to extract this information. This research performs topic modeling on Indonesian-language tweets using BERTopic, with KeyBERT extracting keywords for each topic. KeyBERT generates keywords for each topic cluster, and BERTopic uses them to enrich the topic modeling results. The dataset consists of 10,000 Indonesian-language tweets taken from the Twitter account @detikcom, split into 8,000 tweets for training and 2,000 tweets for testing. Topic modeling with BERTopic produced a total of 50 topics. The topic model was evaluated using the coherence score, which averaged 0.765 on the training data and 0.675 on the testing data.
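
To make the evaluation step concrete, the following is a minimal sketch of how a per-topic coherence score can be computed and averaged over topics. The abstract does not state which coherence measure the thesis used, so the choice of NPMI (normalized pointwise mutual information over document co-occurrence) is an assumption made purely for illustration, and its values are not comparable to the reported 0.765 and 0.675. The function name `npmi_coherence`, the toy documents, and the two toy topics are all hypothetical.

```python
import math
from itertools import combinations


def npmi_coherence(top_words, documents):
    """Mean NPMI over all pairs of a topic's top words.

    Probabilities are estimated from document-level co-occurrence.
    NPMI is an assumed measure; the thesis may have used another one.
    """
    if len(top_words) < 2:
        raise ValueError("need at least two top words to form a pair")

    # Represent each document as a set of lowercase whitespace tokens.
    doc_sets = [set(doc.lower().split()) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: lowest NPMI
        elif p12 == 1.0:
            scores.append(1.0)   # words co-occur everywhere: highest NPMI
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical toy corpus standing in for preprocessed tweets.
toy_docs = [
    "harga beras naik",
    "harga beras turun",
    "timnas menang laga",
    "timnas kalah laga",
]

# Hypothetical topics, each given as its list of top keywords.
toy_topics = [["harga", "beras"], ["harga", "timnas"]]

per_topic = [npmi_coherence(words, toy_docs) for words in toy_topics]
average_coherence = sum(per_topic) / len(per_topic)
```

In this toy setup the first topic pairs words that always appear together and the second pairs words that never do, so the per-topic scores sit at the two extremes and the average falls between them. Averaging per-topic scores in this way mirrors how a single figure can be reported separately for training and testing data.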

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Topic Modeling, BERTopic, KeyBERT, Coherence Score, Twitter
Subjects: Q Science > QA Mathematics > QA75-76.95 Calculating machines > QA76.9.B45 Big data. Machine learning. Quantitative research. Metaheuristics.
Divisions: 09-Faculty of Computer Science > 55201-Informatics (S1)
Depositing User: Erlangga Nicholas Purba
Date Deposited: 24 Jun 2024 07:24
Last Modified: 24 Jun 2024 07:24
URI: http://repository.unsri.ac.id/id/eprint/147741
