KLASIFIKASI UJARAN KEBENCIAN MULTI LABEL MENGGUNAKAN ARSITEKTUR LONG SHORT TERM MEMORY DAN TRANSFORMER DENGAN ARSITEKTUR BACK TRANSLATION DAN BERT

PRATIWI, PUTRI and Desiani, Anita and Suprihatin, Bambang (2025) KLASIFIKASI UJARAN KEBENCIAN MULTI LABEL MENGGUNAKAN ARSITEKTUR LONG SHORT TERM MEMORY DAN TRANSFORMER DENGAN ARSITEKTUR BACK TRANSLATION DAN BERT. Undergraduate thesis, Sriwijaya University.

Files:
- RAMA_44201_08011282126028_COVER.jpg — Cover Image, 377 kB. Available under License Creative Commons Public Domain Dedication.
- RAMA_44201_08011282126028.pdf — Text, 8 MB.
- RAMA_44201_08011282126028_TURNITIN.pdf — Accepted Version, 11 MB. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
- RAMA_44201_08011282126028_0011127702_0026017102_01_front_ref.pdf — Accepted Version, 2 MB. Available under License Creative Commons Public Domain Dedication.
- RAMA_44201_08011282126028_0011127702_0026017102_02.pdf — Accepted Version, 527 kB. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
- RAMA_44201_08011282126028_0011127702_0026017102_03.pdf — Accepted Version, 527 kB. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
- RAMA_44201_08011282126028_0011127702_0026017102_04.pdf — Accepted Version, 3 MB. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
- RAMA_44201_08011282126028_0011127702_0026017102_05.pdf — Accepted Version, 154 kB. Restricted to Repository staff only.
- RAMA_44201_08011282126028_0011127702_0026017102_06_ref.pdf — Bibliography, 452 kB. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
- RAMA_44201_08011282126028_0011127702_0026017102_07_lamp.pdf — Accepted Version, 124 kB. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.

Abstract

Multi-label hate speech can be assigned more than one label at once, such as individual, religious, or racial hate speech, with differing levels of severity. In the multi-label setting, the terms class and label describe the categories in the data: a label refers to a type of hate speech found in the data, while a class refers to a combination of those labels. One labeling method that can be used to determine the classes in the data is the label powerset. Because multi-label hate speech spreads quickly and widely, automatic early detection is needed.

This research combines a classification architecture with augmentation techniques. The classification architecture uses Long Short-Term Memory (LSTM) to process sequences of important information and a Transformer to capture context globally. Because the combined architecture requires a large amount of data, the data are augmented using the back translation method and Bidirectional Encoder Representations from Transformers (BERT), which increased the amount of data up to threefold.

The combined architecture performs well. An accuracy of 86% shows that the model predicts classes correctly overall; a precision of 86% shows that it identifies positive classes with a low error rate; a recall of 85% shows that it detects most of the positive classes in the data; and an F1-score of 85% shows that it classifies positive classes consistently. Per class, the model performed very well on the H0 and H26 classes, with accuracy, precision, recall, and F1-score each above 90%, while it still struggled to classify the H17, H23, and H31 classes.
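As an illustration of the label powerset scheme described above — not code from the thesis — the sketch below maps each unique combination of hate-speech labels to a single class id, turning a multi-label problem into an ordinary multi-class one:

```python
from itertools import count

def label_powerset(label_sets):
    """Map each unique combination of labels to one class id.

    label_sets: list of per-sample label collections, e.g.
    [{"individual"}, {"religious", "racial"}, {"individual"}].
    Returns (class_ids, class_map); class_map records which label
    combination each class id stands for, so predictions can be
    mapped back to labels after classification.
    """
    next_id = count()
    class_map = {}   # frozen label combination -> class id
    class_ids = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_map:
            class_map[key] = next(next_id)
        class_ids.append(class_map[key])
    return class_ids, class_map

# Hypothetical label names for illustration; samples sharing the
# same label combination receive the same class id.
samples = [{"individual"}, {"religious", "racial"}, {"individual"}, set()]
ids, mapping = label_powerset(samples)
print(ids)  # three distinct combinations -> three class ids
```

Because every distinct label combination becomes its own class, this scheme captures label co-occurrence at the cost of many sparse classes — which is consistent with the per-class imbalance noted in the results.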
This research shows that the combination of LSTM and Transformer architectures, together with back translation and BERT augmentation, can be used for multi-label hate speech classification. Future research should address the balance of data between classes so that the model's performance is more optimal.
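Back translation augments data by translating a sentence into a pivot language and back, keeping round trips that return a paraphrase rather than the original. The sketch below shows only the orchestration, with stub string functions standing in for real machine-translation models (the abstract does not name the models or pivot language used, so these are assumptions):

```python
def back_translate(text, to_pivot, to_source):
    """Round-trip a sentence through a pivot language.

    to_pivot / to_source are any callables str -> str; in practice
    they would wrap a machine-translation model. The round trip
    usually yields a paraphrase, which serves as augmented data.
    """
    return to_source(to_pivot(text))

def augment(corpus, to_pivot, to_source):
    """Keep the originals and add one back-translated copy of each,
    skipping round trips that return the sentence unchanged."""
    augmented = list(corpus)
    for text in corpus:
        paraphrase = back_translate(text, to_pivot, to_source)
        if paraphrase != text:
            augmented.append(paraphrase)
    return augmented

# Stub "translators" for demonstration only: a real pipeline would
# call MT models here (e.g. Indonesian -> English -> Indonesian).
to_pivot = str.upper
to_source = str.title

corpus = ["dia membenci kelompok itu"]
print(augment(corpus, to_pivot, to_source))
```

Combining this with a second augmenter (e.g. BERT masked-token substitution, as the abstract describes) is what can multiply the dataset severalfold: each source sentence contributes the original plus one paraphrase per augmentation method.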

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: hate speech, back translation, Bidirectional Encoder Representations from Transformers (BERT), Long Short-Term Memory (LSTM), Transformer
Subjects: Q Science > QA Mathematics > QA299.6-433 Analysis > Q334.A755 Artificial intelligence. Computational linguistics. Computer science.
Divisions: 08-Faculty of Mathematics and Natural Science > 44201-Mathematics (S1)
Depositing User: Putri Pratiwi
Date Deposited: 21 Mar 2025 06:28
Last Modified: 21 Mar 2025 06:28
URI: http://repository.unsri.ac.id/id/eprint/169737
