GARCIA, LOUIS and Abdiansah, Abdiansah (2025) SISTEM TANYA JAWAB DETEKSI KANKER DINI MENGGUNAKAN METODE BERT DAN TF-IDF [Early Cancer Detection Question-Answering System Using the BERT and TF-IDF Methods]. Undergraduate thesis, Sriwijaya University.
Files (all available under the Creative Commons Public Domain Dedication license):
- RAMA_55201_09021182126006_Cover.jpg: Image, Accepted Version (476kB)
- RAMA_55201_09021182126006.pdf: Text, Accepted Version, restricted to repository staff only (4MB)
- RAMA_55201_09021182126006_TURNITIN.pdf: Text, Accepted Version, restricted to repository staff only (7MB)
- RAMA_55201_09021182126006_0001108401_01_front_ref.pdf: Text, Accepted Version (1MB)
- RAMA_55201_09021182126006_0001108401_02.pdf: Text, Accepted Version, restricted to repository staff only (558kB)
- RAMA_55201_09021182126006_0001108401_03.pdf: Text, Accepted Version, restricted to repository staff only (388kB)
- RAMA_55201_09021182126006_0001108401_04.pdf: Text, Accepted Version, restricted to repository staff only (454kB)
- RAMA_55201_09021182126006_0001108401_05.pdf: Text, Submitted Version, restricted to repository staff only (1MB)
- RAMA_55201_09021182126006_0001108401_06_ref.pdf: Text, Bibliography, restricted to repository staff only (223kB)
- RAMA_55201_09021182126006_0001108401_07_lamp.pdf: Text, Accepted Version, restricted to repository staff only (565kB)
Abstract
The rising cancer mortality rate, particularly in developing countries such as Indonesia, underscores the need for an effective question-answering system for early cancer detection. According to Globocan 2020 data, Indonesia recorded 396,914 new cancer cases and 234,511 cancer-related deaths. Riskesdas data further show that the prevalence of cancer in Indonesia increased from 1.4 per 1,000 population in 2013 to 1.79 per 1,000 population in 2018. This study aims to develop a question-answering system for early cancer detection using the BERT (Bidirectional Encoder Representations from Transformers) and TF-IDF (Term Frequency-Inverse Document Frequency) methods. Combining the two methods is expected to improve the system's accuracy in understanding questions about cancer symptoms, diagnosis, and treatment. The system was tested on a Kaggle dataset containing clinical data on various types of cancer, with case folding, stop word removal, stemming, and tokenization applied as preprocessing to improve data quality. In the performance evaluation, the highest accuracy, 98.85%, was achieved with a fine-tuned BERT model. Compared with the BERT-only model (94.70%) and the TF-IDF-only model (96.55%), this result indicates that integrating BERT and TF-IDF is more effective at providing accurate and relevant responses. To validate the system, the study also involved interviews with 10 medical students from Universitas Sriwijaya, class of 2021-2022. Of the 20 questions asked, the system answered 19 correctly, an accuracy of 95%. The findings contribute to the development of artificial intelligence (AI)-based health technology and support early cancer detection efforts in Indonesia by providing an efficient and reliable cancer detection system.
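The TF-IDF side of such a system can be illustrated with a minimal sketch. It is not the thesis's implementation, which is not available on this page: it assumes a retrieval setup in which a user question is matched to the most similar stored question by cosine similarity over TF-IDF vectors. The stop word list, the `FAQ` entries, the placeholder answers, and all function names are hypothetical. Stemming and the BERT component are omitted to keep the example self-contained.

```python
import math
import re
from collections import Counter

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "what", "how",
              "and", "to", "for", "can", "be"}

# Hypothetical stored questions with placeholder answers (not thesis data).
FAQ = [
    ("What are early symptoms of lung cancer?", "placeholder: lung cancer symptoms"),
    ("How is breast cancer diagnosed?", "placeholder: breast cancer diagnosis"),
    ("What treatments exist for skin cancer?", "placeholder: skin cancer treatment"),
]


def preprocess(text):
    """Case folding, tokenization, and stop word removal (stemming omitted)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def vectorize(tokens, idf):
    """Map tokens to a sparse TF-IDF vector; terms unknown to the index are dropped."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def build_index(documents):
    """Return one TF-IDF vector per document, plus the IDF table."""
    tokenized = [preprocess(d) for d in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    # Smoothed IDF, so a term found in every document keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [vectorize(tokens, idf) for tokens in tokenized], idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def answer(question, faq, vectors, idf):
    """Return the answer paired with the most similar stored question, and its score."""
    q_vec = vectorize(preprocess(question), idf)
    scores = [cosine(q_vec, v) for v in vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return faq[best][1], scores[best]


if __name__ == "__main__":
    vectors, idf = build_index([q for q, _ in FAQ])
    print(answer("lung cancer symptoms", FAQ, vectors, idf))
```

In a combined design, a score like this could be used to shortlist candidates before a BERT model re-ranks or classifies them; how the thesis actually combines the two methods is not stated in the abstract.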
Item Type: | Thesis (Undergraduate) |
---|---|
Uncontrolled Keywords: | BERT, TF-IDF, NLP, Model, AI-based, System, Cancer |
Subjects: | T Technology > T Technology (General) > T58.5-58.64 Information technology > T58.5 General works Management information systems Cf. HD30.213 Industrial management Cf. HF5549.5.C6+ Communication in personnel management Cf. TS158.6 Automatic data collection systems (Production control) |
Divisions: | 09-Faculty of Computer Science > 55201-Informatics (S1) |
Depositing User: | Louis Garcia |
Date Deposited: | 14 Mar 2025 07:32 |
Last Modified: | 14 Mar 2025 07:32 |
URI: | http://repository.unsri.ac.id/id/eprint/168832 |