TRANSFORMER-BASED CLINICAL NAMED ENTITY RECOGNITION MODEL FOR BIOMEDICAL DATA

Putri, Indah Gala and Tutuko, Bambang and Firdaus, Firdaus (2025) Transformer-Based Clinical Named Entity Recognition Model for Biomedical Data. Undergraduate thesis, Sriwijaya University.


Abstract

Clinical Named Entity Recognition (CNER) is a critical task in natural language processing (NLP) aimed at extracting medical entities from complex biomedical texts. Its main challenges lie in complex sentence structures and highly variable medical terminology. This study develops and evaluates CNER models based on the Transformer architecture, specifically BERT, to improve the recognition of medical entities in biomedical data. Two BERT-Base models were developed: EMR-BERT and PubMed2M-BERT. EMR-BERT is a customized model with eight encoder layers trained directly through fine-tuning. PubMed2M-BERT, in contrast, was obtained by continued pre-training of BERT-Base Uncased on the ViPubMed biomedical corpus using the Masked Language Modeling (MLM) objective without Next Sentence Prediction (NSP). Pre-training converged to a perplexity of 2.964 with a stable loss curve. After fine-tuning, PubMed2M-BERT achieved the highest F1-score of 92% on the NCBI-disease dataset, outperforming EMR-BERT at 87%. These findings demonstrate that domain-specific pre-training can substantially improve the performance of Transformer models on CNER tasks over biomedical data.
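The MLM objective and the perplexity figure mentioned above can be sketched briefly. The snippet below is a minimal, self-contained illustration (not code from the thesis): it shows BERT-style dynamic masking, in which roughly 15% of tokens are selected and corrupted under the standard 80/10/10 rule, and how a reported perplexity relates to the mean cross-entropy loss. The token list, vocabulary, and function names are illustrative assumptions.

```python
import math
import random

MASK = "[MASK]"
VOCAB = ["fever", "cough", "diabetes", "insulin", "the", "of"]  # toy vocabulary

def mlm_mask(tokens, mask_prob=0.15, rng=None):
    """BERT-style dynamic masking. Each token is selected with
    probability mask_prob; of the selected tokens, 80% are replaced
    by [MASK], 10% by a random vocabulary token, and 10% are kept.
    Returns (corrupted tokens, labels), where labels hold the
    original token at selected positions and None elsewhere."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)          # model must predict the original
            r = rng.random()
            if r < 0.8:
                corrupted.append(MASK)
            elif r < 0.9:
                corrupted.append(rng.choice(VOCAB))
            else:
                corrupted.append(tok)   # kept unchanged, still predicted
        else:
            labels.append(None)         # position ignored by the MLM loss
            corrupted.append(tok)
    return corrupted, labels

def perplexity(mean_loss):
    """Perplexity is the exponential of the mean cross-entropy loss,
    so the reported perplexity of 2.964 corresponds to a loss of
    ln(2.964) ≈ 1.086."""
    return math.exp(mean_loss)
```

Only the positions with a non-None label contribute to the MLM loss; the reported perplexity of 2.964 is simply the exponential of that averaged loss.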

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Clinical Named Entity Recognition, Transformer, Pre-Training, Fine-Tuning, Biomedical
Subjects: Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation.
Divisions: 09-Faculty of Computer Science > 56201-Computer Systems (S1)
Depositing User: Indah Gala Putri
Date Deposited: 20 Jun 2025 04:16
Last Modified: 20 Jun 2025 04:16
URI: http://repository.unsri.ac.id/id/eprint/175807
