TRANSFORMER-BASED CLINICAL NAMED ENTITY RECOGNITION MODEL FOR BIOMEDICAL DATA

PUTRI, INDAH GALA and Tutuko, Bambang and Firdaus, Firdaus (2025) CLINICAL NAMED ENTITY RECOGNITION MODEL BERBASIS TRANSFORMER UNTUK DATA BIOMEDIS. Undergraduate thesis, Sriwijaya University.

Files:
	Image RAMA_56201_09011182126033_cover.jpg - Cover Image. Available under License Creative Commons Public Domain Dedication. (266kB)
	Text RAMA_56201_09011182126033.pdf - Accepted Version. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication. (9MB)
	Text RAMA_56201_09011182126033_TURNITIN.pdf - Accepted Version. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication. (10MB)
	Text RAMA_56201_09011182126033_0012016003_0221017801_01_front_ref.pdf - Accepted Version. Available under License Creative Commons Public Domain Dedication. (754kB)
	Text RAMA_56201_09011182126033_0012016003_0221017801_02.pdf - Accepted Version. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication. (1MB)
	Text RAMA_56201_09011182126033_0012016003_0221017801_03.pdf - Accepted Version. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication. (2MB)
	Text RAMA_56201_09011182126033_0012016003_0221017801_04.pdf - Accepted Version. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication. (4MB)
	Text RAMA_56201_09011182126033_0012016003_0221017801_05.pdf - Accepted Version. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication. (206kB)
	Text RAMA_56201_09011182126033_0012016003_0221017801_06_ref.pdf - Bibliography. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication. (341kB)
	Text RAMA_56201_09011182126033_0012016003_0221017801_07_lamp.pdf - Accepted Version. Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication. (1MB)

Abstract

Clinical Named Entity Recognition (CNER) is a core natural language processing (NLP) task that extracts medical entities from biomedical text. Its main challenges are complex sentence structure and highly variable medical terminology. This study develops and evaluates Transformer-based CNER models, specifically BERT, to improve the accuracy of medical entity recognition in biomedical data. Two BERT-Base models were developed: EMR-BERT and PubMed2M-BERT. EMR-BERT is a customized model with eight encoder layers that is trained directly through fine-tuning. In contrast, PubMed2M-BERT is obtained by continued pre-training of BERT-Base Uncased on the ViPubMed biomedical corpus, using the Masked Language Modeling (MLM) objective without Next Sentence Prediction (NSP). Pre-training reached a perplexity of 2.964 with a stable loss curve. After fine-tuning, PubMed2M-BERT achieved the highest F1-score, 92%, on the NCBI-disease dataset, outperforming EMR-BERT, which achieved 87%. These findings indicate that domain-specific pre-training can substantially improve the performance of Transformer models on CNER tasks over biomedical data.
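
To put the reported perplexity in context, the sketch below converts between perplexity and mean cross-entropy loss. It assumes the common convention that MLM perplexity is the exponential of the mean natural-log cross-entropy over masked tokens; the abstract does not state how the thesis computed it, and the function names are hypothetical.

```python
import math


def perplexity_from_loss(mean_ce_loss: float) -> float:
    """Perplexity as exp(mean cross-entropy), with the loss in nats.

    Assumption: this is the usual convention for MLM evaluation,
    not necessarily the exact procedure used in the thesis.
    """
    return math.exp(mean_ce_loss)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse of perplexity_from_loss: mean cross-entropy in nats."""
    return math.log(perplexity)


# Perplexity value reported in the abstract.
reported_perplexity = 2.964

# Mean masked-token loss implied by that value under the assumption above.
implied_loss = loss_from_perplexity(reported_perplexity)
print(f"implied mean MLM loss: {implied_loss:.3f} nats")
```

Under this convention, a perplexity near 3 means the model is on average about as uncertain as a uniform choice among roughly three candidate tokens at each masked position.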

Item Type:	Thesis (Undergraduate)
Uncontrolled Keywords:	Clinical Named Entity Recognition, Transformer, Pre-Training, Fine-Tuning, Biomedical
Subjects:	Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation.
Divisions:	09-Faculty of Computer Science > 56201-Computer Systems (S1)
Depositing User:	Indah Gala Putri
Date Deposited:	20 Jun 2025 04:16
Last Modified:	20 Jun 2025 04:16
URI:	http://repository.unsri.ac.id/id/eprint/175807
