CLINICAL NAMED ENTITY RECOGNITION ON BIOMEDICAL DATA USING PRE-TRAINED WORD EMBEDDINGS AND DEEP LEARNING

APRIADI, MUHAMMAD AZRIEL and Firdaus, Firdaus (2025) CLINICAL NAMED ENTITY RECOGNITION PADA DATA BIOMEDIS MENGGUNAKAN PRE-TRAINED WORD EMBEDDINGS DAN DEEP LEARNING. Undergraduate thesis, Sriwijaya University.

Files (each available under License Creative Commons Public Domain Dedication):

RAMA_56201_09011282126078_cover.jpg - Cover Image (241kB)
RAMA_56201_09011282126078.pdf - Accepted Version (7MB), restricted to Repository staff only
RAMA_56201_09011282126078_TURNITIN.pdf - Accepted Version (9MB), restricted to Repository staff only
RAMA_56201_09011282126078_0221017801_01_front_ref.pdf - Accepted Version (586kB)
RAMA_56201_09011282126078_0221017801_02.pdf - Accepted Version (820kB), restricted to Repository staff only
RAMA_56201_09011282126078_0221017801_03.pdf - Accepted Version (1MB), restricted to Repository staff only
RAMA_56201_09011282126078_0221017801_04.pdf - Accepted Version (4MB), restricted to Repository staff only
RAMA_56201_09011282126078_0221017801_05.pdf - Accepted Version (215kB), restricted to Repository staff only
RAMA_56201_09011282126078_0221017801_06_ref.pdf - Bibliography (195kB), restricted to Repository staff only
RAMA_56201_09011282126078_0221017801_07_lamp.pdf - Accepted Version (404kB), restricted to Repository staff only

Abstract

The rapid growth of digital biomedical data poses significant challenges for managing and extracting information from unstructured medical texts. This study develops and evaluates a Clinical Named Entity Recognition (CNER) model that combines pre-trained word embeddings with deep learning architectures. Three biomedical datasets were used: JNLPBA, NCBI-Disease, and BC2GM. The experiments were conducted in two stages: the first compared the GloVe-BiLSTM, ELMo-BiLSTM, and BERT-BiLSTM combinations; the second evaluated the BERT-BiLSTM and PubMed2MBERT-BiLSTM models using fine-tuning and early stopping. Evaluation using macro-averaged precision, recall, and F1-score shows that contextual embeddings consistently outperform static embeddings, with GloVe yielding the lowest performance. Transformer-based models such as BERT and PubMed2MBERT outperform ELMo, which the study attributes to their self-attention mechanism capturing token relationships more effectively. PubMed2MBERT-BiLSTM, pre-trained on biomedical-domain text, achieved the best performance across all datasets, highlighting the effectiveness of domain-specific models for medical entity recognition.
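
As a rough illustration of the macro-averaged metrics named in the abstract, the sketch below computes per-label precision, recall, and F1 and then takes their unweighted mean. The abstract does not say whether the thesis scores at token level or entity level, or which labels it includes, so the token-level scoring, the exclusion of the "O" label, the function name macro_prf, and the example tags are all assumptions made here for illustration.

```python
from collections import Counter


def macro_prf(gold, pred, ignore=("O",)):
    """Token-level macro-averaged precision, recall, and F1.

    Assumption: labels listed in `ignore` (by default the outside tag "O")
    are left out of the average. The thesis may score differently.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed the gold label g

    labels = sorted((set(gold) | set(pred)) - set(ignore))
    if not labels:
        return 0.0, 0.0, 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    # Macro average: every label counts equally, regardless of frequency.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical tags in the BIO scheme, not taken from the thesis data.
gold = ["B-Disease", "I-Disease", "O", "B-Disease", "O"]
pred = ["B-Disease", "O", "O", "B-Disease", "B-Disease"]
precision, recall, f1 = macro_prf(gold, pred)
print(precision, recall, f1)
```

Because each label contributes equally to a macro average, rare entity types weigh as much as frequent ones, so a model that misses an infrequent class entirely is penalized noticeably.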

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Clinical Named Entity Recognition, Deep Learning, Pre-trained Word Embeddings, Biomedical Text, GloVe, ELMo, BERT, PubMed2MBERT, Transformer, BiLSTM
Subjects: Q Science > Q Science (General) > Q334-342 Computer science. Artificial intelligence. Algorithms. Robotics. Automation.
Divisions: 09-Faculty of Computer Science > 56201-Computer Systems (S1)
Depositing User: Muhammad Azriel Apriadi
Date Deposited: 20 Jun 2025 04:20
Last Modified: 20 Jun 2025 04:20
URI: http://repository.unsri.ac.id/id/eprint/175811
