NAMED ENTITY RECOGNITION MENGGUNAKAN PEMBOBOTAN TERM FREQUENCY - INVERSE DOCUMENT FREQUENCY DAN SUPPORT VECTOR MACHINES

WIARKA, SEPTRI PUTRA and Abdiansah, Abdiansah and Yusliani, Novi (2021) NAMED ENTITY RECOGNITION MENGGUNAKAN PEMBOBOTAN TERM FREQUENCY - INVERSE DOCUMENT FREQUENCY DAN SUPPORT VECTOR MACHINES. Undergraduate thesis, Sriwijaya University.

RAMA_55201_09021381621129.pdf - Accepted Version
Restricted to Repository staff only
Available under License Creative Commons Public Domain Dedication.


Abstract

Information sources are now diverse, which makes information easier to obtain. However, the key information in media text is unstructured, so it must be read carefully to avoid misinterpretation. A tool for extracting information is therefore needed. Named Entity Recognition (NER) is an information extraction technique that identifies predefined entity types in a sentence. In this study, NER is applied to Indonesian-language news text using term frequency - inverse document frequency (TF-IDF) weighting and support vector machines (SVM) for five named entity classes: names of people (PER), names of organizations (ORG), names of locations (LOC), adverbs of time (TIME), and other entities (OTH). Entities are recognized using the TF-IDF weight and the Part of Speech Tagging (POSTag) weight of each word as features. Testing was carried out on Indonesian-language news text totaling 1773 words, and the accuracy for the OTH, ORG, TIME, PER, and LOC entities was 60.55%, 59.85%, 53.12%, 33.01%, and 4.35%, respectively.
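
The feature construction described in the abstract (a TF-IDF weight plus a POS-tag weight per word) can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant, POS tagset, or tag weights the thesis uses, so everything below is an assumption for illustration: the toy sentences, the tag names, the numeric tag codes, and the formula tf(t, d) * log(N / df(t)) with tf taken as the relative frequency of the term in its document.

```python
import math
from collections import Counter

# Toy tokenized documents (hypothetical data, not taken from the thesis).
docs = [
    ["presiden", "jokowi", "tiba", "di", "palembang"],
    ["jokowi", "bertemu", "gubernur", "di", "jakarta"],
    ["gubernur", "tiba", "kemarin"],
]

# Hypothetical POS tags and numeric codes; the thesis's actual tagset
# and POS-tag weights are not given in the abstract.
pos_code = {"NN": 1, "NNP": 2, "VB": 3, "IN": 4}
pos_tags = {
    "presiden": "NN", "jokowi": "NNP", "tiba": "VB", "di": "IN",
    "palembang": "NNP", "bertemu": "VB", "gubernur": "NN",
    "jakarta": "NNP", "kemarin": "NN",
}


def idf(term, docs):
    """Inverse document frequency: log(N / df), or 0.0 for unseen terms."""
    df = sum(1 for d in docs if term in d)
    if df == 0:
        return 0.0
    return math.log(len(docs) / df)


def tf_idf(term, doc, docs):
    """Relative term frequency in `doc` multiplied by the term's IDF."""
    tf = Counter(doc)[term] / len(doc)
    return tf * idf(term, docs)


def features(doc, docs):
    """One (word, tf-idf weight, POS code) triple per word in `doc`."""
    return [(w, tf_idf(w, doc, docs), pos_code[pos_tags[w]]) for w in doc]


if __name__ == "__main__":
    for word, weight, pos in features(docs[0], docs):
        print(f"{word:10s} tfidf={weight:.4f} pos={pos}")
```

Each word's (TF-IDF weight, POS code) pair could then serve as the input vector to an SVM classifier that predicts one of PER, ORG, LOC, TIME, or OTH. The classifier itself is omitted here because the abstract gives no details about the kernel or training setup.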

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: named entity recognition (NER), information extraction, weighting term frequency - inverse document frequency (TF-IDF), part of speech tagging (POS-Tag), support vector machines (SVM)
Subjects: P Language and Literature > P Philology. Linguistics > P98-98.5 Computational linguistics. Natural language processing
T Technology > TK Electrical engineering. Electronics Nuclear engineering > TK1-9971 Electrical engineering. Electronics. Nuclear engineering > TK1 Electrical engineering--Periodicals. Automatic control--Periodicals. Computer science--Periodicals. Information technology--Periodicals. Automatic control. Computer science. Electrical engineering. Information technology.
Divisions: 09-Faculty of Computer Science > 55201-Informatics (S1)
Depositing User: Septri Putra Wiarka
Date Deposited: 28 Jul 2021 07:54
Last Modified: 28 Jul 2021 07:55
URI: http://repository.unsri.ac.id/id/eprint/50784
