PENGGUNAAN ALGORITMA DOUBLE METAPHONE UNTUK MENGOREKSI EJAAN KATA BERBAHASA INDONESIA (Using the Double Metaphone Algorithm to Correct the Spelling of Indonesian Words)

DAELI, ROBERTO DERMAN and Yusliani, Novi and Miraswan, Kanda Januar (2020) PENGGUNAAN ALGORITMA DOUBLE METAPHONE UNTUK MENGOREKSI EJAAN KATA BERBAHASA INDONESIA. Undergraduate thesis, Sriwijaya University.

Abstract

Spelling errors are common in Indonesian documents and texts; one reason is that the writer does not know the correct spelling of a word. This research uses phonetic string matching with the Double Metaphone algorithm for spelling correction. Before string matching, the text is pre-processed by tokenizing and case folding. The Double Metaphone stage then transforms each pre-processed word into a primary and a secondary code and performs string matching on the misspelled word to obtain a set of word suggestions. In the last stage, the suggested words are weighted by Dice similarity, and the suggestion with the highest similarity is selected. The research uses a dictionary in which each entry is accompanied by its Double Metaphone code, together with 400 test words covering four typo categories: addition, deletion, substitution, and transposition. In testing, the software achieved an accuracy of 81.25%.
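
The pre-processing and ranking stages described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation. It assumes that Dice similarity is computed over character bigrams, which the abstract does not specify, and it leaves out the Double Metaphone encoding itself, so the candidate words are simply passed in as if the phonetic matching stage had already produced them. The function names preprocess, dice_similarity and best_suggestion are hypothetical.

```python
import re
from collections import Counter


def preprocess(text):
    """Case folding (lowercasing) followed by tokenizing on alphabetic runs."""
    return re.findall(r"[a-z]+", text.lower())


def bigrams(word):
    """Return the character bigrams of a word, in order."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def dice_similarity(a, b):
    """Dice coefficient 2*|X & Y| / (|X| + |Y|) over character bigrams.

    Assumption: bigrams are the unit of comparison; the abstract only
    names "dice similarity" without giving the unit.
    """
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Both words are too short to have bigrams; fall back to equality.
        return 1.0 if a == b else 0.0
    # Multiset intersection so repeated bigrams are counted correctly.
    overlap = sum((Counter(x) & Counter(y)).values())
    return 2 * overlap / (len(x) + len(y))


def best_suggestion(typo, candidates):
    """Pick the candidate with the highest Dice similarity to the typo."""
    return max(candidates, key=lambda word: dice_similarity(typo, word))


# Hypothetical example: "mkan" is "makan" with one letter deleted.
tokens = preprocess("Saya MKAN nasi.")
suggestion = best_suggestion(tokens[1], ["makan", "makin", "minum"])
```

In this example the deletion typo "mkan" shares two bigrams with "makan" and none with the other two candidates, so "makan" is ranked first. In the workflow the abstract describes, the candidate list would come from matching Double Metaphone codes against the dictionary rather than being supplied by hand.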

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: spelling correction, string matching, phonetic string matching, double metaphone algorithm, pre-processing, case folding, tokenizing, and dice similarity.
Subjects: P Language and Literature > P Philology. Linguistics > P98-98.5 Computational linguistics. Natural language processing
Divisions: 09-Faculty of Computer Science > 55201-Informatics (S1)
Date Deposited: 19 Aug 2020 07:59
Last Modified: 19 Aug 2020 07:59
URI: http://repository.unsri.ac.id/id/eprint/33367
