THE EFFECT OF QUERY EXPANSION ON TEXT SIMILARITY DETECTION USING COSINE SIMILARITY

Sari, Pipit Kurnia and Yusliani, Novi and Yunita, Yunita (2019) PENGARUH QUERY EXPANSION TERHADAP PENDETEKSIAN KEMIRIPAN TEKS MENGGUNAKAN COSINE SIMILARITY [The Effect of Query Expansion on Text Similarity Detection Using Cosine Similarity]. Undergraduate thesis, Sriwijaya University.

Abstract

Cosine similarity measures text similarity by matching identical words: a word in the test text counts only if the same word appears in the source text. A word in the test text that has no exact match in the source word list is therefore not counted. This research examines the effect of query expansion using a thesaurus, one technique for improving the effectiveness of word-list matching. The cosine similarity algorithm was tested both with and without query expansion on 7 source documents and 21 comparison documents. According to the evaluation, cosine similarity with query expansion improves text similarity detection compared with cosine similarity without query expansion. The reported percentage values are 46.90% without query expansion and, with query expansion, 43.11% for window size 2, 42.90% for window size 3, and 42.59% for window size 4. Although query expansion increases overall computing time, the terms it adds improve text similarity detection.

Keywords: Expansion Query, Thesaurus, Cosine Similarity, Window Size.
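
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy `THESAURUS`, the English example sentences, and the helper names `tokenize`, `expand`, and `cosine_similarity` are all assumptions made for illustration. The abstract does not define how window size is applied, so it is left out here.

```python
import math
import re
from collections import Counter

# Hypothetical toy thesaurus; the thesaurus used in the thesis is not
# described in the abstract.
THESAURUS = {
    "car": ["automobile"],
    "fast": ["quick"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def expand(tokens, thesaurus):
    """Return the tokens plus every thesaurus synonym of each token."""
    expanded = list(tokens)
    for token in tokens:
        expanded.extend(thesaurus.get(token, []))
    return expanded


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two raw term-frequency vectors."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    # Only terms present in both texts contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


source = tokenize("the automobile is quick")
test = tokenize("the car is fast")

# Without expansion, "car" and "fast" match nothing in the source text.
plain = cosine_similarity(test, source)

# With expansion, their synonyms "automobile" and "quick" now match.
expanded = cosine_similarity(expand(test, THESAURUS), source)

print(f"without query expansion: {plain:.4f}")
print(f"with query expansion:    {expanded:.4f}")
```

In this toy case the expanded test text shares more terms with the source, so its score is higher. The extra thesaurus lookups and longer term vectors also show why expansion adds computing time.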

Item Type: Thesis (Undergraduate)
Uncontrolled Keywords: Expansion Query, Thesaurus, Cosine Similarity, Window Size.
Subjects: P Language and Literature > P Philology. Linguistics > P98-98.5 Computational linguistics. Natural language processing
Divisions: Faculty of Computer Science > 55201-Informatics (S1)
Depositing User: Pipit Kurnia Sari
Date Deposited: 26 Sep 2019 03:30
Last Modified: 26 Sep 2019 03:30
URI: http://repository.unsri.ac.id/id/eprint/8925
