Ermatita, Ermatita (2024) similarity Proposed threshold-based and rule-based approaches to detecting duplicates in bibliographic database. Turnitin Universitas Sriwijaya. (Submitted)
Abstract
Bibliographic databases are used to measure the performance of researchers, universities, and research institutions, so high data quality is required and duplicate records must be avoided. One weakness of the threshold-based approach to duplicate detection is its low accuracy, so another approach is needed to improve detection. This study proposes a method that combines threshold-based and rule-based approaches to detect duplicates. Both approaches are implemented in the comparison stage. The cosine similarity function is used to create weight vectors from the features. A comparison operator is then used to determine whether a pair of records is classified as a duplicate. Three research databases in the Science and Technology Index (SINTA) database are investigated: Web of Science (WoS), Scopus, and Google Scholar (GS). Rule 4 and Rule 5 provide the best performance. For the WoS dataset, the accuracy, precision, recall, and F1-measure values were all 100.00%. For the Scopus dataset, the accuracy and precision values were 100.00%, recall was 98.00%, and the F1-measure was 98.00%. For the GS dataset, the accuracy was 100.00%, precision was 99.00%, recall was 97.00%, and the F1-measure was 98.00%. The proposed method is a potential tool for accurate detection of duplicate records in publication databases.
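The comparison stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define Rule 4 or Rule 5, so the rule below (title and author similarity both at or above a threshold, plus an exact year match) is a hypothetical stand-in, and the feature names `title`, `authors`, `year`, the whitespace tokenization, and the `0.9` threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def weight_vector(r1: dict, r2: dict, features=("title", "authors")) -> dict:
    """One similarity weight per text feature (feature names are assumed)."""
    return {f: cosine_similarity(r1[f], r2[f]) for f in features}


def is_duplicate(r1: dict, r2: dict, threshold: float = 0.9) -> bool:
    """Hypothetical rule: threshold test on each weight AND an exact year match."""
    weights = weight_vector(r1, r2)
    passes_threshold = all(w >= threshold for w in weights.values())
    return passes_threshold and r1["year"] == r2["year"]


# Illustrative records only; they are not taken from the study's datasets.
a = {"title": "Duplicate detection in bibliographic databases",
     "authors": "A. Author", "year": 2024}
b = {"title": "duplicate detection in bibliographic databases",
     "authors": "a. author", "year": 2024}
c = {"title": "Deep learning for image segmentation",
     "authors": "B. Writer", "year": 2024}

print(is_duplicate(a, b))
print(is_duplicate(a, c))
```

Combining the two tests in one predicate shows the general idea: the threshold handles near-identical text, while the rule adds an exact-match condition that a similarity score alone cannot enforce.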
Item Type: | Other |
---|---|
Subjects: | #3 Repository of Lecturer Academic Credit Systems (TPAK) > Results of Ithenticate Plagiarism and Similarity Checker |
Divisions: | 09-Faculty of Computer Science > 55101-Informatics (S2) |
Depositing User: | Dr Ermatita zuhairi |
Date Deposited: | 25 Jun 2024 05:47 |
Last Modified: | 25 Jun 2024 05:47 |
URI: | http://repository.unsri.ac.id/id/eprint/147559 |