ZAINUDIN, ZAINUDIN and Heryanto, Ahmad (2023) ANALISA BIG DATA PADA CLUSTER KOMPUTER MENGGUNAKAN KOMPUTASI TERDISTRIBUSI [Big Data Analysis on a Computer Cluster Using Distributed Computing]. Undergraduate thesis, Sriwijaya University.
Text: RAMA_56201_09011181924004.pdf - Accepted Version (4MB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Text: RAMA_56201_09011181924004_TURNITIN.pdf - Accepted Version (5MB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Text: RAMA_56201_09011181924004_0022018703_01_front_ref.pdf - Accepted Version (1MB). Available under License Creative Commons Public Domain Dedication.
Text: RAMA_56201_09011181924004_0022018703_02.pdf - Accepted Version (598kB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Text: RAMA_56201_09011181924004_0022018703_03.pdf - Accepted Version (632kB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Text: RAMA_56201_09011181924004_0022018703_04.pdf - Accepted Version (2MB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Text: RAMA_56201_09011181924004_0022018703_05.pdf - Accepted Version (10kB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Text: RAMA_56201_09011181924004_0022018703_06_ref.pdf - Bibliography (245kB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Text: RAMA_56201_09011181924004_0022018703_07_lamp.pdf - Accepted Version (538kB). Restricted to Repository staff only. Available under License Creative Commons Public Domain Dedication.
Abstract
With globalization, technology has spread widely across industrial sectors, and data now accumulates so quickly that it grows into large-scale data known as big data. The large volume and complexity of big data make optimization problems harder to formulate, so a parallel and distributed computer cluster architecture is needed. Several methods support parallel and distributed data processing, such as MPI (Message Passing Interface), OpenMP (Open Multi-Processing), Hadoop, and Spark. In the big data setting, data structures also become more complex, higher-dimensional, and larger. This study uses the parallelization provided by the Apache Spark framework to run a distributed computer cluster for big data processing. The results show that the distributed cluster on Spark reads big data effectively: in a word count experiment on 31,788,324 rows of data, Spark was faster, with a time difference of 84.6 seconds. Spark's machine learning library, MLlib, was then used for more advanced big data processing in machine learning classification and recommendation system experiments. Of the 6 classification models used, the best achieved an accuracy of 94.95%, an F1-score of 95%, a recall of 95.18%, and a precision of 94.77%, while the recommendation system using the ALS (Alternating Least Squares) algorithm achieved an RMSE of 0.46 from 5 experiments with different tuning parameters.
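The word count experiment mentioned in the abstract follows a map-and-reduce pattern: each worker counts words in its own share of the data, and the partial counts are then merged. The sketch below illustrates that pattern in plain Python on a single machine. It is not the thesis code and does not use the Spark API; the function names, the round-robin partitioning, and the sample lines are all assumptions made for illustration.

```python
from collections import Counter
from functools import reduce


def split_into_partitions(lines, num_partitions):
    """Spread lines round-robin over partitions, as a cluster spreads data over workers."""
    return [lines[i::num_partitions] for i in range(num_partitions)]


def count_partition(partition):
    """Map step: a (hypothetical) worker counts the words in its own partition."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def word_count(lines, num_partitions=3):
    partitions = split_into_partitions(lines, num_partitions)
    # On a real cluster these partial counts would be computed in parallel;
    # here they are computed one after another.
    partial_counts = [count_partition(p) for p in partitions]
    return reduce(merge_counts, partial_counts, Counter())


# Hypothetical sample input, not data from the thesis.
sample_lines = [
    "big data on a computer cluster",
    "distributed computing for big data",
    "spark reads big data",
]
totals = word_count(sample_lines)
```

Because the merge step is associative and commutative, the result does not depend on how many partitions are used, which is what allows a framework to distribute the counting across nodes.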
Item Type: | Thesis (Undergraduate) |
---|---|
Uncontrolled Keywords: | Distributed Computing, Big Data, Computer Cluster, Apache Spark |
Subjects: | Q Science > Q Science (General) > Q300-390 Cybernetics > Q325.5 Machine learning |
Divisions: | 09-Faculty of Computer Science > 56201-Computer Systems (S1) |
Depositing User: | Zainudin Zainudin |
Date Deposited: | 22 Nov 2023 07:04 |
Last Modified: | 22 Nov 2023 07:04 |
URI: | http://repository.unsri.ac.id/id/eprint/130876 |