Benchmarking Hadoop and Spark

Project ID
FG-540
Project Categories
Computer Science
Project Keywords
Completed
Abstract
Apache Spark is a framework for cluster computing. Its in-memory processing is reported to run some workloads up to 100 times faster than Hadoop MapReduce. Spark is installed natively on a Raspberry Pi cluster with one master and four workers. The main goal is to create a Spark cluster and configure it in order to compare performance, ease of deployment, flexibility, and scalability across several platforms: direct installation on the Raspberry Pi cluster, Docker, and FutureSystems Echo.
Use of FutureSystems
We need to run the Spark program on the FutureSystems Echo system and compare its run times with those from the other platforms to analyze performance.
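A minimal sketch of how such a run-time comparison could be recorded is shown here in Python. It is an illustration under stated assumptions, not the project's actual harness: the helper names `time_runs` and `compare_platforms`, the platform labels, and the placeholder workload are all hypothetical, and the placeholder would be replaced by whatever launches the real Spark job on each platform.

```python
import statistics
import time


def time_runs(workload, repeats=3):
    """Run `workload` `repeats` times; return the wall-clock seconds of each run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def compare_platforms(timings, baseline):
    """Return each platform's mean run time and its ratio to the baseline mean."""
    means = {name: statistics.mean(runs) for name, runs in timings.items()}
    base = means[baseline]
    return {
        name: {"mean_s": mean, "ratio_to_baseline": mean / base}
        for name, mean in means.items()
    }


def placeholder_workload():
    # Hypothetical stand-in for submitting the real Spark job.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    # Assumption: in practice each list is collected on its own platform;
    # here all three are measured locally only to show the data shape.
    timings = {
        "raspberry_pi": time_runs(placeholder_workload),
        "docker": time_runs(placeholder_workload),
        "echo": time_runs(placeholder_workload),
    }
    for name, stats in compare_platforms(timings, baseline="raspberry_pi").items():
        print(f"{name}: mean {stats['mean_s']:.4f} s, "
              f"{stats['ratio_to_baseline']:.2f}x of baseline")
```

Repeating each run and reporting the mean reduces the influence of one-off effects such as a cold start. A ratio below 1.0 means the platform finished faster than the baseline.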
Scale of Use
I need about 7 days to run the proposed project and analyze the run times on the Echo system.