Performance Evaluation of HDFS and FusionFS

Project Details

Project Lead
Jayalakshmi Doddamane 
Project Manager
Jayalakshmi Doddamane 
Project Members
Sumathi Kurlageri
Institution
M S Ramaiah Institute of Technology, Department of Computer Science & Engineering  
Discipline
Computer Science (401) 

Abstract

The amount of data is increasing explosively in various fields of the arts, science, and engineering, and computing over such large data sets has become a major challenge. To support this computation, new infrastructures have emerged, most of which rely on a distributed computing environment that processes data in parallel across the nodes of a large distributed system.
Distributed computing environments handle large-scale data in many applications and can provide better performance. One well-known solution is the use of distributed file systems (DFSs). A DFS allows users of physically distributed computers to share data and storage resources through a common file system. Of the many DFSs available, the most widely used is HDFS (Hadoop Distributed File System). The performance of HDFS has been compared with that of other distributed file systems, but not with newer DFSs such as FusionFS. In addition, MapReduce is implemented on HDFS but not on FusionFS. The main goal of this project is therefore to run MapReduce on FusionFS and to evaluate the performance of HDFS and FusionFS across various parameters.

Intellectual Merit

The project provides a comparative evaluation of HDFS versus FusionFS with respect to MapReduce applications. The project is being carried out by a Master's student who will use FutureSystems to set up the test environment.

Broader Impacts

This project is part of a research programme evaluating different distributed file systems for data-intensive cloud applications.

Scale of Use

A few VMs for the experimental setup of a Hadoop cluster.