Course: P434 MapReduce Class Project

Project Details

Project Lead
Scott Jensen 
Project Manager
Jingya Wang 
Project Members
Jingya Wang, Zachary Owens, John Bickel, Grant Redfield, Raj Xavier, Jacob Thompson, Evan Boggs, Jonathan Imlay, Jeremy Falkmann, Devon Thomas, Bryce Hughes, Cj Zhu, Paul Vasich, Jacob Read, Joe Maguire, Ryan Fitzpatrick  
Institution
Indiana University, Computer Science  
Discipline
Computer Science (401) 

Abstract

This project is for an undergraduate distributed systems course (P434) taught by Scott Jensen at Indiana University in the Fall 2012 semester. In this project, the students will use MapReduce to ray trace an image based on a specification we provide. Prior to this project, they will have implemented a ray tracing web service using Axis2; in that case, the multi-threading is done on the client side, which partitions the image specification and submits multiple requests to the server. The client is then responsible for stitching the partitioned image together. In this second stage of the project, students will use MapReduce running on FutureGrid to perform the parallel computation: the map phase processes partitions of the image specification in parallel, and the reduce phase reconstructs the final image. The goal of the project is to introduce students to the MapReduce paradigm and, through the use of FutureGrid, to give them the opportunity to learn about virtualized computing resources and cloud computing.
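
The map and reduce roles described above can be illustrated with a minimal, in-process Python sketch. It is not the course's assignment code and does not use Hadoop or FutureGrid: the image size, the toy `trace_pixel` function, and the names `partition`, `map_band`, and `reduce_bands` are all hypothetical, chosen only to show how an image can be split into keyed bands, rendered independently, and stitched back together.

```python
WIDTH, HEIGHT = 16, 8  # hypothetical tiny image, chosen for illustration


def trace_pixel(x, y):
    """Toy stand-in for a ray tracer: 1 if the ray through (x, y) hits a disc."""
    u = (x + 0.5) / WIDTH * 2 - 1   # map pixel centre to [-1, 1]
    v = (y + 0.5) / HEIGHT * 2 - 1
    return 1 if u * u + v * v <= 0.5 else 0


def partition(height, bands):
    """Split the rows into contiguous (band_index, row_start, row_end) tasks."""
    step = -(-height // bands)  # ceiling division
    return [(i, start, min(start + step, height))
            for i, start in enumerate(range(0, height, step))]


def map_band(task):
    """Map step: render one band and emit it keyed by its band index."""
    index, start, end = task
    rows = [[trace_pixel(x, y) for x in range(WIDTH)]
            for y in range(start, end)]
    return index, rows


def reduce_bands(mapped):
    """Reduce step: order the bands by key and stitch them into one image."""
    image = []
    for _, rows in sorted(mapped, key=lambda pair: pair[0]):
        image.extend(rows)
    return image


def render(bands=4):
    """Run the map step over every band, then reduce to the full image."""
    mapped = [map_band(task) for task in partition(HEIGHT, bands)]
    return reduce_bands(mapped)


if __name__ == "__main__":
    for row in render():
        print("".join("#" if pixel else "." for pixel in row))
```

Because each band carries its index as a key, the reduce step does not depend on the order in which bands finish, which is the property that lets a real MapReduce framework run the map tasks on separate machines.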

Intellectual Merit

Through the use of FutureGrid, students will have the opportunity to learn about MapReduce, virtualization, and cloud computing through hands-on experience that reinforces what they are learning in lectures and from reading research articles. Applying what they learn gives them a better understanding of the material than would otherwise be possible.

Broader Impacts

At the completion of the project, the students will have a greater understanding of cloud computing and virtualization and the role of cyberinfrastructure.

Scale of Use

There are 15 students in the class, plus an instructor and an associate instructor. The students will run their parallel ray tracing jobs on a small set of VMs. In prior offerings of this class, a limit of 4 VMs was sufficient. Each run is expected to take less than half an hour, and students will be told to release the resources after they have completed their runs.