Course: Cloud Computing for Data Intensive Science Class

Project Details

Project Lead
Judy Qiu 
Project Manager
Tak-Lon Wu 
Project Members
Peng Chen, Vignesh Ravindran, Santhosh Kumar Saminathan, Lilian Weng, Fei Teng, Nabeel Akheel, Kaushik Chandrasekaran, Arvind Dwarakanath, Dhairya Gala, Abhinav Gopisetty, Swathi Gurram, Shivaraman Janakiraman, Hui Li, Sankarbala Manoharan, Anand Mukundan, Vaibhav Nachankar, Priyank Shah, Prerna Shraff, Doga Tuncay, Magesh khanna Vadivelu, Bingjing Zhang, Bina Bhaskar, Nitya Shankaran, Ritika Sharma, Hemanth Gokavarapu, Prajakta Purohit, Anand Hegde, Xiaoyang Chen, Manish Kantamneni, Yuan Gao  
Indiana University, School of Informatics and Computing  
Computer Science (401) 


A topics course on cloud computing for data-intensive science, offered in Fall 2011 as part of the Computer Science curriculum to 24 graduate students at the Master's and PhD levels.

Intellectual Merit

Several new computing paradigms are emerging from large commercial clouds. These include virtual-machine-based utility computing environments such as Amazon AWS and Microsoft Azure. There is also a set of new MapReduce programming paradigms, originating in the information retrieval field, that have been shown to be effective for scientific data analysis. These developments were highlighted by a recent NSF CISE-OCI announcement of opportunities in this area. This class covers many of the key concepts using a common set of simple examples. It is designed to prepare participants to understand and compare the capabilities of these new technologies and infrastructure, and to give them a basic idea of how to get started. In particular, the Big Data for Science Workshop website covers the background and the topics of interest. Projects include bioinformatics and information retrieval.

Broader Impacts

This class will generate curricular material that will be used to build up an online distributed systems/cloud resource.

Scale of Use

Modest resources for each student


See class web page

This class involved 24 graduate students, a mix of Master's and PhD students, and was offered in Fall 2011 as part of the Indiana University Computer Science program. Many FutureGrid experts attended this class, which routinely used FutureGrid for student projects. Projects included:

  • Hadoop
  • DryadLINQ/Dryad
  • Twister
  • Eucalyptus/Nimbus
  • Virtual Appliances
  • Cloud Storage
  • Scientific Data Analysis Applications