Distributed Real-time Computation System

Project Details

Project Lead
Yukai Xiao 
Project Manager
Yukai Xiao 
Project Members
Hsi-Yun Cheng, Wenlien Tsao, Tianhao Cao  
Indiana University Bloomington, Computer Science Department  
Computer Science (401) 


This is a course project. Our team will study the real-time computation system Apache Storm and compare it with Hadoop. Our work plan is as follows:
1. Deploy Apache Storm and Hadoop on FutureGrid.
2. Run basic test applications such as WordCount, BLAST, and PageRank on both systems and compare their performance.
3. Test a real-time data project; we have decided to use Twitter tweet data.
4. After finishing the previous steps, consult our professor about further work.

Intellectual Merit

Comparison between a real-time distributed computation system (Storm) and Hadoop MapReduce in batch processing.

Broader Impacts

First, we will compare the performance of Storm and Hadoop in batch processing. We then hope to find or develop a 'real-time' version of Hadoop and test it against Storm on real-time processing. The results could help others choose between the two systems for batch and real-time workloads.

Scale of Use

Not sure yet; we first need to figure out how to deploy Storm. We expect to need several physical machines and to run computations for a few hours, several times a week, over the next two months.