Course: 1st Workshop on bioKepler Tools and Its Applications

Project Details

Project Lead
Ilkay Altintas 
Project Manager
Jianwu Wang 
Project Members
Jianwu Wang, Shweta Purawat, Wanghu Chen, Daniel Crawl
Supporting Experts
Shava Smallen
Institution
UCSD, SDSC  
Discipline
Computer Science (401) 
Subdiscipline
26.01 Biology, General 

Abstract

In this workshop, bioinformaticians and computational biologists will meet with the bioKepler team to clarify specific technical requirements for bioKepler tool and workflow development. The two focus areas for the workshop are (i) evaluation of bioinformatics and computational tools for bioActor development; and (ii) generation of bioinformatics workflows based on conceptual workflows presented by workshop attendees. As part of this workshop, the organizers will conduct a formal survey of use cases and translate them into functional bioKepler requirements.
 
We request use of FutureGrid to run instances of the bioKepler VM, originally developed on EC2.

Intellectual Merit

bioKepler is a three-year project that builds scientific workflow components to execute a set of bioinformatics tools using distributed execution patterns. Once customized, these components are executed on multiple distributed platforms, including various Cloud and Grid computing platforms. In addition, we deliver virtual machines that include a Kepler engine and all of the bioinformatics tools and applications for which we are building components in bioKepler.

Broader Impacts

Technologies like scientific workflows and data-intensive computing promise new capabilities to enable rapid analysis of next-generation sequence data. These technologies, when used together in an integrative architecture, have great promise to serve many projects with similar needs on the emerging distributed data-intensive computing resources.

Scale of Use

Start up 28 instances of the bioKepler VM for use during the workshop.

Results

The workshop went well on the virtual instances, with a few hiccups. As Koji suggested, I tried to start the instances one or two days before the workshop. I ran into a few kinds of errors: in one, the start-instance command returned an error message immediately; in another, the instance status showed error and I could not log in. In the end, I was able to start 28 instances and could access every one of them. The instances kept running correctly through the first morning of the workshop, but around lunchtime all of them suddenly disappeared. In the afternoon I restarted 24 instances and did not see any errors, but this time I did not have enough time to test all of them. During the hands-on session on the afternoon of the second day (yesterday), I let attendees access the instances. About five attendees could not ssh into their instances: the instances showed correct public IP addresses, but ssh reported no route to reach them. I had to ask some members of our host team to give up their instances for other attendees. The good news is that every instance we could access worked well throughout the afternoon session.
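
The unreachable instances described above only surfaced once attendees tried to log in. A minimal sketch of a pre-session check follows, assuming the public IP addresses of the instances are already known. It only tests whether a TCP connection to the ssh port (22) can be opened; it does not attempt a login. The function names are hypothetical and are not part of bioKepler or any FutureGrid tooling.

```python
import socket


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: this checks network reachability only,
    not whether an ssh login would actually succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, "no route to host",
        # and name-resolution failures.
        return False


def split_by_reachability(hosts, port=22, timeout=3.0):
    """Partition hosts into (reachable, unreachable) lists, preserving order."""
    reachable, unreachable = [], []
    for host in hosts:
        if ssh_port_open(host, port, timeout):
            reachable.append(host)
        else:
            unreachable.append(host)
    return reachable, unreachable


# Example use (the addresses would come from the instance listing):
#   ok, bad = split_by_reachability(instance_ips)
#   print("reachable:", ok)
#   print("unreachable:", bad)
```

Running a check like this shortly before a hands-on session would show which instances need to be restarted or reassigned before attendees are given their addresses.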

A list of attendees can be found at http://swat.sdsc.edu/biokepler/workshops/2012-sep#info.