Deployment of Virtual Clusters on a Commercial Cloud Platform for Molecular Docking

Project ID
FG-445
Project Categories
Computer Science
NSF Grant Number
N/A
NSF Grant URL
N/A
Completed
Abstract
The project aims to create virtual clusters that run DOCK, a protein-ligand molecular interaction simulation program, and to deploy them on FutureGrid. This will allow docking tasks to be performed on a large scale cheaply and efficiently. Three areas will be investigated: 1) the elasticity of the virtual clusters, 2) the fault tolerance of the system, and 3) the use of several virtual clusters on various commercial clouds to form a single system. Combining commercial clouds with other clouds such as FutureGrid is expected to increase system performance and allow millions of protein-ligand interaction simulations to be run in a massively parallel manner.
Use of FutureSystems
FutureGrid will host virtual clusters that can run protein-ligand interaction simulations. It will also be used alongside other commercial and noncommercial clouds (Microsoft Azure and the AIST Cluster) that host the same type of virtual clusters. Tasks will be distributed across the multiple cloud environments, and the clouds will be set up to communicate with each other about task distribution and failures.
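The idea of distributing tasks across several clouds and reacting to failures can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names `distribute` and `fake_run`, the ligand identifiers, and the simulated failure are all hypothetical, and `fake_run` merely stands in for submitting a DOCK job to a virtual cluster. The sketch assumes a simple rotation over the clusters and re-queues a failed task for a cluster that has not yet failed it.

```python
from collections import deque

def distribute(tasks, clusters, run_task):
    """Rotate tasks over clusters; retry a failed task on a different cluster.

    run_task(cluster, task) is a hypothetical callable that raises
    RuntimeError when the cluster cannot complete the task.
    Returns (results, failed): results maps task -> (cluster, value),
    failed lists tasks that no cluster could complete.
    """
    queue = deque((task, frozenset()) for task in tasks)
    results, failed = {}, []
    turn = 0
    while queue:
        task, tried = queue.popleft()
        # Rotate through the clusters, skipping any that already failed this task.
        rotation = [clusters[(turn + i) % len(clusters)] for i in range(len(clusters))]
        candidates = [c for c in rotation if c not in tried]
        turn += 1
        if not candidates:
            failed.append(task)  # every cluster has failed this task
            continue
        cluster = candidates[0]
        try:
            results[task] = (cluster, run_task(cluster, task))
        except RuntimeError:
            # Report the failure by re-queueing the task for another cluster.
            queue.append((task, tried | {cluster}))
    return results, failed

def fake_run(cluster, task):
    """Hypothetical stand-in for a docking run; 'azure' is simulated as down."""
    if cluster == "azure":
        raise RuntimeError(f"{cluster} could not run {task}")
    return f"score-for-{task}"

clusters = ["futuregrid", "azure", "aist"]
results, failed = distribute(["lig1", "lig2", "lig3", "lig4"], clusters, fake_run)
```

In this sketch a task first sent to the simulated failing cluster ends up completed elsewhere, and a task is reported as failed only after every cluster has rejected it. A real multi-cloud setup would also need network communication between the clouds, which is omitted here.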

Scale of Use
The scale of use depends on the task. There are three tasks to be accomplished. The first is multi-cloud work, in which virtual clusters on FutureGrid communicate with virtual clusters on other clouds; this task will use only about 5-10 VMs. The other two tasks examine the fault tolerance and elasticity of the virtual machine clusters. These two tasks will use hundreds to thousands of VMs in conjunction with Hadoop/MapReduce, as we hope to observe the limitations of this method of computing.