Cloud-Based Support for Distributed Multiscale Applications

Project ID
FG-99
Project Categories
Computer Science
Completed
Abstract
Multiscale modeling is one of the most significant challenges that science faces today. The goal of our research is to build an environment that supports the composition of multiscale simulations from single-scale models encapsulated as scientific software components and distributed across grid and cloud e-infrastructures. We plan to construct a programming and execution environment for such applications. We are going to investigate and integrate solutions from:

1. virtual experiment frameworks, such as the GridSpace Platform (http://dice.cyfronet.pl/gridspace);
2. tools supporting multiscale computing, such as MUSCLE (http://muscle.berlios.de);
3. Cloud, Grid and HPC infrastructures.

We plan to extend the capabilities of the GridSpace platform, which was developed as a basis for the Virtual Laboratory in the ViroLab project (http://www.virolab.org) and is currently being further developed in the Mapper project (http://www.mapper-project.eu). GridSpace (GS) is a framework enabling researchers to conduct virtual experiments on Grid-based resources, Cloud resources and HPC infrastructures.

We have already performed several experiments using GridSpace with multiscale simulations:

1. modules taken from the AMUSE framework (http://www.amusecode.org) were orchestrated by a GridSpace experiment and communicated using the High Level Architecture (IEEE standard 1516) [ComHLA];
2. modules of the computational biology application were orchestrated by a GridSpace experiment and communicated using MUSCLE.

Both experiments shared a local cluster managed by the Portable Batch System (PBS). Thanks to FutureGrid resources, we hope to be able to experiment with multiscale simulations on Cloud resources and compare the results.

As case studies, we plan to investigate the following applications:

1. In-stent Restenosis: an application which simulates biological responses of cellular tissue for the treatment of atherosclerosis, based on complex automata [ISR].
2. The Nanopolymer application, which uses the LAMMPS Molecular Dynamics Simulator (http://lammps.sandia.gov/).
3. The Brain Aneurysm application from the VPH-Share project, which attempts to model cerebral blood flow dynamics (http://uva.computationalscience.nl/research/projects/vph-share).

References:

[GS] E. Ciepiela, D. Harezlak, J. Kocot, T. Bartynski, M. Kasztelnik, P. Nowakowski, T. Gubała, M. Malawski, M. Bubak: Exploratory Programming in the Virtual Laboratory. In: Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 621–628.

[ISR] A. G. Hoekstra, A. Caiazzo, E. Lorenz, J.-L. Falcone, B. Chopard: Complex Automata: Multi-scale Modeling with Coupled Cellular Automata. In: A. G. Hoekstra, J. Kroc, and P. M. A. Sloot (Eds.), Modelling Complex Systems with Cellular Automata, Springer Verlag, July 2010.

[ComHLA] K. Rycerz, M. Bubak, P. M. A. Sloot: HLA Component Based Environment for Distributed Multiscale Simulations. In: T. Priol and M. Vanneschi (Eds.), From Grids to Service and Pervasive Computing, Springer, 2008, pp. 229-239.
Use of FutureSystems
We plan to compare the behavior of multiscale applications in two environments:

(1) a cluster with a local resource management system, using a separate cluster node for each simulation module;

(2) a cloud of VMs, using a separate VM for each simulation module.

We plan to compare:

(1) application setup time, i.e. how long it takes to start the application in each environment;

(2) application execution time.
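
The two metrics above could be collected with a small timing harness. The following Python sketch is only an illustration under stated assumptions: the names Timing, measure and report, and the setup/execute callables, are hypothetical and are not part of GridSpace, MUSCLE or any FutureGrid tooling. In a real run, setup would wrap whatever starts the application (for example, submitting a batch job or booting the VMs) and execute would wrap the simulation run itself; here both are replaced by short sleeps.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Timing:
    environment: str    # label such as "cluster (PBS)" or "cloud (VMs)"
    setup_s: float      # seconds needed to start the application
    execution_s: float  # seconds needed to run it to completion


def measure(environment: str,
            setup: Callable[[], None],
            execute: Callable[[], None]) -> Timing:
    """Time the setup phase and the execution phase separately."""
    t0 = time.perf_counter()
    setup()    # hypothetical hook: start the application in this environment
    t1 = time.perf_counter()
    execute()  # hypothetical hook: run all simulation modules to completion
    t2 = time.perf_counter()
    return Timing(environment, t1 - t0, t2 - t1)


def report(timings: List[Timing]) -> List[str]:
    """Format one summary line per environment."""
    return [
        f"{t.environment}: setup {t.setup_s:.3f} s, "
        f"execution {t.execution_s:.3f} s"
        for t in timings
    ]


if __name__ == "__main__":
    # Stand-in workloads (assumptions): sleeps instead of real job submission.
    results = [
        measure("cluster (PBS)", lambda: time.sleep(0.01), lambda: time.sleep(0.02)),
        measure("cloud (VMs)", lambda: time.sleep(0.03), lambda: time.sleep(0.02)),
    ]
    for line in report(results):
        print(line)
```

Keeping the two phases in separate fields makes it possible to see whether a difference between environments comes from startup overhead or from the simulation run itself.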
Scale of Use
We would like to use the Eucalyptus installation on the India and Sierra clusters and compare the results with HPC jobs (PBS). For the in-stent restenosis application we plan to run about 8-10 VMs for a single experiment (run on 8-10 nodes). For prototyping and development, we plan to run a set of simple experiments that will not consume many resources. For performance tests we plan to conduct a larger experiment (execution time on the order of 72 hours, 4 GB of output data).

For the nanopolymer application we will need ca. 32 nodes.

For the aneurysm simulation application we will need ca. 128 nodes.

Additionally, we would like to compare Eucalyptus with Nimbus, OpenStack and OpenNebula.

We would require approximately 12 months to develop and test the whole system.