Bundles: Distributed Cloud Resources

Project ID
FG-305
Project Categories
Computer Science
Project Keywords
Completed
Abstract
Objectives: Extreme-scale collaborative science requires that the complexity of the underlying environment, with its diverse storage, network, and computing platforms, be managed on the one hand yet exploited on the other. The evolution of science applications, in terms of new algorithms, improved fidelity, and integration of data, has proceeded in parallel with the evolution of advanced network services such as QoS on optical networks. However, the middleware that glues the two layers together has evolved much more slowly. What remains is a gap: lower-level services are often hard to use and disrupt the flow of the application. In the other direction, mechanisms for supplying critical application characteristics that could shape the application's runtime behavior with respect to low-level resource usage are not routinely available. A structured and standard approach to addressing these concerns does not exist, either at the middleware level or in the form of services or tools. The result is isolated, repeated, and non-extensible solutions, which do not scale in the long run. What features must middleware for extreme-scale environments provide? How should it be organized?
Description: This project aims to bridge the gap between application requirements and diverse, heterogeneous platforms by developing a middleware framework that can support the tools and services needed by distributed scientific collaborative applications at extreme scales. We consider applications that operate on rich data pipelines in a distributed collaborative environment, including data generation and capture, data preprocessing, data analysis, and data storage and delivery. Tools and services are needed at all levels of this pipeline to enable data discovery, data transmission and streaming, data placement and storage, resource discovery, computation scheduling, and co-scheduling.
A framework-based middleware provides an integrated way to address the many co-dependent issues in extreme-scale environments, such as the emergence of disparate resource platforms and network capabilities, the inherent distribution of compute and data, and multiple levels of application and run-time decision making. We propose a middleware framework that provides powerful abstractions for distributed computational and storage resources, and containers for computational tasks and distributed data. This project will address the fundamental research challenges involved in realizing these abstractions, including techniques to enable fault tolerance and performance for both data and computation. Our framework would make it easier to construct a varied set of tools, such as workflow systems and in-situ data processing tools. We will use FG resources as a prototype of the Bundle abstraction.
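As a rough illustration only, a resource abstraction of the kind described above might group compute and storage resources from several sites behind one queryable object. The sketch below is not the project's design or API: the class names `Resource` and `Bundle`, their fields and methods, and the site and cluster names are all hypothetical, chosen just to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """One compute or storage resource at a site (hypothetical model)."""
    name: str
    site: str
    kind: str      # "compute" or "storage"
    capacity: int  # cores for compute, gigabytes for storage


@dataclass
class Bundle:
    """A named collection of distributed resources treated as one unit."""
    name: str
    resources: list = field(default_factory=list)

    def add(self, resource: Resource) -> None:
        """Register a resource with this bundle."""
        self.resources.append(resource)

    def sites(self) -> set:
        """Return the set of sites that contribute resources."""
        return {r.site for r in self.resources}

    def total_capacity(self, kind: str) -> int:
        """Sum the capacity of all resources of the given kind."""
        return sum(r.capacity for r in self.resources if r.kind == kind)

    def select(self, kind: str, min_capacity: int) -> list:
        """Return resources of a kind with enough capacity, largest first."""
        matches = [
            r for r in self.resources
            if r.kind == kind and r.capacity >= min_capacity
        ]
        return sorted(matches, key=lambda r: r.capacity, reverse=True)


# Placeholder resources; the names and sizes are invented for the example.
bundle = Bundle("demo")
bundle.add(Resource("cluster-a", "site-1", "compute", 128))
bundle.add(Resource("cluster-b", "site-2", "compute", 32))
bundle.add(Resource("store-a", "site-1", "storage", 5000))
```

In this sketch, a tool such as a workflow system would call `bundle.select("compute", 64)` to find suitable resources without needing to know which site hosts them.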
Use of FutureSystems
Research prototyping
Scale of Use
Not sure yet -- mostly need wide-area dispersion with a modest number of resources per site (most likely)