Scalability, Verification, and Validation Test of Scientific Applications and Related Utilities

Project Details

Project Lead
Lonnie D. Crosby 
Project Manager
Reuben Budiardja 
Institution
University of Tennessee, Knoxville, National Institute for Computational Sciences  
Discipline
Computer Science (401) 

Abstract

This FutureGrid allocation will enable scalability, verification, and validation (V&V) tests of multi-discipline scientific applications. The allocation will be used to reproduce potential platform-specific issues found on production HPC resources similar to those provided by FutureGrid. It will also be used to test new software and computing capabilities that require configuration changes on the platform. Such changes often cannot be made easily in a production HPC environment without a large disruption to users. We will therefore use this FutureGrid allocation to test configuration changes and to determine their issues, risks, and reliability. The allocation may also be used to assist staff in preparing documentation on new capabilities prior to their implementation on production resources.

Intellectual Merit

Many HPC centers support certain computational platforms only in a single known configuration. The risk of disruption associated with changing this known configuration to an untested, undocumented configuration with new features is too high to be practical. The National Institute for Computational Sciences (NICS) delivers nearly half of the computational hours to users within the Extreme Science and Engineering Discovery Environment (XSEDE). In order to fulfill obligations to the National Science Foundation (NSF) and maintain reasonable job throughput and user satisfaction, these new configurations must be tested, verified, and documented prior to incorporation in production resources. When a system-specific issue is found within an application or its related tools and utilities, it is highly useful to be able to reproduce the issue on another platform with a similar configuration. This allows for a higher degree of confidence in determining whether the issue is related to the system or to user software. Although not definitive by itself, the additional information can prove useful in tracking down the source of a problem.

Broader Impacts

NICS is one of the largest XSEDE resource providers. The results of this activity, in the form of better support, better scalability, and verification of scientific applications, tools, and utilities, will have a large impact on XSEDE users across multiple disciplines. The ability to test configuration changes without disruption to production use will allow users to utilize new features on a shorter timeframe than previously possible. Testing application and resource issues on similar resources will allow us to more rapidly rule system-specific problems in or out.

Scale of Use

Medium to large scale, although the scale may vary depending on need. Activities may be sporadic, with some bursts throughout the allocation period.