Virtual Machine Live Migration for Disaster Recovery in WANs

Project Details

Project Lead
Tae Seung Kang 
Project Manager
Tae Seung Kang 
Institution
University of Florida, Advanced Computing and Information Systems Laboratory  
Discipline
Computer Science (401) 

Abstract

Wide-area virtual machine (VM) live migration can serve as a disaster recovery solution for IT services by moving virtualized servers to safe locations when a critical disaster strikes. In this scenario, it is desirable to evacuate as many VMs in a datacenter as possible without degrading network performance. There are two challenges: 1) if a large number of VMs are migrated simultaneously, the migration time of each individual VM increases, which raises the probability of migration failure due to power or network link breakdown, and 2) network conditions fluctuate over time. Therefore, each individual VM needs to be migrated as quickly as possible. FutureGrid will allow us to investigate the characteristics of the network when VM migration takes place in real WANs.
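
The first challenge can be illustrated with a rough back-of-the-envelope model. The sketch below is not part of the project; it assumes that concurrent migrations share the WAN link equally and it ignores memory pages dirtied during migration, so the function name and parameters are purely illustrative.

```python
def per_vm_migration_time(vm_size_gb: float, link_gbps: float, concurrent: int) -> float:
    """Estimate seconds needed to migrate one VM.

    Assumptions (hypothetical, for illustration only):
      - `concurrent` VMs share the link bandwidth equally,
      - no pages are re-sent because of dirtying during migration.
    """
    if concurrent < 1:
        raise ValueError("concurrent must be at least 1")
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    # Gigabytes -> gigabits, then divide by each VM's share of the link.
    return vm_size_gb * 8 * concurrent / link_gbps


# One 4 GB VM alone on a 1 Gbps link versus ten migrating at once.
alone = per_vm_migration_time(4, 1, 1)
together = per_vm_migration_time(4, 1, 10)
```

Under these assumptions the total evacuation time is the same either way, but each VM in the concurrent case stays in flight ten times longer, which lengthens the window in which a power or link failure can abort its migration.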

Intellectual Merit

The proposed system monitors the network performance of the hosts, adjusts migration parameters, and coordinates the migration scheduling of VMs. By automatically managing migrations across datacenters, it offers a promising way to transfer IT services efficiently from a damaged datacenter to a fully functional one.
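
The monitor, adjust, and schedule steps could be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual design: the `VM` record, the `min_per_vm_mbps` threshold, and the smallest-first ordering are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VM:
    name: str
    memory_mb: int


def choose_concurrency(bandwidth_mbps: float, min_per_vm_mbps: float) -> int:
    """Pick how many VMs to migrate at once so that each one keeps at
    least `min_per_vm_mbps` of the measured bandwidth (hypothetical policy)."""
    if min_per_vm_mbps <= 0:
        raise ValueError("min_per_vm_mbps must be positive")
    return max(1, int(bandwidth_mbps // min_per_vm_mbps))


def plan_batches(vms: List[VM], bandwidth_mbps: float,
                 min_per_vm_mbps: float) -> List[List[VM]]:
    """Group VMs into migration batches, smallest memory footprint first
    (an assumed heuristic: small VMs finish quickly, so more are saved early)."""
    k = choose_concurrency(bandwidth_mbps, min_per_vm_mbps)
    ordered = sorted(vms, key=lambda vm: vm.memory_mb)
    return [ordered[i:i + k] for i in range(0, len(ordered), k)]


vms = [VM("a", 4096), VM("b", 1024), VM("c", 2048), VM("d", 512), VM("e", 8192)]
batches = plan_batches(vms, bandwidth_mbps=250, min_per_vm_mbps=100)
batch_names = [[vm.name for vm in batch] for batch in batches]
```

Because network conditions fluctuate, a real coordinator would presumably re-measure bandwidth and recompute the concurrency before each batch rather than planning all batches once, as this sketch does.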

Broader Impacts

Many IT services are being deployed in cloud environments and encapsulated in virtual machines (VMs). In this scenario, it is possible to take advantage of the mobility of VMs to implement a best-effort disaster recovery system. The basic idea is to migrate as many VMs as possible while electrical power and network resources remain active during and after a disaster. This approach can be substantially cheaper than traditional disaster recovery systems, which require constant data synchronization and remote backup. VM migration does not need any communication between the disaster site and the safe site until the migration process starts.

Scale of Use

Experiments will be conducted on 10-20 VMs.