HyFlowTM

Project Details

Project Lead
Roberto Palmieri 
Project Manager
Roberto Palmieri 
Project Members
Roberto Palmieri  
Institution
Virginia Tech, Department of Electrical and Computer Engineering  
Discipline
Computer Science (401) 
Subdiscipline
14.09 Computer Engineering 

Abstract

The HyFlowTM project is developing distributed transactional memory (or DTM) as an alternative to lock-based distributed concurrency control. HyFlow is a framework for DTM, with pluggable support for policies for directory lookup, transactional synchronization and recovery, contention management, and cache coherence. HyFlow exports a distributed programming model that excludes locks: using (Java 5’s) annotations, a programmer can define atomic sections as transactions, in which reads and writes to shared, local and remote objects appear to take effect instantaneously. No changes are needed to the underlying virtual machine or compiler.

Intellectual Merit

We propose fault-tolerant distributed transactional memory. Distributed transactional memory (TM) promises to alleviate the problems of lock-based distributed concurrency control-- e.g., distributed deadlocks, livelocks, and lock convoying. Distributed TM exports a simple programming interface, which avoids locks. Fault-tolerance is essential to distributed TM to cope with node/network failures, but that must be achieved with scalability. Object replication is central to fault-tolerant D-STM. However, replication protocols must allow read concurrency and ensure transactional consistency and progress properties, while being message-efficient on metric-space networks (i.e., one in which communication cost between nodes form a metric). This is an open problem. Broadcasting -- classical replication primitive in database systems -- is inherently non-scalable in metric-space networks, as messages transmitted grow quadratically with the number of nodes. The project proposes a novel replicated distributed TM framework, whose key idea is to split a transaction into two independent phases with orthogonal responsibilities: 1) regular read/write phases, in which latest copies of required objects are quickly collected, in the presence of node/link failures, without any consistency guarantees, and 2) a request-commit phase, in which conflicts are detected and consistency is ensured by consulting the results collected from conflict resolution modules of other nodes. The project proposes quorum-based replication protocols, in which a transaction reads an object by reading object copies from a read quorum, and writes an object by writing copies to a write quorum. Additionally, the project proposes correctness criterion including weaker consistency models, and distributed conflict resolution protocols and cache coherence protocols. The proposed framework and protocols will be implemented in the HyFlow distributed TM Java package, which is open-source and freely distributed.

Broader Impacts

Project's broader impacts include: 1) integration of the research results into a graduate course taught at Blacksburg, VA and VT-MENA/Egypt; 2) dissemination of the project's results through an open-source implementation of the project's distributed TM infrastructure and publication of results at relevant conferences; and 3) increasing the cultural interaction between students and faculty in the US and those in the Middle East and North Africa region, by advisement of VT-MENA students and teaching an advanced course at VT-MENA/Egypt.

Scale of Use

The size of the system is composed by several VMs for not a long time (e.g., 20 VMs for 1 day)

Results

This is the link of the project http://www.hyflow.org/hyflow and here there are all the papers and technical reports in the context of the project: http://www.hyflow.org/hyflow/wiki/Publications