Introduction

In this project, I implemented a distributed commit system that uses two-phase commit to keep the system consistent when different types of failure occur.

Major Designs

The system structure is shown in Figure 1.

Figure 1. System Structure

Two-phase Commit

Two-phase commit is a protocol that lets a distributed system remain correct when failures happen. A typical two-phase commit consists of a vote phase and a distribute phase. This project takes the basic idea of two-phase commit and builds on it to make sure all commits are correct.

Flow Chart

The flow chart is shown in Figure 2. It defines the behavior of the server and the user nodes.

Figure 2. System Flow Chart

For every commit request, the server creates an independent CommitProcess thread to handle the request. The CommitProcess thread performs a two-phase commit with all related users. First, it initiates the “vote phase” to ask all related users for their vote. If any user responds “no” or fails to respond, the CommitProcess marks the result as “no” and initiates a distribute phase to notify all clients of this result. During the distribute phase, any message that times out is resent until an acknowledgement is received.
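The flow described above can be illustrated with a small Python sketch. It is not the project's actual code: the names SimulatedUser, run_commit, and max_resends are hypothetical, the users are simulated in-process rather than reached over a network, and the sketch assumes the result is “yes” when every user votes “yes”. The resend loop is bounded only so that the example always terminates.

```python
from typing import List, Optional


class SimulatedUser:
    """Hypothetical stand-in for a user node; not the project's real class."""

    def __init__(self, vote: Optional[str], lost_acks: int = 0):
        self.vote = vote            # "yes", "no", or None (vote never arrives)
        self.lost_acks = lost_acks  # number of acknowledgements to drop
        self.decisions: List[str] = []

    def handle(self, message: str) -> Optional[str]:
        if message == "vote":
            return self.vote
        # Any other message is treated as the distributed decision.
        self.decisions.append(message)
        if self.lost_acks > 0:
            self.lost_acks -= 1
            return None             # simulate a lost acknowledgement
        return "ack"


def run_commit(users: List[SimulatedUser], max_resends: int = 5) -> str:
    """Run one simplified two-phase commit and return the decision."""
    # Vote phase: a "no" vote or a missing reply makes the decision "no".
    decision = "yes"
    for user in users:
        if user.handle("vote") != "yes":
            decision = "no"

    # Distribute phase: resend the decision until it is acknowledged.
    for user in users:
        for _ in range(max_resends):
            if user.handle(decision) == "ack":
                break
    return decision
```

For example, run_commit([SimulatedUser("yes"), SimulatedUser(None)]) yields "no", because the second user never answers the vote request.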

Protocol

The request and response messages include the following fields:

Request Message Structure

Response Message Structure

Timeout thresholds

Since a message is guaranteed to arrive within 3 seconds, the maximum round-trip time is 6 seconds: up to 3 seconds for the message and up to 3 seconds for the reply. I therefore set the timeout threshold to 6 seconds.
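The timeout arithmetic can be written out as a short Python sketch. The names MAX_ONE_WAY_DELAY_S, TIMEOUT_S, and wait_for_reply are hypothetical, and a queue.Queue stands in for whatever channel actually delivers replies; this only illustrates waiting for a reply with the derived threshold.

```python
import queue
from typing import Optional

# Assumption taken from the text: one-way delivery takes at most 3 seconds.
MAX_ONE_WAY_DELAY_S = 3.0
# Round trip = message out + reply back.
TIMEOUT_S = 2 * MAX_ONE_WAY_DELAY_S


def wait_for_reply(inbox: "queue.Queue[str]",
                   timeout_s: float = TIMEOUT_S) -> Optional[str]:
    """Return the next reply, or None if none arrives within timeout_s."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        return None  # caller treats this as a timeout
```

A caller would treat a None result as a timeout, for instance by counting it as a “no” vote in the vote phase or by resending the message in the distribute phase.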

Failure Recovery

This system is designed to be robust to failures. It properly handles two major types of failure.