Apache ZooKeeper

Building A "Simple" Distributed System - The Implementation

Building A "Simple" Distributed System - The Implementation

In previous posts we’ve identified the requirements for our distributed resource allocation library, including one invariant (No Double Resource Access) that must hold 100% of the time no matter what and another (All Resources Evenly Accessed) that must hold after a successful rebalancing of resources. We documented a protocol that describes how nodes interact with a central registry to achieve the requirements, including how they deal with all conceived failure scenarios. Then we built a TLA+ specification and used the model checker to verify the designed protocol, identifying a defect in the process.

In this post we’ll tackle the implementation and in the next we’ll look at testing.