A highly-available move operation for replicated trees
Martin Kleppmann, Dominic P. Mulligan, Victor B. F. Gomes, and Alastair R. Beresford
IEEE Transactions on Parallel and Distributed Systems 33(7):1711–1724,
October 2021.
Abstract
Replicated tree data structures are a fundamental building block of distributed filesystems, such as
Google Drive and Dropbox, and collaborative applications with a JSON or XML data model. These
systems need to support a move operation that allows a subtree to be moved to a new location
within the tree. However, such a move operation is difficult to implement correctly if different
replicas can concurrently perform arbitrary move operations, and we demonstrate bugs in Google Drive
and Dropbox that arise with concurrent moves. In this paper we present a CRDT algorithm that handles
arbitrary concurrent modifications on trees, while ensuring that the tree structure remains valid
(in particular, no cycles are introduced), and guaranteeing that all replicas converge towards the
same consistent state. Our algorithm requires no synchronous coordination between replicas, making
it highly available in the face of network partitions. We formally prove the correctness of our
algorithm using the Isabelle/HOL proof assistant, and evaluate the performance of our formally
verified implementation in a geo-replicated setting.