radio free coyote

distributed database overview

24 May 2022

Distributed DB Overview

Document Scope

This document is a rough, high-level technical overview of recent-generation distributed database systems, intended as an introduction to key concepts and features. Further reading is available in the links.

Distributed Data store common features

See also Distributed Hash Table

Distributed DB vs. Centralized DB

Some very broad comparisons.

CAP Theorem

The CAP theorem (a.k.a. Brewer’s theorem) states that any distributed data store can provide only two of the following three guarantees: consistency, availability, and partition tolerance.

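When a network failure (partition) occurs, the distributed data store will need to either cancel the operation, which preserves consistency but reduces availability, or proceed with the operation, which preserves availability but risks inconsistency.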

Comparison of example DDBMSes

DB | CAP Type | Notes
MongoDB / Redis / Memcached | CP | Node becomes unavailable rather than serve stale (out-of-date) data
Cassandra / ScyllaDB / CouchDB / Riak | AP | Node stays up and serves stale (out-of-date) data, then updates once the network partition is resolved
Aerospike | AP or CP | Available and Partition-tolerant (AP) by default; can be configured as Consistent and Partition-tolerant (CP), e.g. Aerospike strong consistency mode
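
The difference between the CP and AP rows can be illustrated with a toy model. This is only a sketch: the `Node` class, its `mode` setting and the `Unavailable` exception are hypothetical names made up for illustration and do not correspond to the API of any database listed above.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised by a CP node that cannot confirm it holds current data."""


@dataclass
class Node:
    mode: str                  # "CP" or "AP" (hypothetical setting)
    value: str = "v1"          # last value this node has seen
    partitioned: bool = False  # True while cut off from its peers

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # CP: refuse the request rather than risk returning stale data.
            raise Unavailable("cannot reach peers; refusing read")
        # AP (or a healthy CP node): answer with whatever is stored locally.
        return self.value

    def heal(self, latest: str) -> None:
        # Partition resolved: catch up to the latest value and resume service.
        self.value = latest
        self.partitioned = False


cp = Node(mode="CP", partitioned=True)
ap = Node(mode="AP", partitioned=True)

try:
    cp.read()
    cp_result = "served"
except Unavailable:
    cp_result = "unavailable"

ap_result = ap.read()   # still answers, possibly with stale data
ap.heal(latest="v2")
ap_after = ap.read()    # up to date again after the partition heals
```

During the partition the CP node gives up availability and the AP node gives up consistency; once the partition heals, both can serve current data again.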

Consensus protocols

The most commonly used protocols include Paxos and Raft.

Consensus Protocols

Paxos is a protocol family used for solving consensus (finding an agreement on a result) between nodes of a distributed system. Solving consensus allows a distributed system to continue working in a predictable way during network partitioning or process/server failure.

Consensus is generally defined as agreement among processes on a single value. It is assumed that some processes can fail or become unreliable so the protocols must be fault-tolerant. This is generally done by having the various processes communicate the value to other processes and agree on a single value.
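
The idea of agreeing on a single value despite failed processes can be sketched with a simple majority rule. This is not a real consensus protocol, only an illustration of why a majority quorum tolerates a minority of failures; the function names `quorum` and `decide` are assumptions made for this example.

```python
from collections import Counter
from typing import Optional, Sequence


def quorum(cluster_size: int) -> int:
    """Smallest number of processes that forms a strict majority."""
    return cluster_size // 2 + 1


def decide(votes: Sequence[Optional[str]], cluster_size: int) -> Optional[str]:
    """Return the agreed value, or None if no value reaches a majority.

    votes holds one entry per process; None marks a failed or
    unreachable process that did not report a value.
    """
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    value, count = counts.most_common(1)[0]
    return value if count >= quorum(cluster_size) else None


# Five processes, one failed, one disagreeing: "a" still has a majority.
agreed = decide(["a", "a", None, "a", "b"], cluster_size=5)

# Three processes, one failed, two disagreeing: no majority, no decision.
undecided = decide(["a", "b", None], cluster_size=3)
```

Because any two majorities of the same cluster overlap in at least one process, two different values can never both reach a quorum.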

Paxos is typically used where durability is required, for example to replicate a file or a database, and the amount of durable state may be large. The protocol attempts to make progress even during periods when some replicas are unavailable.
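
A heavily simplified, single-value ("single-decree") Paxos round might look like the sketch below. It runs in one process with no networking, retries or persistence, and the `Acceptor` class, the `up` flag and the `propose` function are illustrative assumptions rather than any library's API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    up: bool = True                   # False models an unavailable replica
    promised: int = -1                # highest proposal number promised
    accepted_n: int = -1              # number of the last accepted proposal
    accepted_v: Optional[str] = None  # value of the last accepted proposal

    def prepare(self, n: int) -> Optional[Tuple[int, Optional[str]]]:
        # Phase 1: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return (self.accepted_n, self.accepted_v)
        return None

    def accept(self, n: int, v: str) -> bool:
        # Phase 2: accept unless a higher-numbered promise was made.
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_v = v
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Run one proposal; return the chosen value, or None without a majority."""
    majority = len(acceptors) // 2 + 1
    live = [a for a in acceptors if a.up]

    promises = [p for p in (a.prepare(n) for a in live) if p is not None]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, that value must be re-proposed.
    _, prev_v = max(promises, key=lambda p: p[0])
    if prev_v is not None:
        value = prev_v

    acks = sum(a.accept(n, value) for a in live)
    return value if acks >= majority else None


cluster = [Acceptor(), Acceptor(), Acceptor(up=False)]
first = propose(cluster, n=1, value="x")   # 2 of 3 replicas suffice
second = propose(cluster, n=2, value="y")  # earlier choice "x" is preserved

degraded = [Acceptor(), Acceptor(up=False), Acceptor(up=False)]
stalled = propose(degraded, n=1, value="z")  # no majority, no progress
```

With one of three replicas down the round still completes, and a later proposer learns and keeps the already chosen value; with two of three down, no value can be chosen.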

Raft vs. Paxos - A major difference between Raft and Paxos is leader election. Raft only allows a server whose log is up to date to become leader, whereas in Paxos any server can become leader as long as it subsequently brings its log up to date. Raft's restriction simplifies the leader election process and makes it more efficient.
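
Raft's "up-to-date log" restriction is enforced when servers vote: a voter grants its vote only if the candidate's log is at least as up to date as its own, comparing the term of the last entry first and the log length second. The sketch below shows only that comparison; the `LogEntry` alias and the function names are assumptions made for illustration, and the rest of the voting rules (current term, one vote per term) are omitted.

```python
from typing import List, Tuple

LogEntry = Tuple[int, str]  # (term in which the entry was created, command)


def last_term_and_index(log: List[LogEntry]) -> Tuple[int, int]:
    """Term and 1-based index of the last entry; (0, 0) for an empty log."""
    if not log:
        return (0, 0)
    return (log[-1][0], len(log))


def candidate_is_up_to_date(candidate_log: List[LogEntry],
                            voter_log: List[LogEntry]) -> bool:
    # Tuple comparison checks the last term first, then the last index.
    return last_term_and_index(candidate_log) >= last_term_and_index(voter_log)


voter = [(1, "set a"), (2, "set b")]

behind = [(1, "set a")]                               # shorter, older term
same = [(1, "set a"), (2, "set b")]                   # identical log
long_but_old = [(1, "set a"), (1, "x"), (1, "y")]     # longer, older last term

vote_behind = candidate_is_up_to_date(behind, voter)
vote_same = candidate_is_up_to_date(same, voter)
vote_long_but_old = candidate_is_up_to_date(long_but_old, voter)
```

Note that a longer log does not win by itself: the candidate with three entries from term 1 is rejected because the voter's last entry comes from the later term 2.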