TiKV deep dive
- 1. Introduction
- 2. Consensus algorithm
- 3. Key-value engine
- 4. Distributed transaction
- 5. Scalability
- 6. Remote Procedure Calls (RPC)
- 7. Resource scheduling
- 8. Distributed SQL over TiKV
- 8.1. Distributed SQL
In 2000, Eric Brewer presented “Towards Robust Distributed Systems” which detailed the CAP Theorem. Succinctly, the theorem declares that a distributed system may only choose two of the following three attributes:
- Consistency: Every read receives the most recent write or an error
- Availability: Every request receives a (non-error) response – without necessarily considering the most recent write
- Partitioning: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) between nodes
Traditional RDBMS, like PostgreSQL, that provide ACID guarantees, favor consistency over availability. BASE (Basic Availability, Soft-state, Eventual consistency) systems, like MongoDB and other NoSQL systems, favor availability over consistency.
In 2012, Daniel Abadi proposed that CAP was not sufficient to describe the trade-offs which occur when choosing the attributes of a distributed system. They described an expanded PACELC Theorem:
Availability (A) and consistency (C) (as per the CAP theorem), but else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and consistency (C).
In order to support greater availability of data, most systems will replicate data between multiple peers. The system may also replicate a write ahead log offsite. In order to fulfill these availability guarantees the system must ensure a certain number of replications have occured before confirming an action. More replication means more consistency but also more latency.