TiKV Roadmap

Roadmap to TiKV 3.0

This document describes the roadmap for TiKV development. As an open source project, TiKV is developed by a community of contributors, many of whom run it in production. Most of the goals on the roadmap come from that community.

Let us know on Slack if you have any questions regarding the roadmap.

TiKV 3.0 goals

  • Raft
    • Region Merge - Merge small Regions together to reduce overhead
    • Local Read Thread - Process read requests in a local read thread
    • Split Region in Batch - Speed up Region split for large Regions
    • Raft Learner - Support Raft learner to smooth the configuration change process
    • Raft Pre-vote - Support Raft pre-vote to avoid unnecessary leader elections during network isolation
    • Joint Consensus - Change multiple members safely
    • Multi-thread Raftstore - Process Region Raft logic in multiple threads
    • Multi-thread apply pool - Apply Region Raft committed entries in multiple threads
  • Engine
    • Titan - Separate large key-values from LSM-Tree
    • Pluggable Engine Interface - Clean up the engine wrapper code and provide more extensibility
  • Storage
    • Flow Control - Apply flow control in the scheduler to avoid write stalls in advance
  • Transaction
    • Optimize the handling of transaction conflicts
    • Distributed GC - Distribute MVCC garbage collection control to TiKV
  • Coprocessor
    • Streaming - Cut large data sets into small chunks to reduce memory consumption
    • Chunk Execution - Process data in chunks to improve performance
    • Request Tracing - Provide per-request execution details
  • Tools
    • TiKV Importer - Speed up data importing by SST file ingestion
  • Client
    • TiKV client (Rust crate)
    • Batch gRPC Message - Reduce message overhead
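
To illustrate the Raft Pre-vote goal listed above, here is a minimal sketch of the idea, not TiKV's implementation. A node first asks its peers whether they would vote for it and raises its term only if a majority agrees, so an isolated node cannot inflate its term and force an election when it rejoins. The `Node` class, its fields, and `try_start_election` are hypothetical names chosen for this example, and the log comparison is simplified to a (term, index) pair.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, heavily simplified Raft node state (illustration only)."""
    term: int
    last_log_term: int
    last_log_index: int
    heard_from_leader: bool = False  # leader contact within the election timeout

    def grants_pre_vote(self, next_term, cand_log_term, cand_log_index):
        # A peer that still sees a live leader refuses, so a stray
        # candidate cannot disturb a healthy group.
        if self.heard_from_leader:
            return False
        # The candidate's log must be at least as up to date as ours.
        log_ok = (cand_log_term, cand_log_index) >= (
            self.last_log_term,
            self.last_log_index,
        )
        return next_term > self.term and log_ok


def try_start_election(candidate, reachable_peers, cluster_size):
    """Raise the candidate's term only if a majority would grant a vote."""
    next_term = candidate.term + 1
    granted = 1  # the candidate counts itself
    for peer in reachable_peers:
        if peer.grants_pre_vote(
            next_term, candidate.last_log_term, candidate.last_log_index
        ):
            granted += 1
    if granted > cluster_size // 2:
        candidate.term = next_term  # the real election would start here
        return True
    return False  # pre-vote failed: the term is left unchanged


# An isolated node in a 3-node group reaches no peers, so its term stays put.
isolated = Node(term=5, last_log_term=5, last_log_index=10)
started_isolated = try_start_election(isolated, [], cluster_size=3)

# A node that reaches one leaderless peer has a majority (2 of 3).
connected = Node(term=5, last_log_term=5, last_log_index=10)
peer = Node(term=5, last_log_term=5, last_log_index=9)
started_connected = try_start_election(connected, [peer], cluster_size=3)
```

Without the pre-vote round, the isolated node would bump its term on every election timeout and could depose a healthy leader once the network heals.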

PD

  • Improve namespace
    • Different replication policies for different namespaces and tables
  • Decentralize the scheduling of table Regions
  • Support prioritization in the scheduler to make it more controllable
  • Use machine learning to optimize scheduling
  • Optimize Region metadata - Save Region metadata in a detached storage engine
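
As one possible reading of the scheduler prioritization goal above, the sketch below keeps pending scheduling operators in a priority queue so that urgent work runs before routine balancing. This is an assumption-laden illustration, not PD's design: `OperatorQueue`, the operator names, and the numeric priorities are all hypothetical.

```python
import heapq
import itertools


class OperatorQueue:
    """Hypothetical queue that hands out the highest-priority operator first."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: keeps FIFO order per priority

    def push(self, name, priority):
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), name))

    def pop(self):
        return heapq.heappop(self._heap)[2]


queue = OperatorQueue()
queue.push("balance-region", priority=1)   # illustrative operator names
queue.push("repair-replica", priority=10)
queue.push("balance-leader", priority=1)

order = [queue.pop() for _ in range(3)]
# order == ["repair-replica", "balance-region", "balance-leader"]
```

The sequence counter keeps operators of equal priority in arrival order, which makes the scheduling order predictable for an operator tuning the priorities.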