Export data to SST

This document describes how to use the Raw KV backup capability of BR to export KV data to remote storage as SST files.

Basic Usage

br backup raw --pd ⟨pd address⟩ \
  -s ⟨storage url⟩ \
  --start ⟨start key⟩ \
  --end ⟨end key⟩ \
  --format ⟨key format⟩ \
  --ratelimit ⟨in MiB/s⟩

This command exports all KV data in the range [start key, end key) to the specified storage as SST files.
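As a concrete illustration of the command above, the following sketch hex-encodes two keys and prints the resulting invocation without running it. The PD address, storage path, key values, and rate limit are hypothetical, and `hex` is assumed to be an accepted value for --format; confirm the supported formats for your BR version before relying on it.

```shell
#!/bin/sh
# Sketch only: builds and prints a `br backup raw` command; nothing is executed.

# Hex-encode a string key (POSIX od; lowercase hex, no separators).
to_hex() {
  printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
}

# Hypothetical values; replace them with your own.
PD_ADDR="127.0.0.1:2379"
STORAGE_URL="local:///tmp/sst-export/"
START_KEY=$(to_hex "a")   # inclusive lower bound of the range
END_KEY=$(to_hex "z")     # exclusive upper bound of the range
RATE_LIMIT=128            # MiB/s

# Assemble the command line. `--format hex` is an assumption (see above).
build_cmd() {
  echo "br backup raw --pd $PD_ADDR -s $STORAGE_URL" \
       "--start $START_KEY --end $END_KEY" \
       "--format hex --ratelimit $RATE_LIMIT"
}

build_cmd
```

Because the range is half-open, these values would cover keys from `a` up to, but not including, `z`.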

Supported storage

The storage URL supports the following schemes:

| Service | Scheme | Example URL |
| --- | --- | --- |
| Local filesystem, distributed on every node | local | local:///path/to/dest/ |
| Hadoop HDFS and other compatible services | hdfs | hdfs:///prefix/of/dest/ |
| Amazon S3 and other compatible services | s3 | s3://bucket-name/prefix/of/dest/ |
| GCS | gcs, gs | gcs://bucket-name/prefix/of/dest/ |
| Write to nowhere (for benchmark only) | noop | noop:// |

S3 and GCS can be configured using URL and command-line parameters. See External Storage in the BR documentation for more information.
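The schemes listed above can be checked before a long-running export starts. The following is a minimal sketch; `storage_scheme_ok` is a hypothetical helper that is not part of BR, and it only inspects the URL prefix without checking that the destination is reachable.

```shell
#!/bin/sh
# Sketch only: checks a storage URL against the schemes listed in this document.
# storage_scheme_ok is a hypothetical helper, not part of BR.
storage_scheme_ok() {
  case "$1" in
    local://*|hdfs://*|s3://*|gcs://*|gs://*|noop://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try one supported and one unsupported URL.
for url in "s3://bucket-name/prefix/of/dest/" "ftp://example/dest/"; do
  if storage_scheme_ok "$url"; then
    echo "supported: $url"
  else
    echo "unsupported: $url"
  fi
done
```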

HDFS configuration

To use HDFS storage, Apache Hadoop or a compatible client must be installed and correctly configured on all BR and TiKV machines. BR and TiKV use the bin/hdfs binary in the Hadoop installation.

Several settings must be provided for HDFS storage to work, as listed in the following table.

| Component | Configuration | Environment variable | Configuration file item |
| --- | --- | --- | --- |
| BR | Hadoop installation directory | HADOOP_HOME | (None) |
| TiKV | Hadoop installation directory | HADOOP_HOME | backup.hadoop.home |
| TiKV | Linux user to use when calling Hadoop | HADOOP_LINUX_USER | backup.hadoop.linux-user |

For TiKV, the configuration file has higher priority than the environment variables.
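Since both BR and TiKV rely on bin/hdfs under the Hadoop installation directory, it can help to verify the environment on each machine before starting an export. The following is a minimal sketch; `check_hadoop_home` is a hypothetical helper and `/opt/hadoop` is an assumed installation path. It checks only the environment variable, so on TiKV a value set through backup.hadoop.home would still override it.

```shell
#!/bin/sh
# Sketch only: verifies that HADOOP_HOME points at an installation with bin/hdfs.
# check_hadoop_home is a hypothetical helper, not part of BR or TiKV.
check_hadoop_home() {
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$HADOOP_HOME/bin/hdfs" ]; then
    echo "no executable found at $HADOOP_HOME/bin/hdfs" >&2
    return 1
  fi
  echo "using $HADOOP_HOME/bin/hdfs"
}

# Assumed installation path; an existing HADOOP_HOME takes precedence.
export HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"
check_hadoop_home || echo "fix the Hadoop setup before running BR" >&2
```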

Parse SST file

Java Client

The exported SST files can be parsed using the Java Client.