Introducing the Redis-full-check Tool

This article introduces the redis-full-check tool, which checks data consistency between two Redis databases and is typically used to verify correctness after a data migration.

By Zhu Zhao

Redis-full-check is a tool from the Alibaba Cloud Redis & MongoDB team that checks data consistency between two Redis databases. It is usually used to verify data correctness after a Redis data migration (for example, one performed with redis-shake).

Basic Principle

redis-full-check verifies data by running a full comparison between the source and target Redis databases. The comparison is performed in multiple rounds: in each round, data is fetched from the source side and the target side and compared, and inconsistent data is recorded in a sqlite3 db for the next round. Because incremental data synchronization may still be running between the source and the target, some differences are only temporary; over multiple rounds, the recorded differences converge. The data left in sqlite3 after the last round is the final set of differences.

The comparison conducted by redis-full-check is unidirectional: redis-full-check fetches data from source database A and checks whether the data in A is also present in target database B. It does not check in the reverse direction. That is, it checks whether the source database is a subset of the target database. If you want a bidirectional comparison, you need to run the comparison twice: the first run uses A as the source database and B as the target database, and the second run uses B as the source database and A as the target database.
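The one-way behavior can be illustrated with a minimal Go sketch. It uses in-memory maps in place of Redis databases, and the function name oneWayDiff is hypothetical; it is not part of redis-full-check.

```go
package main

import (
	"fmt"
	"sort"
)

// oneWayDiff returns the keys of source that are missing from target or
// whose values differ. Keys that exist only in target are never reported.
func oneWayDiff(source, target map[string]string) []string {
	var diff []string
	for k, v := range source {
		if tv, ok := target[k]; !ok || tv != v {
			diff = append(diff, k)
		}
	}
	sort.Strings(diff)
	return diff
}

func main() {
	a := map[string]string{"k1": "v1", "k2": "v2"}
	b := map[string]string{"k1": "v1", "k2": "v2", "k3": "v3"}

	// A as source: the extra key k3 in B goes unnoticed.
	fmt.Println(oneWayDiff(a, b))
	// B as source: k3 is reported.
	fmt.Println(oneWayDiff(b, a))
}
```

Running the check in both directions is what exposes k3, which mirrors the two-run procedure described above.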

The following is the basic data flow diagram. redis-full-check uses multi-round comparison, as shown in the yellow box. In each round, keys are fetched first: the first round fetches keys from the source database, and subsequent rounds fetch keys from the sqlite3 db. After the keys are fetched, the fields and values of each key are fetched and compared. Inconsistent data is stored in the sqlite3 db for the next round of comparison.

Inconsistency Types

Redis-full-check divides data inconsistency into two types: key inconsistency and value inconsistency.

Key Inconsistency

Key inconsistency falls into the following subtypes:

lack_target: A key exists in the source database but does not exist in the target database.
type: A key exists both in the source database and the target database, but the type is inconsistent.
value: A key exists in both the source database and the target database and is of the same type, but the value is inconsistent.

Value Inconsistency

Different data types have different comparison criteria:

string: The value is different.
hash: At least one field meets one of the following conditions:
A field exists on the source side but not on the target side.
A field exists on the target side but not on the source side.
A field exists both on the source side and the target side, but the value is different.
set/zset: similar to hash.
list: similar to hash.
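The three hash conditions above can be sketched in Go as follows. This is only an illustration on in-memory maps; compareHash and the labels only_in_source, only_in_target, and value_differs are hypothetical names chosen for this sketch, not identifiers used by redis-full-check.

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical labels for the three conditions (not the tool's own names).
const (
	onlyInSource = "only_in_source"
	onlyInTarget = "only_in_target"
	valueDiffers = "value_differs"
)

// compareHash returns a map from field name to the condition it violates.
// Fields that are identical on both sides are not included.
func compareHash(source, target map[string]string) map[string]string {
	conflicts := make(map[string]string)
	for f, sv := range source {
		tv, ok := target[f]
		switch {
		case !ok:
			conflicts[f] = onlyInSource
		case sv != tv:
			conflicts[f] = valueDiffers
		}
	}
	for f := range target {
		if _, ok := source[f]; !ok {
			conflicts[f] = onlyInTarget
		}
	}
	return conflicts
}

func main() {
	source := map[string]string{"f1": "a", "f2": "b", "f3": "c"}
	target := map[string]string{"f2": "b", "f3": "x", "f4": "d"}

	conflicts := compareHash(source, target)
	fields := make([]string, 0, len(conflicts))
	for f := range conflicts {
		fields = append(fields, f)
	}
	sort.Strings(fields) // stable output order
	for _, f := range fields {
		fmt.Printf("%s: %s\n", f, conflicts[f])
	}
}
```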

The field conflict type falls into the following cases (only applicable to keys of type hash, set, zset, and list):

lack_source: A field does not exist in a source-side key, but the field exists in a target-side key.
lack_target: A field exists in a source-side key but not in a target-side key.
value: A field exists both in a source-side key and a target-side key, but the values of the two fields are different.

Comparison Principle

Three compare modes (comparemode) are available:

KeyOutline: only compares whether the keys are equal.
ValueOutline: only compares whether the values have the same length.
FullValue: compares whether the keys, the value lengths, and the values are equal.
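As a rough illustration of how the three modes differ, here is a Go sketch for string values held in in-memory maps. The type compareMode and the function consistent are hypothetical, and the sketch assumes that every mode first requires the key to exist on both sides.

```go
package main

import "fmt"

// compareMode is a hypothetical enum mirroring the three modes.
type compareMode int

const (
	keyOutline compareMode = iota
	valueOutline
	fullValue
)

// consistent reports whether key passes the check for the given mode.
// Assumption: a key missing on either side fails in every mode.
func consistent(mode compareMode, source, target map[string]string, key string) bool {
	sv, inSource := source[key]
	tv, inTarget := target[key]
	if !inSource || !inTarget {
		return false
	}
	switch mode {
	case keyOutline:
		return true // the key exists on both sides
	case valueOutline:
		return len(sv) == len(tv) // only the lengths are compared
	case fullValue:
		return len(sv) == len(tv) && sv == tv // lengths and contents
	default:
		return false
	}
}

func main() {
	source := map[string]string{"k": "abc"}
	target := map[string]string{"k": "abd"}

	fmt.Println(consistent(keyOutline, source, target, "k"))   // true
	fmt.Println(consistent(valueOutline, source, target, "k")) // true
	fmt.Println(consistent(fullValue, source, target, "k"))    // false
}
```

The example shows why the lighter modes can miss differences: a value with the same length but different content passes both KeyOutline and ValueOutline.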

The number of comparison rounds is determined by comparetimes (comparetimes is set to 3 by default):

In the first round, all keys in the source database are found, and then they are fetched from both the source database and the target database for comparison.
From the second round onward, the comparison is iterative: only the inconsistent keys and fields found in the previous round are compared:
For key inconsistency (including lack_source, lack_target, and type), keys and values are re-fetched from the source and target databases for comparison.
For string keys with inconsistent values, the keys are compared again: keys and values are re-fetched from the source and target databases.
For hash, set, and zset keys with inconsistent values, only the inconsistent fields are compared again. Fields that were already found to be consistent are not compared again. This prevents big keys that are updated frequently from always failing the verification.
For list keys with inconsistent values, the keys are compared again: keys and values are re-fetched from the source and target databases.
There is a time interval between two rounds of comparison (see the interval parameter).
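The round structure can be sketched in Go as below. This is a simplified model, not the tool's implementation: it works on in-memory maps, compares whole keys only, and keeps the inconsistent keys in a slice rather than in sqlite3. The function multiRound and its betweenRounds callback (which simulates incremental synchronization catching up) are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// multiRound compares every source key in round 1, then re-checks only the
// keys that were inconsistent in the previous round. It returns the keys
// that are still inconsistent after the last round.
func multiRound(source, target map[string]string, rounds int,
	interval time.Duration, betweenRounds func()) []string {

	candidates := make([]string, 0, len(source))
	for k := range source {
		candidates = append(candidates, k)
	}

	for round := 1; round <= rounds; round++ {
		var inconsistent []string
		for _, k := range candidates {
			sv, inSource := source[k]
			tv, inTarget := target[k]
			if inSource != inTarget || sv != tv {
				inconsistent = append(inconsistent, k)
			}
		}
		candidates = inconsistent
		if len(candidates) == 0 || round == rounds {
			break
		}
		time.Sleep(interval) // wait before the next round
		if betweenRounds != nil {
			betweenRounds()
		}
	}
	sort.Strings(candidates)
	return candidates
}

func main() {
	source := map[string]string{"a": "1", "b": "2", "c": "3"}
	target := map[string]string{"a": "1", "b": "old"}

	// Simulate synchronization delivering the new value of "b" later.
	catchUp := func() { target["b"] = "2" }

	fmt.Println(multiRound(source, target, 3, time.Millisecond, catchUp))
}
```

In this run, "b" is only temporarily inconsistent and drops out after the first round, while "c" never reaches the target and remains in the final result.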

For keys of type hash, set, zset, and list, the fetch method depends on the length (len) of the key:

If len is less than or equal to 5192, the following commands are used to fetch all fields and values at once for comparison: hgetall, smembers, zrange 0 -1 withscores, and lrange 0 -1.
If len is greater than 5192, hscan, sscan, zscan, and lrange are used to fetch fields and values in batches.
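The selection rule above can be written as a small lookup. The Go sketch below only returns the command names listed in the text and does not talk to Redis; fetchCommand and bigKeyThreshold are hypothetical names.

```go
package main

import "fmt"

// bigKeyThreshold is the length limit quoted in the text.
const bigKeyThreshold = 5192

// fetchCommand returns the command used to read a key of the given type
// and length, following the two rules described above.
func fetchCommand(keyType string, length int) (string, error) {
	full := map[string]string{
		"hash": "hgetall",
		"set":  "smembers",
		"zset": "zrange 0 -1 withscores",
		"list": "lrange 0 -1",
	}
	batched := map[string]string{
		"hash": "hscan",
		"set":  "sscan",
		"zset": "zscan",
		"list": "lrange",
	}

	table := full
	if length > bigKeyThreshold {
		table = batched
	}
	cmd, ok := table[keyType]
	if !ok {
		return "", fmt.Errorf("unsupported key type %q", keyType)
	}
	return cmd, nil
}

func main() {
	for _, n := range []int{100, 5192, 5193} {
		cmd, err := fetchCommand("hash", n)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("hash with len %d -> %s\n", n, cmd)
	}
}
```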

Parameter Description

The following are the main parameters in redis-full-check:

  -s, --source=SOURCE               the source Redis database address (ip:port)
  -p, --sourcepassword=Password     the password of the source Redis database
      --sourceauthtype=AUTH-TYPE    the management permission of the source database (This parameter is not required in open-source Redis.)
  -t, --target=TARGET               the target Redis database address (ip:port)
  -a, --targetpassword=Password     the password of the target Redis database
      --targetauthtype=AUTH-TYPE    the management permission of the target database (This parameter is not required in open-source Redis.)
  -d, --db=Sqlite3-DB-FILE          the location of the sqlite3 db file in which inconsistent keys are stored (result.db by default)
      --comparetimes=COUNT          the number of comparison rounds (3 by default)
  -m, --comparemode=                comparison mode
      --id=                         used for identifying metrics
      --jobid=                      used for identifying metrics
      --taskid=                     used for identifying metrics
  -q, --qps=                        QPS speed threshold
      --interval=Second             time interval between two comparison rounds
      --batchcount=COUNT            the amount of batch-aggregated data
      --parallel=COUNT              the number of parallel coroutines (5 by default)
      --log=FILE                    log file
      --result=FILE                 inconsistent results are recorded in the result file in this format: "db    diff-type    key    field"
      --metric=FILE                 metric file
  -v, --version

For example, the source Redis database is 10.1.1.1:1234 and the target database is 10.2.2.2:5678:

 ./redis-full-check -s 10.1.1.1:1234 -t 10.2.2.2:5678 -p mock_source_password -a mock_target_password --metric metric --log log --result result

The metric information uses the following format:

type Metric struct {
    DateTime           string                             `json:"datetime"`             // time format: 2018-01-09T15:30:03Z
    Timestamp          int64                              `json:"timestamp"`            // second-level unix timestamp
    Id                 string                             `json:"id"`                   // run id
    CompareTimes       int                                `json:"comparetimes"`         // comparison rounds
    Db                 int32                              `json:"db"`                   // db id
    DbKeys             int64                              `json:"dbkeys"`               // the total number of keys in the db
    Process            int64                              `json:"process"`              // progress percentage
    OneCompareFinished bool                               `json:"has_finished"`         // indicates if this comparison has finished
    AllFinished        bool                               `json:"all_finished"`         // indicates if all comparisons have finished
    KeyScan            *CounterStat                       `json:"key_scan"`             // the number of scanned keys
    TotalConflict      int64                              `json:"total_conflict"`       // total conflicts, including keys + fields
    TotalKeyConflict   int64                              `json:"total_key_conflict"`   // total key conflicts
    TotalFieldConflict int64                              `json:"total_field_conflict"` // total field conflicts
    // For the two following maps, the first-level key is the data type
    // (string, hash, list, set, or zset) and the second-level key is the
    // conflict type (type, value, lack source, lack target, or equal).
    KeyMetric          map[string]map[string]*CounterStat `json:"key_stat"`             // key metric
    FieldMetric        map[string]map[string]*CounterStat `json:"field_stat"`           // field metric
}

type CounterStat struct {
    Total int64 `json:"total"` // total
    Speed int64 `json:"speed"` // speed
}
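To read such metrics programmatically, the JSON tags above can be used directly. The following Go sketch decodes one metric record into a trimmed-down struct. The type metricSummary, the function parseMetric, and the sample record are hypothetical and made up for illustration; the sketch also assumes that a record is available as a single JSON object, which the text above does not state.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// counterStat mirrors CounterStat from the metric format.
type counterStat struct {
	Total int64 `json:"total"`
	Speed int64 `json:"speed"`
}

// metricSummary is a hypothetical subset of the Metric fields, reusing the
// same JSON tags. Fields not listed here are ignored while decoding.
type metricSummary struct {
	DateTime      string       `json:"datetime"`
	CompareTimes  int          `json:"comparetimes"`
	Db            int32        `json:"db"`
	Process       int64        `json:"process"`
	AllFinished   bool         `json:"all_finished"`
	KeyScan       *counterStat `json:"key_scan"`
	TotalConflict int64        `json:"total_conflict"`
}

// sampleRecord is invented sample data, not real tool output.
const sampleRecord = `{"datetime":"2018-01-09T15:30:03Z","timestamp":1515511803,` +
	`"id":"demo","comparetimes":1,"db":0,"dbkeys":100,"process":50,` +
	`"has_finished":false,"all_finished":false,` +
	`"key_scan":{"total":50,"speed":10},"total_conflict":3}`

// parseMetric decodes one JSON metric record.
func parseMetric(record string) (metricSummary, error) {
	var m metricSummary
	err := json.Unmarshal([]byte(record), &m)
	return m, err
}

func main() {
	m, err := parseMetric(sampleRecord)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("round %d, db %d: %d%% done, %d keys scanned, %d conflicts\n",
		m.CompareTimes, m.Db, m.Process, m.KeyScan.Total, m.TotalConflict)
}
```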

Sqlite 3 DB File

Results are saved in the sqlite3 db file. If no file is specified, the result.db file in the current directory is used. If three comparison rounds are run, three files are present: result.db.1, result.db.2, and result.db.3.

Table key: saves inconsistent keys.
Table field: saves inconsistent fields of hash, set, zset, and list keys. For list keys, the subscript (index) is saved.
The key_id field in table field is associated with the id field in table key.
Table key_<N> and field_<N>: save the results after the Nth comparison round (that is, the intermediate results).

Example:

$ sqlite3  result.db

sqlite> select * from key;
id          key              type        conflict_type  db          source_len  target_len
----------  ---------------  ----------  -------------  ----------  ----------  ----------
1           keydiff1_string  string      value          1           6           6
2           keydiff_hash     hash        value          0           2           1
3           keydiff_string   string      value          0           6           6
4           key_string_diff  string      value          0           6           6
5           keylack_string   string      lack_target    0           6           0
sqlite>

sqlite> select * from field;
id          field       conflict_type  key_id
----------  ----------  -------------  ----------
1           k1          lack_source    2
2           k2          value          2
3           k3          lack_target    2

Reference Materials for the open-source project

Here are some reference materials for the open-source project:

redis-full-check

Data migration tool redis-shake

Feel free to post your problems or suggestions in Issues on GitHub. You are welcome to join our open-source project development.
