Call ComputeSplitPointsBySize to divide the data of an entire table into splits of a specified size - Tablestore

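Call the ComputeSplitPointsBySize operation to logically divide the data of an entire table into splits whose sizes are close to a specified size. The operation returns the split points between these splits and hints about the machines that host the splits. It is typically used by compute engines to plan execution, for example to decide the degree of parallelism.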

Request message structure

message ComputeSplitPointsBySizeRequest {
    required string table_name = 1;
    required int64 split_size = 2; // in 100MB
    optional int64 split_size_unit_in_byte = 3;
    optional int32 split_point_limit = 4;
}

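Parameter	Type	Required	Description
table_name	string	Yes	The name of the table whose data is to be split.
split_size	int64	Yes	The approximate size of each split, in units of 100 MB.
split_size_unit_in_byte	int64	No	The unit in which split_size is expressed, in bytes. Set this field so that split points are computed with the intended unit and the result is accurate.
split_point_limit	int32	No	The maximum number of split points, which caps the number of split points returned by the computation.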
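The request fields above can be sketched as follows. This is only an illustration: the plain dictionary is a hypothetical stand-in for the protobuf message and not an SDK call, the helper names `to_split_size` and `build_request` are made up for this example, and "100 MB" is assumed to mean 100 × 1024 × 1024 bytes, which the text above does not state.

```python
# Assumption: "100 MB" is taken to mean 100 MiB; the source does not say which.
HUNDRED_MB = 100 * 1024 * 1024


def to_split_size(target_bytes, unit_bytes=HUNDRED_MB):
    """Convert a desired split size in bytes to split_size units, rounding up."""
    if target_bytes <= 0 or unit_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (target_bytes + unit_bytes - 1) // unit_bytes


def build_request(table_name, target_bytes, split_point_limit=None):
    """Build a dict that mirrors ComputeSplitPointsBySizeRequest (hypothetical)."""
    request = {
        "table_name": table_name,                    # required
        "split_size": to_split_size(target_bytes),   # required, in 100 MB units
    }
    if split_point_limit is not None:
        request["split_point_limit"] = split_point_limit  # optional
    return request


# Ask for splits of roughly 1 GiB each, with at most 100 split points.
request = build_request("my_table", 1024 ** 3, split_point_limit=100)
```

Rounding up rather than down keeps split_size at 1 or more for any positive target, at the cost of slightly larger splits than requested.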

Response message structure

message ComputeSplitPointsBySizeResponse {
    required ConsumedCapacity consumed = 1;
    repeated PrimaryKeySchema schema = 2;

    /**
     * Split points between splits, in increasing order.
     *
     * A split is a consecutive range of primary keys
     * whose data size is approximately the split_size specified in the request.
     * The size is not guaranteed to be exact.
     *
     * A split point is an array of primary-key columns that follows the table schema
     * and is never longer than the table schema.
     * Trailing -inf values are omitted to reduce the transmission payload.
     */
    repeated bytes split_points = 3;

    /**
     * Locations where the splits reside.
     *
     * Because Tablestore is a managed service, these locations are no more than hints.
     * If a location is not suitable to be exposed, an empty string is placed instead.
     */
    repeated SplitLocation locations = 4;
}

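Parameter	Type	Description
consumed	ConsumedCapacity	The capacity units consumed by this request.
schema	PrimaryKeySchema	The schema of the table, which is the same as the schema specified when the table was created.
split_points	repeated bytes	The split points between splits. Split points are guaranteed to be monotonically increasing. Each split point is a row encoded in PlainBuffer that contains only primary-key columns. To reduce the amount of data transferred, the trailing -inf values of a split point are not transmitted.
locations	repeated SplitLocation	Hints about the machines that host the splits. A hint can be empty.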

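For example, a table has three primary-key columns, and the first primary-key column is of type string. Calling this API returns five splits: (-inf,-inf,-inf) to ("a",-inf,-inf), ("a",-inf,-inf) to ("b",-inf,-inf), ("b",-inf,-inf) to ("c",-inf,-inf), ("c",-inf,-inf) to ("d",-inf,-inf), and ("d",-inf,-inf) to (+inf,+inf,+inf). The first three splits reside on "machine-A" and the last two on "machine-B". In this case, split_points is (schematically) [("a"),("b"),("c"),("d")], and locations is (schematically) "machine-A"*3, "machine-B"*2.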
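A minimal sketch of how a client might turn such a response back into explicit splits is shown below. It works on already-decoded values and does not perform PlainBuffer decoding. The helper names, the string markers "-inf" and "+inf", and the representation of each location as a (location, repeat) pair are assumptions made for illustration, based on the "machine-A"*3 notation in the example.

```python
NEG_INF = "-inf"  # illustrative marker for the minimum primary-key value
POS_INF = "+inf"  # illustrative marker for the maximum primary-key value


def pad_split_point(point, key_count):
    """Restore the trailing -inf values that were omitted from a split point."""
    return tuple(point) + (NEG_INF,) * (key_count - len(point))


def build_splits(split_points, locations, key_count):
    """Return a list of (lower_bound, upper_bound, location_hint) tuples.

    split_points: decoded split points, in increasing order.
    locations: (location, repeat) pairs (assumed run-length form).
    key_count: number of primary-key columns in the table schema.
    """
    padded = [pad_split_point(p, key_count) for p in split_points]
    lower_bounds = [(NEG_INF,) * key_count] + padded
    upper_bounds = padded + [(POS_INF,) * key_count]
    # Expand the run-length pairs into one hint per split.
    hints = [loc for loc, repeat in locations for _ in range(repeat)]
    # Hints are optional, so fill any missing ones with an empty string.
    hints += [""] * (len(lower_bounds) - len(hints))
    return list(zip(lower_bounds, upper_bounds, hints))


splits = build_splits(
    [("a",), ("b",), ("c",), ("d",)],
    [("machine-A", 3), ("machine-B", 2)],
    key_count=3,
)
for lower, upper, hint in splits:
    print(lower, "to", upper, "on", hint)
```

Each resulting tuple corresponds to one range that a compute engine could scan as an independent task, preferably on or near the hinted machine.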

Use the SDK

Java SDK: Compute splits by a specified size

Capacity unit consumption

The number of read capacity units consumed equals the number of splits. No write capacity units are consumed.
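As a small illustration of this rule, the sketch below derives the read capacity units from the number of returned split points, assuming that N split points always separate N + 1 splits, as in the earlier example. The function name is hypothetical.

```python
def read_capacity_units(split_points):
    """Read CUs consumed, assuming N split points separate N + 1 splits."""
    return len(split_points) + 1


# Four split points, as in the example with "a" through "d", give five splits.
consumed_read_cu = read_capacity_units([("a",), ("b",), ("c",), ("d",)])
```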