Before you balance load across a distributed system, you need an accurate measure of how balanced the system currently is. This topic describes the syntax of the load balancing measurement function and provides an example of how to use it.
Background information
The example in this topic uses a sample log that contains the following indexed fields: cluster ID (cluster_id), server ID (server_id), time (time_period), CPU load (cpu_load), memory load (ram_load), and network bandwidth load (network_load). For more information about how to create indexes, see Create indexes.
Sample log:
{"cluster_id":"C001","cpu_load":"0.1","network_load":"0.6","ram_load":"0.7","server_id":"S001","time_period":"2024-01-01 00:00:00"}
{"cluster_id":"C001","cpu_load":"0.2","network_load":"0.5","ram_load":"0.8","server_id":"S002","time_period":"2024-01-01 00:01:00"}
{"cluster_id":"C001","cpu_load":"0.1","network_load":"0.6","ram_load":"0.7","server_id":"S001","time_period":"2024-01-01 00:02:00"}
{"cluster_id":"C001","cpu_load":"0.2","network_load":"0.5","ram_load":"0.8","server_id":"S002","time_period":"2024-01-01 00:03:00"}
{"cluster_id":"C001","cpu_load":"0.1","network_load":"0.6","ram_load":"0.7","server_id":"S001","time_period":"2024-01-01 00:04:00"}
{"cluster_id":"C001","cpu_load":"0.2","network_load":"0.5","ram_load":"0.8","server_id":"S002","time_period":"2024-01-01 00:05:00"}
{"cluster_id":"C001","cpu_load":"0.1","network_load":"0.6","ram_load":"0.7","server_id":"S001","time_period":"2024-01-01 00:06:00"}
{"cluster_id":"C001","cpu_load":"0.2","network_load":"0.5","ram_load":"0.8","server_id":"S002","time_period":"2024-01-01 00:07:00"}
{"cluster_id":"C001","cpu_load":"0.1","network_load":"0.6","ram_load":"0.7","server_id":"S001","time_period":"2024-01-01 00:08:00"}
{"cluster_id":"C001","cpu_load":"0.2","network_load":"0.5","ram_load":"0.8","server_id":"S002","time_period":"2024-01-01 00:09:00"}
Load balancing measurement function
Function | Syntax | Description | Data type of the return value |
how_balanced function | how_balanced(array(array(double)) load_matrix) | Measures the load balancing degree of your distributed system. You must use this function together with the array_agg function. For more information, see array_agg function. The return value indicates the load balancing degree. Valid values: (0,1]. | double |
how_balanced function
The how_balanced function measures the load balancing degree of your distributed system. You must use this function together with the array_agg function. For more information, see array_agg function.
double how_balanced(array(array(double)) load_matrix)
Parameter | Description |
load_matrix | The load matrix. Each row is the load time series vector of one server. |
Example
Query statement
* |
with server_time_series as (
  select
    cluster_id,
    server_id,
    array_agg(to_unixtime(date_parse(time_period, '%Y-%m-%d %H:%i:%s'))) as time_periods,
    array_agg(cpu_load + ram_load + network_load) as metric_values
  from log
  where
    time_period >= '2024-01-01 00:00:00'
    and time_period < '2024-01-02 00:00:00'
  group by
    cluster_id,
    server_id
),
imputed_server_series as (
  select
    cluster_id,
    server_id,
    ts_fill_missing(
      time_periods,
      metric_values,
      to_unixtime(date_parse('2024-01-01 00:00:00', '%Y-%m-%d %H:%i:%s')),
      to_unixtime(date_parse('2024-01-02 00:00:00', '%Y-%m-%d %H:%i:%s')),
      '1 minute',
      'value=0'
    ) as imputed_time_series
  from server_time_series
)
select
  cluster_id,
  how_balanced(array_agg(imputed_time_series[2])) as balance
from imputed_server_series
group by cluster_id
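To make the shape of load_matrix concrete, the following Python sketch mirrors the data preparation steps of the query on a few sample log entries: it sums the three load fields, fills missing one-minute slots with 0, and stacks one row per server. It is an illustration only. The helper name build_load_matrix, the shortened four-minute window, and the half-open window boundaries are assumptions made for this sketch, and the sketch does not reproduce the calculation that how_balanced performs, which this topic does not describe.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def build_load_matrix(logs, start, end, step=timedelta(minutes=1)):
    """Hypothetical helper: return {cluster_id: (server_ids, matrix)}.

    Each matrix row is the zero-filled load time series of one server,
    which matches the documented meaning of load_matrix.
    """
    # Assumption: the window is half-open, [start, end).
    n_slots = int((end - start) / step)
    series = {}  # (cluster_id, server_id) -> per-slot combined load
    for entry in logs:
        ts = datetime.strptime(entry["time_period"], FMT)
        if not (start <= ts < end):
            continue
        slot = int((ts - start) / step)
        key = (entry["cluster_id"], entry["server_id"])
        # Slots without a log entry keep the fill value 0, like 'value=0' in the query.
        row = series.setdefault(key, [0.0] * n_slots)
        # Same combined metric as the query: cpu_load + ram_load + network_load.
        row[slot] = (float(entry["cpu_load"])
                     + float(entry["ram_load"])
                     + float(entry["network_load"]))
    clusters = {}
    for (cluster_id, server_id), row in sorted(series.items()):
        ids, matrix = clusters.setdefault(cluster_id, ([], []))
        ids.append(server_id)
        matrix.append(row)
    return clusters

# The first four entries of the sample log in this topic.
logs = [
    {"cluster_id": "C001", "cpu_load": "0.1", "network_load": "0.6", "ram_load": "0.7",
     "server_id": "S001", "time_period": "2024-01-01 00:00:00"},
    {"cluster_id": "C001", "cpu_load": "0.2", "network_load": "0.5", "ram_load": "0.8",
     "server_id": "S002", "time_period": "2024-01-01 00:01:00"},
    {"cluster_id": "C001", "cpu_load": "0.1", "network_load": "0.6", "ram_load": "0.7",
     "server_id": "S001", "time_period": "2024-01-01 00:02:00"},
    {"cluster_id": "C001", "cpu_load": "0.2", "network_load": "0.5", "ram_load": "0.8",
     "server_id": "S002", "time_period": "2024-01-01 00:03:00"},
]

start = datetime(2024, 1, 1, 0, 0, 0)
end = datetime(2024, 1, 1, 0, 4, 0)  # shortened window, for illustration only
server_ids, matrix = build_load_matrix(logs, start, end)["C001"]
for server_id, row in zip(server_ids, matrix):
    print(server_id, row)
```

In this sketch, the matrix for cluster C001 has two rows (S001 and S002) and four columns, one per minute. Each server reports only every other minute in the sample log, so half of the slots in each row hold the fill value 0.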
Query and analysis results
balance indicates the load balancing degree. Valid values: (0,1]. A value that is closer to 1 indicates a high degree of load balancing. A value of 1 indicates full load balancing.
cluster_id | balance |
G1 | 0.5 |