You can use a correlation analysis function to quickly find, among multiple observed metrics in the system, the metrics that are correlated with a specified metric or with specified time series data.
Function list
| Function | Description |
| --- | --- |
| ts_association_analysis | Quickly finds the metrics that are correlated with a specified metric among multiple observed metrics in the system. |
| ts_similar | Quickly finds the metrics that are correlated with specified time series data among multiple observed metrics in the system. |
ts_association_analysis
Function format:
select ts_association_analysis(stamp, params, names, indexName, threshold)
The following table describes the parameters.
| Parameter | Description | Value |
| --- | --- | --- |
| stamp | The Unix timestamp. | Long type. |
| params | The dimensions of the metrics to be analyzed. | Array of the double type. For example, array[Latency, QPS, NetFlow]. |
| names | The names of the metrics to be analyzed. | Array of the varchar type. For example, array['Latency', 'QPS', 'NetFlow']. |
| indexName | The name of the target metric. | Varchar type. For example, 'Latency'. |
| threshold | The threshold of correlation between the metrics to be analyzed and the target metric. | Double type. Valid values: [0, 1]. |
Result:
- name: the name of the analyzed metric.
- score: the value of correlation between the analyzed metric and the target metric. Valid values: [0, 1].
Sample code:
* | select ts_association_analysis(
time,
array[inflow, outflow, latency, status],
array['inflow', 'outflow', 'latency', 'status'],
'latency',
0.1) from log;
Sample result:
| results |
| --------------------- |
| ['latency', '1.0'] |
| ['outflow', '0.6265'] |
| ['status', '0.2270'] |
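
To make the role of threshold concrete, the following Python sketch mimics the filtering step outside Log Service. This page does not state how ts_association_analysis computes its score, so the sketch assumes the absolute Pearson correlation as a stand-in, because it also falls in [0, 1]. The function names `pearson` and `associated_metrics` and the sample data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def associated_metrics(series, index_name, threshold):
    """Return (name, score) pairs whose score is >= threshold, highest first.

    Assumption: score = |Pearson correlation| with the target metric.
    """
    target = series[index_name]
    scored = [(name, abs(pearson(values, target))) for name, values in series.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Hypothetical data: one list of values per metric, aligned by time.
series = {
    "latency": [10.0, 12.0, 11.0, 15.0, 14.0],
    "outflow": [100.0, 130.0, 105.0, 160.0, 150.0],
    "status": [200.0, 200.0, 200.0, 200.0, 200.0],
}
result = associated_metrics(series, "latency", 0.1)
```

In this sketch the target metric always scores 1 against itself, and any metric that scores below the threshold is left out of the result, in the same way that the sample result above omits inflow.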
ts_similar
Function format 1:
select ts_similar(stamp, value, ts, ds)
select ts_similar(stamp, value, ts, ds, metricType)
The following table describes the parameters.
| Parameter | Description | Value |
| --- | --- | --- |
| stamp | The Unix timestamp. | Long type. |
| value | The value of the specified metric. | Double type. |
| ts | The sequence of time for the specified curve. | Array of the double type. |
| ds | The sequence of numeric data for the specified curve. | Array of the double type. |
| metricType | The type of correlation between the measured curves. | Varchar type. Valid values: SHAPE, RMSE, PEARSON, SPEARMAN, R2, and KENDALL. |
Function format 2:
select ts_similar(stamp, value, startStamp, endStamp, step, ds)
select ts_similar(stamp, value, startStamp, endStamp, step, ds, metricType)
The following table describes the parameters.
| Parameter | Description | Value |
| --- | --- | --- |
| stamp | The Unix timestamp. | Long type. |
| value | The value of the specified metric. | Double type. |
| startStamp | The start timestamp of the specified curve. | Long type. |
| endStamp | The end timestamp of the specified curve. | Long type. |
| step | The time interval between two adjacent points in the sequence of time. | Long type. |
| ds | The sequence of numeric data for the specified curve. | Array of the double type. |
| metricType | The type of correlation between the measured curves. | Varchar type. Valid values: SHAPE, RMSE, PEARSON, SPEARMAN, R2, and KENDALL. |
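
Format 2 describes the time axis of the specified curve with startStamp, endStamp, and step instead of an explicit ts array. The Python sketch below shows one way to read that relationship. It assumes that endStamp is inclusive, which is consistent with the sample query in this section, where six values cover 1560911040 through 1560911065 at a step of 5. The helper name `expand_time_axis` is hypothetical.

```python
def expand_time_axis(start_stamp, end_stamp, step):
    """List the timestamps from start_stamp to end_stamp (inclusive), spaced by step.

    Assumption: end_stamp is inclusive; the page does not state this explicitly.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(start_stamp, end_stamp + 1, step))

# Values taken from the sample query in this section.
ts = expand_time_axis(1560911040, 1560911065, 5)
ds = [5.1, 4.0, 3.3, 5.6, 4.0, 7.2]

# Pair each timestamp with its value to obtain the specified curve.
curve = list(zip(ts, ds))
```

Under this reading, ds needs one value per generated timestamp, so a mismatch between the two lengths is worth checking before you run a query.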
Result:
- score: the value of correlation between the analyzed metric and the specified curve. Valid values: [-1, 1].
Sample code:
* | select vhost, metric, ts_similar(time, value, 1560911040, 1560911065, 5, array[5.1,4.0,3.3,5.6,4.0,7.2], 'PEARSON') from log group by vhost, metric;
Sample result:
| vhost | metric | score |
| ------ | --------------- | -------------------- |
| vhost1 | redolog | -0.3519082537204182 |
| vhost1 | kv_qps | -0.15922168009772697 |
| vhost1 | file_meta_write | NaN |
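
The following Python sketch illustrates how a per-group PEARSON score can turn out as NaN. It assumes the standard Pearson formula, which is undefined when either series has zero variance. Whether that is the cause of the NaN for file_meta_write in the sample result is an assumption, not something this page states. The function name `pearson_or_nan` and the group data are hypothetical.

```python
import math

def pearson_or_nan(xs, ys):
    """Pearson correlation in [-1, 1]; NaN when either series has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return math.nan  # correlation is undefined for a constant series
    return cov / math.sqrt(vx * vy)

# Reference curve from the sample query in this section.
reference = [5.1, 4.0, 3.3, 5.6, 4.0, 7.2]

# Hypothetical per-group series, keyed the way "group by vhost, metric" groups rows.
groups = {
    ("vhost1", "rising"): [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    ("vhost1", "flat"): [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
}
scores = {key: pearson_or_nan(values, reference) for key, values in groups.items()}
```

If the assumption holds, a NaN score indicates a metric that did not vary over the compared window, rather than a metric that is uncorrelated with the curve.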