This topic describes the time series clustering functions. You can use these functions to cluster multiple time series and identify the distinct curve shapes among them. You can then quickly locate each cluster center and find curves whose shapes differ from those of the other curves in a cluster.
Function list
Function | Description |
ts_density_cluster | Uses a density-based clustering method to cluster multiple time series. |
ts_hierarchical_cluster | Uses a hierarchical clustering method to cluster multiple time series. |
ts_similar_instance | Queries curves that are similar to a specified curve. |
ts_density_cluster
Function format:
select ts_density_cluster(x, y, z)
The following table lists the parameters of the function.
Parameter | Description | Value |
x | The time sequence. The points in time along the horizontal axis are sorted in ascending order. | Each point in time is a Unix timestamp. Unit: seconds. |
y | The sequence of numeric data corresponding to a specified point in time. | N/A |
z | The name of the curve corresponding to the data at a specified point in time. | The value is of the string type. Example: machine01.cpu_usr. |
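The three parameters are parallel columns: each row supplies one point in time (x), one value (y), and the name of the curve that the point belongs to (z). The following Python sketch shows one way to picture how such rows form per-curve sequences. It only illustrates the data shape and is not part of the service. The helper name `rows_to_curves` and the sample values are assumptions made for this example.

```python
from collections import defaultdict

def rows_to_curves(rows):
    """Group (x, y, z) rows into one (timestamps, values) pair per curve name."""
    curves = defaultdict(lambda: ([], []))
    # Sort by curve name, then by timestamp, so each curve's points ascend in time.
    for x, y, z in sorted(rows, key=lambda r: (r[2], r[0])):
        timestamps, values = curves[z]
        timestamps.append(x)
        values.append(y)
    return dict(curves)

# Hypothetical sample rows: (Unix timestamp in seconds, value, curve name).
rows = [
    (1700000400, 0.42, "machine01.cpu_usr"),
    (1699999800, 0.40, "machine01.cpu_usr"),
    (1699999800, 0.91, "machine02.cpu_usr"),
]
curves = rows_to_curves(rows)
print(curves["machine01.cpu_usr"])
```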
Example
The query statement is as follows:
* and (h: "machine_01" OR h: "machine_02" OR h: "machine_03") | select ts_density_cluster(stamp, metric_value, metric_name) from ( select ("__time__" - ("__time__" % 600)) as stamp, avg(v) as metric_value, h as metric_name from log GROUP BY stamp, metric_name ORDER BY metric_name, stamp )
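The subquery aligns every log timestamp to the start of its 600-second window by subtracting the remainder of the timestamp divided by 600, and then averages the values in each window per host. The following Python sketch mirrors that arithmetic so the effect is easier to see. The function names and the sample data are assumptions made for this example. The service runs the query itself, not this code.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # same window size as the "% 600" in the query

def bucket_start(ts, window=WINDOW_SECONDS):
    """Align a Unix timestamp to the start of its window: ts - ts % window."""
    return ts - ts % window

def average_per_bucket(samples, window=WINDOW_SECONDS):
    """Average values per (host, window), ordered by host and then window start."""
    sums = defaultdict(lambda: [0.0, 0])  # (host, stamp) -> [total, count]
    for ts, value, host in samples:
        acc = sums[(host, bucket_start(ts, window))]
        acc[0] += value
        acc[1] += 1
    return sorted(
        (host, stamp, total / count)
        for (host, stamp), (total, count) in sums.items()
    )

# Hypothetical samples: (timestamp, value, host).
samples = [
    (1200, 10.0, "machine_01"),
    (1799, 20.0, "machine_01"),
    (1800, 30.0, "machine_01"),
]
result = average_per_bucket(samples)
print(result)
```

The timestamps 1200 and 1799 fall into the same window, so their values are averaged. The timestamp 1800 starts a new window.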
Output result
The following table lists the display items.
Display item | Description |
cluster_id | The ID of the cluster. A value of -1 indicates that the curves are not assigned to any cluster center. |
rate | The proportion of instances in the cluster. |
time_series | The timestamp sequence of the cluster center. |
data_series | The data sequence of the cluster center. |
instance_names | The set of instances that belong to the cluster. |
sim_instance | The name of an instance in the cluster. |
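When you post-process the result, a common first step is to separate the rows with a cluster_id of -1 from the rest and to order the remaining clusters by rate. The following Python sketch assumes that each result row has already been loaded into a dictionary keyed by display item and that instance_names is a list. Both are assumptions made for this example, as are the helper name `split_by_cluster` and the sample rows. The actual result encoding may differ.

```python
def split_by_cluster(rows):
    """Separate clustered rows from rows whose cluster_id is -1."""
    clustered = sorted(
        (r for r in rows if r["cluster_id"] != -1),
        key=lambda r: r["rate"],
        reverse=True,  # largest clusters first
    )
    uncategorized = [r for r in rows if r["cluster_id"] == -1]
    return clustered, uncategorized

# Hypothetical result rows; only three of the display items are shown.
rows = [
    {"cluster_id": 0, "rate": 0.25, "instance_names": ["machine_03"]},
    {"cluster_id": -1, "rate": 0.25, "instance_names": ["machine_04"]},
    {"cluster_id": 1, "rate": 0.5, "instance_names": ["machine_01", "machine_02"]},
]
clustered, uncategorized = split_by_cluster(rows)
print([r["cluster_id"] for r in clustered])
```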
ts_hierarchical_cluster
Function format:
select ts_hierarchical_cluster(x, y, z)
The following table lists the parameters of the function.
Parameter | Description | Value |
x | The time sequence. The points in time along the horizontal axis are sorted in ascending order. | Each point in time is a Unix timestamp. Unit: seconds. |
y | The sequence of numeric data corresponding to a specified point in time. | N/A |
z | The name of the curve corresponding to the data at a specified point in time. | The value is of the string type. Example: machine01.cpu_usr. |
Example
The query statement is as follows:
* and (h: "machine_01" OR h: "machine_02" OR h: "machine_03") | select ts_hierarchical_cluster(stamp, metric_value, metric_name) from ( select ("__time__" - ("__time__" % 600)) as stamp, avg(v) as metric_value, h as metric_name from log GROUP BY stamp, metric_name ORDER BY metric_name, stamp )
Output result
The following table lists the display items.
Display item | Description |
cluster_id | The ID of the cluster. A value of -1 indicates that the curves are not assigned to any cluster center. |
rate | The proportion of instances in the cluster. |
time_series | The timestamp sequence of the cluster center. |
data_series | The data sequence of the cluster center. |
instance_names | The set of instances that belong to the cluster. |
sim_instance | The name of an instance in the cluster. |
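For intuition about what a hierarchical clustering method does with curves, the following Python sketch implements a generic single-linkage agglomerative procedure: it repeatedly merges the two closest groups of curves until the closest pair is farther apart than a threshold. This is a textbook illustration, not the algorithm that ts_hierarchical_cluster uses internally. The Euclidean distance, the threshold parameter, and all names in the code are assumptions made for this example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long value sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def agglomerative(curves, threshold):
    """Single-linkage merging: join the closest clusters while within threshold."""
    clusters = [[name] for name in sorted(curves)]

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(euclidean(curves[a], curves[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break  # the remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical curves: name -> values sampled at the same points in time.
curves = {"a": [1, 1, 1], "b": [1, 1, 2], "c": [10, 10, 10]}
groups = agglomerative(curves, threshold=2.0)
print(groups)
```

Curves a and b differ by a single unit at one point, so they merge into one cluster, while c stays on its own.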
ts_similar_instance
Function format:
select ts_similar_instance(x, y, z, instance_name, topK, metricType)
The following table lists the parameters of the function.
Parameter | Description | Value |
x | The time sequence. The points in time along the horizontal axis are sorted in ascending order. | Each point in time is a Unix timestamp. Unit: seconds. |
y | The sequence of numeric data corresponding to a specified point in time. | N/A |
z | The name of the curve corresponding to the data at a specified point in time. | The value is of the string type. Example: machine01.cpu_usr. |
instance_name | The name of the curve to query. The curve must be in the z collection. | The value is of the string type. Example: machine01.cpu_usr. Note: The curve to be queried must already exist. |
topK | The maximum number of curves to return that are similar to the specified curve. | N/A |
metricType | | N/A |
Example
The query statement is as follows:
* and m: NET and m: Tcp and (h: "nu4e01524.nu8" OR h: "nu2i10267.nu8" OR h: "nu4q10466.nu8") | select ts_similar_instance(stamp, metric_value, metric_name, 'nu4e01524.nu8') from ( select ("__time__" - ("__time__" % 600)) as stamp, sum(v) as metric_value, h as metric_name from log GROUP BY stamp, metric_name ORDER BY metric_name, stamp )
Output result
The following table lists the display items.
Display item | Description |
instance_name | The list of metrics that are similar to the specified metric. |
time_series | The timestamp sequence of the cluster center. |
data_series | The data sequence of the cluster center. |
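To make the idea of querying the top K similar curves concrete, the following Python sketch ranks curves by their Euclidean distance to a named curve and keeps the K closest. It is a conceptual illustration only. The source does not state which similarity measures ts_similar_instance supports, so the Euclidean distance, the helper names, and the sample values are all assumptions made for this example.

```python
import heapq
import math

def euclidean(a, b):
    """Euclidean distance between two equally long value sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def top_k_similar(curves, instance_name, k):
    """Return the names of up to k curves closest to the named curve."""
    if instance_name not in curves:
        # The curve to be queried must already exist.
        raise KeyError(f"unknown curve: {instance_name}")
    target = curves[instance_name]
    distances = [
        (euclidean(values, target), name)
        for name, values in curves.items()
        if name != instance_name
    ]
    return [name for _, name in heapq.nsmallest(k, distances)]

# Hypothetical curves: name -> values sampled at the same points in time.
curves = {
    "nu4e01524.nu8": [1, 2, 3],
    "nu2i10267.nu8": [1, 2, 4],
    "nu4q10466.nu8": [9, 9, 9],
}
nearest = top_k_similar(curves, "nu4e01524.nu8", k=1)
print(nearest)
```

If k is larger than the number of other curves, the sketch returns all of them, which matches the idea that at most K curves are returned.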