You can perform aggregation operations to obtain the minimum value, maximum value, sum, average value, and count and distinct count of rows. You can also perform aggregation operations to group results by field value, range, geographical location, filter, histogram, or date histogram, and perform nested queries. You can perform multiple aggregation operations for complex queries.
Background information
The following table describes the aggregation methods.
Method | Description |
Minimum value | The aggregation method that can be used to return the minimum value of a field. This method can be used in a similar manner as the SQL MIN function. |
Maximum value | The aggregation method that can be used to return the maximum value of a field. This method can be used in a similar manner as the SQL MAX function. |
Sum | The aggregation method that can be used to return the sum of all values for a numeric field. This method can be used in a similar manner as the SQL SUM function. |
Average value | The aggregation method that can be used to return the average of all values for a numeric field. This method can be used in a similar manner as the SQL AVG function. |
Count | The aggregation method that can be used to return the total number of values for a specific field or the total number of rows in a search index. This method can be used in a similar manner as the SQL COUNT function. |
Distinct count | The aggregation method that can be used to return the number of distinct values for a field. This method can be used in a similar way as the SQL COUNT(DISTINCT) function. |
Group by field value | The aggregation method that can be used to group query results based on field values. The values that are the same are grouped together. The identical value of each group and the number of identical values in each group are returned. Note The calculated number may be different from the actual number if the number of values in a group is very large. |
Group by multiple fields | The aggregation method that can be used to group query results based on multiple fields. You can use tokens to perform paging. |
Group by range | The aggregation method that can be used to group query results based on the value ranges of a field. Field values that are within a specified range are grouped together. The number of values in each range is returned. |
Group by geographical location | The aggregation method that can be used to group query results based on distances from geographical locations to a central point. Query results in distances that are within a specified range are grouped together. The number of items in each range is returned. |
Group by filter | The aggregation method that can be used to filter the query results and group them based on each filter to obtain the number of matching results. Results are returned in the order in which the filters are specified. |
Query by histogram | The aggregation method that can be used to group query results based on specific data intervals. Field values that are within the same range are grouped together. The value range of each group and the number of values in each group are returned. |
Query by date histogram | The aggregation method that can be used to group query results based on specific date intervals. Field values that are within the same range are grouped together. The value range of each group and the number of values in each group are returned. |
Nesting | GroupBy supports nesting. You can perform sub-aggregation operations by using GroupBy. |
Multiple aggregations | You can perform multiple aggregation operations. Note If you perform multiple complex aggregation operations at the same time, a long period of time may be required. |
Prerequisites
An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.
A data table is created and data is written to the data table. For more information, see Create a data table and Write data.
A search index is created for the data table. For more information, see Create search indexes.
Minimum value
The aggregation method that can be used to return the minimum value of a field. This method can be used in a similar manner as the SQL MIN function.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, and DATE data types are supported.
Missing
The default value for the field on which the aggregation operation is performed in a row in which the field value is empty.
If you do not specify a value for Missing, the row is ignored.
If you specify a value for Missing, the value of this parameter is used as the field value of the row.
Example
/** * The price of each product is listed in the product table. Query the minimum price of the products that are produced in Zhejiang. * Equivalent SQL statement: SELECT min(column_price) FROM product where place_of_production="Zhejiang"; */ func min(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.TermQuery{"place_of_production", "Zhejiang"}). SetLimit(0). // If you want to obtain only the aggregation results instead of specific data, you can set limit to 0 to improve query performance. Aggregation(search.NewMinAggregation("min_agg_1", "column_price").Missing(0.00))) searchResponse, err := client.Search(searchRequest) // Execute the statement. aggResults := searchResponse.AggregationResults // Obtain the aggregation results. agg1, err := aggResults.Min("min_agg_1") // Obtain the results of the aggregation operation named min_agg_1. if err != nil { panic(err) } if agg1.HasValue() { // Check whether the results of the aggregation operation named min_agg_1 contain values. fmt.Println(agg1.Value) // Display the aggregation results. } }
Maximum value
The aggregation method that can be used to return the maximum value of a field. This method can be used in a similar manner as the SQL MAX function.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, and DATE data types are supported.
Missing
The default value for the field on which the aggregation operation is performed in a row in which the field value is empty.
If you do not specify a value for Missing, the row is ignored.
If you specify a value for Missing, the value of this parameter is used as the field value of the row.
Example
/** * The price of each product is listed in the product table. Query the maximum price of the products that are produced in Zhejiang. * Equivalent SQL statement: SELECT max(column_price) FROM product where place_of_production = "Zhejiang Province". */ func max(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.TermQuery{"place_of_production", "Zhejiang"}). SetLimit(0). // If you want to obtain only the aggregation results instead of specific data, you can set limit to 0 to improve query performance. Aggregation(search.NewMaxAggregation("max_agg_1", "column_price").Missing(0.00))) searchResponse, err := client.Search(searchRequest) // Execute the statement. aggResults := searchResponse.AggregationResults // Obtain the aggregation results. agg1, err := aggResults.Max("max_agg_1") // Obtain the results of the aggregation operation named max_agg_1. if err != nil { panic(err) } if agg1.HasValue() { // Check whether the results of the aggregation operation named max_agg_1 contain values. fmt.Println(agg1.Value) // Display the aggregation results. } }
Sum
The aggregation method that can be used to return the sum of all values for a numeric field. This method can be used in a similar manner as the SQL SUM function.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used to perform the aggregation operation. Only the LONG and DOUBLE data types are supported.
Missing
The default value for the field on which the aggregation operation is performed, which is applied to rows in which the field value is empty.
If you do not specify a value for Missing, the row is ignored.
If you specify a value for Missing, the value of this parameter is used as the field value of the row.
Example
/** * The sales of each product are listed in the product table. Query the total number of the sold products that are produced in Zhejiang. If the sales of a product are empty, 10 is used as the default value. * Equivalent SQL statement: SELECT sum(column_price) FROM product where place_of_production = "Zhejiang". */ func sum(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.TermQuery{"place_of_production", "Zhejiang"}). SetLimit(0). // If you want to obtain only the aggregation results instead of specific data, you can set limit to 0 to improve query performance. Aggregation(search.NewSumAggregation("sum_agg_1", "column_price").Missing(0.00))) searchResponse, err := client.Search(searchRequest) // Execute the statement. aggResults := searchResponse.AggregationResults // Obtain the aggregation results. agg1, err := aggResults.Sum("sum_agg_1") // Obtain the results of the aggregation operation named sum_agg_1. if err != nil { panic(err) } fmt.Println(agg1.Value) // Display the aggregation results. }
Average value
The aggregation method that can be used to return the average of all values for a numeric field. This method can be used in a similar manner as the SQL AVG function.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used to perform the aggregation operation. Only the LONG, DOUBLE, and DATE data types are supported.
Missing
The default value for the field on which the aggregation operation is performed in a row in which the field value is empty.
If you do not specify a value for Missing, the row is ignored.
If you specify a value for Missing, the value of this parameter is used as the field value of the row.
Example
/** * The sales of each product are listed in the product table. Query the average price of the products that are produced in Zhejiang. * EquivalentSQL statement: SELECT avg(column_price) FROM product where place_of_production = "Zhejiang Province". */ func avg(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.TermQuery{"place_of_production", "Zhejiang"}). SetLimit(0). // If you want to obtain only the aggregation results instead of specific data, you can set limit to 0 to improve query performance. Aggregation(search.NewAvgAggregation("avg_agg_1", "column_price").Missing(0.00))) searchResponse, err := client.Search(searchRequest) // Execute the statement. aggResults := searchResponse.AggregationResults // Obtain the aggregation results. agg1, err := aggResults.Avg("avg_agg_1") // Obtain the results of the aggregation operation named avg_agg_1. if err != nil { panic(err) } if agg1.HasValue() { // Check whether the results of the aggregation operation named agg1 contain values. fmt.Println(agg1.Value) // Display the aggregation results. } }
Count
The aggregation method that can be used to return the total number of values for a specific field or the total number of rows in a search index. This method can be used in a similar manner as the SQL COUNT function.
You can use the following methods to query the total number of rows in a search index or the total number of rows that meet the query conditions:
Use the count feature of aggregation and specify count(*) in the request.
Use the query feature to obtain the number of rows that meet the query conditions. Set setGetTotalCount to true in the query. Use MatchAllQuery to obtain the total number of rows in a search index.
You can use the name of a column as the value of the count expression to query the number of rows that contain the column in a search index. This method is suitable for scenarios that involve sparse columns.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used to perform the aggregation operation. Only the following data types are supported: LONG, DOUBLE, BOOLEAN, KEYWORD, GEO_POINT, and DATE.
Example
/** * Penalty records of merchants are recorded in the merchant table. You can query the number of merchants who are located in Zhejiang and for whom penalty records exist. If no penalty record exists for a merchant, the field that corresponds to penalty records also does not exist for the merchant. * EquivalentSQL statement: SELECT count(column_history) FROM product where place_of_production = "Zhejiang Province". */ func count(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.TermQuery{"place_of_production", "Zhejiang"}). SetLimit(0). // If you want to obtain only the aggregation results instead of specific data, you can set limit to 0 to improve query performance. Aggregation(search.NewCountAggregation("count_agg_1", "column_price"))) searchResponse, err := client.Search(searchRequest) // Execute the statement. aggResults := searchResponse.AggregationResults // Obtain the aggregation results. agg1, err := aggResults.Count("count_agg_1") // Obtain the results of the aggregation operation named count_agg_1. if err != nil { panic(err) } fmt.Println(agg1.Value) // Display the aggregation results. }
Distinct count
The aggregation method that can be used to return the number of distinct values for a field. This method can be used in a similar way as the SQL COUNT(DISTINCT) function.
The number of distinct values is an approximate number.
If the total number of rows before the distinct count feature is used is less than 10,000, the calculated result is close to the exact value.
If the total number of rows before the distinct count feature is used is greater than or equal to 100 million, the error rate is approximately 2%.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used to perform the aggregation operation. Only the following data types are supported: LONG, DOUBLE, BOOLEAN, KEYWORD, GEO_POINT, and DATE.
Missing
The default value for the field on which the aggregation operation is performed, which is applied to rows in which the field value is empty.
If you do not specify a value for Missing, the row is ignored.
If you specify a value for Missing, the value of this parameter is used as the field value of the row.
Example
/** * Query the number of distinct provinces from which the products are produced. * EquivalentSQL statement: SELECT count(distinct column_place) FROM product. */ func distinctCount(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.TermQuery{"place_of_production", "Zhejiang"}). SetLimit(0). // If you want to obtain only the aggregation results instead of specific data, you can set limit to 0 to improve query performance. Aggregation(search.NewDistinctCountAggregation("distinct_count_agg_1", "column_price").Missing(0.00))) searchResponse, err := client.Search(searchRequest) // Execute the statement. aggResults := searchResponse.AggregationResults // Obtain the aggregation results. agg1, err := aggResults.DistinctCount("distinct_count_agg_1") // Obtain the results of the aggregation operation named distinct_count_agg_1. if err != nil { panic(err) } fmt.Println(agg1.Value) // Display the aggregation results. }
Group by field value
The aggregation method that can be used to group query results based on field values. The values that are the same are grouped together. The identical value of each group and the number of identical values in each group are returned.
The calculated number may be different from the actual number if the number of values in a group is very large.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used to perform the aggregation operation. Only the following data types are supported: LONG, DOUBLE, BOOLEAN, KEYWORD, and DATE.
Size
The number of groups that you want to return. Default value: 10. Maximum value: 2000. If the number of groups exceeds 2,000, only the first 2,000 groups are returned.
GroupBySorters
The sorting rules for items in a group. By default, group items are sorted in descending order. If you configure multiple sorting rules, data is sorted based on the order in which the rules are configured. The following sorting rules are supported:
Sort by value in alphabetical order.
Sort by value in reverse alphabetical order
Sort by row count in ascending order
Sort by row count in descending order
Sort by the values that are obtained from sub-aggregation results in ascending order
Sort by the values that are obtained from sub-aggregation results in descending order.
SubAggregation and SubGroupBy
The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.
Scenario
Query the number of products in each category, and the maximum and minimum product prices in each category.
Method
Group query results by product category to obtain the number of products in each category. Then, perform two sub-aggregation operations to obtain the maximum and minimum product prices in each category.
Sample results
Fruits: 5. The maximum value of the price is 15. The minimum value of the price is 3.
Toiletries: 10. The maximum value of the price is 98. The minimum value of the price is 1.
Electronic devices: 3. The maximum value of the price is 8,699. The minimum value of the price is 2,300.
Other products: 15. The maximum value of the price is 1,000. The minimum value of the price is 80.
Example
/** * Query the number of products, and the maximum and minimum product prices in each category. * Example of returned results: Fruits: 5. The maximum value of the price is 15, and the minimum value of the price is 3. Toiletries: 10. The maximum value of the price is 98, and the minimum value of the price is 1. Electronic devices: 3. The maximum value of the price is 8699, and the minimum value of the price is 2300. * Other products: 15. The maximum value of the price is 1000, and the minimum value of the price is 80. */ func GroupByField(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). GroupBy(search.NewGroupByField("group1", "column_type"). // Group products by category. SubAggregation(search.NewMinAggregation("min_price", "column_price")). // Query the minimum prices in each category. SubAggregation(search.NewMaxAggregation("max_price", "column_price")))) // Query the maximum prices in each category. searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } groupByResults := searchResponse.GroupByResults // Obtain the aggregation results. group, err := groupByResults.GroupByField("group1") if err != nil { fmt.Printf("%#v", err) return } for _, item := range group.Items { // Traverse the groups that are returned. // Display the value of each group and the number of rows in each group. fmt.Println("\tkey: ", item.Key, ", rowCount: ", item.RowCount) // Display the lowest prices. minPrice, _ := item.SubAggregations.Min("min_price") if minPrice.HasValue() { fmt.Println("\t\tmin_price: ", minPrice.Value) } // Display the highest prices. maxPrice, _ := item.SubAggregations.Max("max_price") if maxPrice.HasValue() { fmt.Println("\t\tmax_price: ", maxPrice.Value) } } }
Group by multiple fields
The aggregation method that can be used to group query results based on multiple fields. You can use tokens to perform paging.
If you want to group query results based on multiple fields, you can use the groupBy parameter in nested mode or use the GroupByComposite parameter. For information about the difference between the groupBy and GroupByComposite parameters, see Appendix: different methods for multi-field grouping.
Parameters
Parameter
Description
GroupByName
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
SourceGroupByList
The fields by which you want to group query results. You can group query results by up to 32 fields and perform aggregation operations on the resulting groups. The following group types are supported:
ImportantFor group types specified in the SourceGroupByList parameter, you can set the GroupBySorter parameter to only groupKeySort.
By default, the groups are sorted in descending order.
If a field value in a column does not exist, the system returns NULL.
For the GroupByField type, you can configure only the following parameters: GroupByName, FieldName, and GroupBySorter.
For the GroupByHistogram type, you can configure only the following parameters: GroupByName, FieldName, Interval, and GroupBySorter.
For the GroupByDateHistogram type, you can configure only the following parameters: GroupByName, FieldName, Interval, TimeZone, and GroupBySorter.
NextToken
The pagination token that is used in the next request to retrieve a new page of groups. You can obtain the value of the NextToken parameter from the output of the GroupByCompositeResult operation. The NextToken parameter allows you to obtain all the grouping results.
Size
The number of groups per page. If the number of groups that meet the requirements exceeds the value of the Size parameter, you must use the NextToken parameter to obtain the groups on the next page.
ImportantIf you want to limit the number of groups to return, you cannot configure the Size and SuggestedSize parameters at the same time. In most cases, we recommend that you configure the Size parameter.
In scenarios where you want to interconnect Tablestore with a high-throughput computing engine such as Apache Spark or PrestoSQL, we recommend that you configure the SuggestedSize parameter.
SuggestedSize
The expected number of groups per page. You can set a value greater than the maximum number of groups allowed at the server side or -1. The server side returns the actual number of groups based on its capacity. This parameter is applicable in scenarios where you interconnect Tablestore with a high-throughput computing engine, such as Apache Spark or PrestoSQL.
If you set this parameter to a value that is greater than the maximum number of groups allowed at the server side, the system adjusts the value to the maximum number of groups allowed at the server side. The actual number of groups that are returned equals
min(suggestedSize, maximum number of groups allowed in the server side, total number of groups)
.SubAggList and SubGroupByList
The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.
ImportantThe GroupByComposite type is not supported in the SubGroupByList parameter.
Example
/** * Group and aggregate query results: Group the query results and perform the aggregation operation based on the parameters such as groupbyField, groupByHistogram, and groupByDataHistogram that are passed to the SourceGroupBy parameter. * Return the aggregated results of multiple fields in a flat structure. */ func groupByComposite(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. request.setTableName(tableName); // Specify the name of the data table. request.setIndexName(indexName); // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). GroupBy(search.NewGroupByComposite("groupByComposite"). SourceGroupBys(search.NewGroupByField("groupByField_1", "Col_Long"), search.NewGroupByField("groupByField_2", "Col_Keyword"), search.NewGroupByField("groupByField_3", "Col_Double")). SetSize(10))) searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } groupByResults := searchResponse.GroupByResults // Obtain the aggregate results. group, err := groupByResults.GroupByComposite("group1") if err != nil { fmt.Printf("%#v", err) return } fmt.Println("groupName: ", group.Name) for _, item := range group.Items { // Traverse the groups that are returned. // Display the value of each group and the number of rows in each group. fmt.Printf("\tkey:") if item.Keys != nil { for i := 0; i < len(item.Keys); i++ { fmt.Printf(" %v ", *item.Keys[i]) } fmt.Printf("\trowCount:%v\n", item.RowCount) } } }
Group by range
The aggregation method that can be used to group query results based on the value ranges of a field. Field values that are within a specified range are grouped together. The number of values in each range is returned.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used to perform the aggregation operation. Only the LONG and DOUBLE data types are supported.
Range(fromInclusive float64, toExclusive float64)
The value ranges that are used for grouping.
The range can start from NegInf and end with Inf.
SubAggregation and SubGroupBy
The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.
For example, after you group query results by sales and by province, you can obtain the province that has the largest proportion of sales in a specified range. To do this, you need to specify GroupByField in GroupByRange.
Example
/** * Group sales volumes based on ranges [NegInf, 1000), [1000, 5000), and [5000, Inf) to obtain the sales volume in each range. */ func GroupByRange(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). GroupBy(search.NewGroupByRange("group1", "column_number"). Range(search.NegInf, 1000). Range(1000, 5000). Range(5000, search.Inf))) searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } groupByResults := searchResponse.GroupByResults // Obtain the aggregation results. group, err := groupByResults.GroupByRange("group1") if err != nil { fmt.Printf("%#v", err) return } for _, item := range group.Items { // Traverse the groups that are returned. fmt.Println("\t[", item.From, ", ", item.To, "), rowCount: ", item.RowCount) // Display the number of rows in each group. } }
Group by geographical location
The aggregation method that can be used to group query results based on distances from geographical locations to a central point. Query results in distances that are within a specified range are grouped together. The number of items in each range is returned.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
FieldName
The name of the field that is used for the aggregation operation. Only the GEO_POINT data type is supported.
CenterPoint(latitude float64, longitude float64)
The longitude and latitude of the central point.
latitude specifies the latitude of the central point. longitude specifies the longitude of the central point.
Range(fromInclusive float64, toExclusive float64)
The distance ranges that are used for grouping. Unit: meters.
The range can start from NegInf and end with Inf.
SubAggregation and SubGroupBy
The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.
Example
/** * Group users based on the distance from their geographical locations to a Wanda Plaza to obtain the number of users in each distance range. The distance ranges are [NegInf, 1000), [1000, 5000), and [5000, Inf). Unit: meters. */ func GroupByGeoDistance(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). GroupBy(search.NewGroupByGeoDistance("group1", "Col_GeoPoint", search.GeoPoint{Lat: 30.137817, Lon:120.08681}). Range(search.NegInf, 1000). Range(1000, 5000). Range(5000, search.Inf))) searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } groupByResults := searchResponse.GroupByResults // Obtain the aggregation results. group, err := groupByResults.GroupByGeoDistance("group1") if err != nil { fmt.Printf("%#v", err) return } for _, item := range group.Items { // Traverse the groups that are returned. fmt.Println("\t[", item.From, ", ", item.To, "), rowCount: ", item.RowCount) // Display the number of rows in each group. } }
Group by filter
The aggregation method that can be used to filter the query results and group them based on each filter to obtain the number of matching results. Results are returned in the order in which the filters are specified.
Parameters
Parameter
Description
Name
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
Query
The filters that can be used for the query. Results are returned in the order in which the filter conditions are specified.
SubAggregation and SubGroupBy
The sub-aggregation operation. You can perform the sub-aggregation operation based on the grouping results.
Example
/** * Specify the following filters to obtain the number of items that match each filter: The sales exceed 100, the place of origin is Zhejiang, and the description contains Hangzhou. */ func GroupByFilter(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). GroupBy(search.NewGroupByFilter("group1"). Query(&search.RangeQuery{ FieldName: "number", From: 100, IncludeLower: true}). Query(&search.TermQuery{ FieldName: "place", Term: "Zhejiang", }). Query(&search.MatchQuery{ FieldName: "description", Text: "Hangzhou", }))) searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } groupByResults := searchResponse.GroupByResults // Obtain the aggregation results. group, err := groupByResults.GroupByFilter("group1") if err != nil { fmt.Printf("%#v", err) return } for _, item := range group.Items { // Traverse the groups that are returned. fmt.Println("\trowCount: ", item.RowCount) // Display the number of rows in each group. } }
Group by histogram
The aggregation method that can be used to group query results based on specific data intervals. Field values that are within the same range are grouped together. The value range of each group and the number of values in each group are returned.
Parameters
Parameter
Description
GroupByName
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
Field
The name of the field that is used to perform the aggregation operation. Only the LONG and DOUBLE data types are supported.
Interval
The date interval that is used to obtain aggregation results.
FieldRange[min,max]
The range that is used together with the Interval parameter to limit the number of groups. The value that is calculated by using the
(FieldRange.max-FieldRange.min)/interval
formula cannot exceed 2000.MinDocCount
The minimum number of rows. If the number of rows in a group is less than the minimum number of rows, the aggregation results for the group are not returned.
Missing
The default value for the field on which the aggregation operation is performed, which is applied to rows in which the field value is empty.
If you do not specify a value for Missing, the row is ignored.
If you specify a value for Missing, the value of this parameter is used as the field value of the row.
Example
func GroupByHistogram(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. request.setTableName(tableName); // Specify the name of the data table. request.setIndexName(indexName); // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). GroupBy(search.NewGroupByHistogram("group1", "field_name"). SetMinDocCount(1). SetFiledRange(1, 100). SetMissing(3). SetInterval(10))) searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } groupByResults := searchResponse.GroupByResults // Obtain the aggregate results. group, err := groupByResults.GroupByHistogram("group1") if err != nil { fmt.Printf("%#v", err) return } for _, item := range group.Items { // Traverse the groups that are returned. fmt.Println("key:", item.Key.Value, ", rowCount:", item.Value) } }
Group by date histogram
The aggregation method that can be used to group query results based on specific date intervals. Field values that are within the same range are grouped together. The value range of each group and the number of values in each group are returned.
Parameters
Parameter
Description
GroupByName
The unique name of the aggregation operation. You can query the results of a specific aggregation operation based on this name.
Field
The name of the field that is used to perform the aggregation operation. Only the DATE data type is supported.
Interval
The date interval that is used to obtain aggregation results.
FieldRange[min,max]
The range that is used together with the Interval parameter to limit the number of groups. The value that is calculated by using the
(FieldRange.max-FieldRange.min)/interval
formula cannot exceed 2000.MinDocCount
The minimum number of rows. If the number of rows in a group is less than the minimum number of rows, the aggregation results for the group are not returned.
Missing
The default value for the field on which the aggregation operation is performed, which is applied to rows in which the field value is empty.
If you do not specify a value for Missing, the row is ignored.
If you specify a value for Missing, the value of this parameter is used as the field value of the row.
TimeZone
The time zone in the
+hh:mm
or-hh:mm
format, such as+08:00
or-09:00
. This parameter is required only when the field is of the DATE data type.If you do not specify this parameter for fields of the DATE data type, an offset of N hours may occur in the aggregation results. You can specify this parameter to resolve this error.
Example
func GroupByDateHistogram(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. request.setTableName(tableName); // Specify the name of the data table. request.setIndexName(indexName); // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). GroupBy(search.NewGroupByDateHistogram("date_group", "date_field_name"). SetMinDocCount(1). SetFiledRange("2017-05-01 10:00", "2017-05-21 13:00:00"). SetMissing("2017-05-01 13:01:00"). SetInterval(model.DateTimeValue{ Value: proto.Int32(1), Unit: model.DateTimeUnit_DAY.Enum(), }))) searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } groupByResults := searchResponse.GroupByResults // Obtain the aggregate results. group, err := groupByResults.GroupByDateHistogram("date_group") if err != nil { fmt.Printf("%#v", err) return } for _, item := range group.Items { // Traverse the groups that are returned. fmt.Printf("millisecondTimestamp:%d , rowCount:%d \n", item.Timestamp, item.RowCount) } }
Nesting
GroupBy supports nesting. You can perform sub-aggregation operations by using GroupBy.
You can use nesting to perform sub-aggregation operations in a group. The following examples show aggregation operations with two levels of nesting.
GroupBy + SubGroupBy: Items are grouped by province and by city to obtain data for each city in each province.
GroupBy + SubAggregation: Items are grouped by province to obtain the maximum value of a metric for each province.
To ensure high performance of complex GroupBy operations, you can specify only a small number of levels for nesting. For more information, see Search index limits.
Example
/**
* Perform nesting-based aggregation.
* Two aggregations and one GroupByRange attribute are specified in the outermost level of GroupByField.
*/
func NestedSample(client *tablestore.TableStoreClient, tableName string, indexName string) {
searchRequest := &tablestore.SearchRequest{}
searchRequest.
SetTableName(tableName). // Specify the name of the data table.
SetIndexName(indexName). // Specify the name of the search index.
SetSearchQuery(search.NewSearchQuery().
SetQuery(&search.MatchAllQuery{}). // Match all rows.
SetLimit(0).
GroupBy(search.NewGroupByField("group1", "field1").
SubAggregation(search.NewMinAggregation("sub_agg1", "sub_field1")).
SubAggregation(search.NewMaxAggregation("sub_agg2", "sub_field2")).
SubGroupBy(search.NewGroupByRange("sub_group1", "sub_field3").
Range(search.NegInf, 3).
Range(3, 5).
Range(5, search.Inf))))
searchResponse, err := client.Search(searchRequest)
if err != nil {
fmt.Printf("%#v", err)
return
}
groupByResults := searchResponse.GroupByResults // Obtain the aggregation results.
group, err := groupByResults.GroupByField("group1")
if err != nil {
fmt.Printf("%#v", err)
return
}
for _, item := range group.Items { // Traverse the groups that are returned.
// Display the value of each group and the number of rows in each group.
fmt.Println("\tkey: ", item.Key, ", rowCount: ", item.RowCount)
// Obtain the results of the aggregation operation named sub_agg1.
subAgg1, _ := item.SubAggregations.Min("sub_agg1")
if subAgg1.HasValue() {
fmt.Println("\t\tsub_agg1: ", subAgg1.Value)
}
// Obtain the results of the aggregation operation named sub_agg2.
subAgg2, _ := item.SubAggregations.Max("sub_agg2")
if subAgg2.HasValue() {
fmt.Println("\t\tsub_agg2: ", subAgg2.Value)
}
// Obtain the results of the aggregation operation named sub_group1.
subGroup, _ := item.SubGroupBys.GroupByRange("sub_group1")
for _, item := range subGroup.Items {
fmt.Println("\t\t[", item.From, ", ", item.To, "), rowCount: ", item.RowCount)
}
}
}
Multiple aggregations
You can perform multiple aggregation operations.
If you perform multiple complex aggregation operations at the same time, a long period of time may be required.
Example 1
func MultipleAggregations(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). Aggregation(search.NewAvgAggregation("agg1", "Col_Long")). // Calculate the average value of the Col_Long field. Aggregation(search.NewDistinctCountAggregation("agg2", "Col_Long")). // Calculate the number of distinct values of the Col_Long field. Aggregation(search.NewMaxAggregation("agg3", "Col_Long")). // Query the maximum value of the Col_Long field. Aggregation(search.NewSumAggregation("agg4", "Col_Long")). // Calculate the sum of values of the Col_Long field. Aggregation(search.NewCountAggregation("agg5", "Col_Long"))) // Calculate the number of rows that contain the Col_Long field. // Return all columns. searchRequest.SetColumnsToGet(&tablestore.ColumnsToGet{ ReturnAll: true, }) searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } aggResults := searchResponse.AggregationResults // Obtain the aggregation results. // Obtain the average value obtained from the aggregation operation. agg1, err := aggResults.Avg("agg1") // Obtain the results of the aggregation operation named agg1. if err != nil { panic(err) } if agg1.HasValue() { // Check whether the results of the aggregation operation named agg1 contain values. fmt.Println("(avg) agg1: ", agg1.Value) // Display the average value of the Col_Long field. } else { fmt.Println("(avg) agg1: no value") // Display the rows that contain no values for the Col_Long field. } // Obtain the number of distinct values from the results of the aggregation operation. agg2, err := aggResults.DistinctCount("agg2") // Obtain the results of the aggregation operation named agg2. if err != nil { panic(err) } fmt.Println("(distinct) agg2: ", agg2.Value) // Display the number of distinct values for the Col_Long field. // Display the maximum value obtained from the aggregation operation. agg3, err := aggResults.Max("agg3") // Obtain the results of the aggregation operation named agg3. if err != nil { panic(err) } if agg3.HasValue() { fmt.Println("(max) agg3: ", agg3.Value) // Display the maximum value for the Col_Long field. } else { fmt.Println("(max) agg3: no value") // Display the rows that contain no values for the Col_Long field. } // Obtain the sum from the results of the aggregation operation. agg4, err := aggResults.Sum("agg4") // Obtain the results of the aggregation operation named agg4. if err != nil { panic(err) } fmt.Println("(sum) agg4: ", agg4.Value) // Display the sum for the Col_Long field. // Obtain the number of rows from the results of the aggregation operation. agg5, err := aggResults.Count("agg5") // Obtain the results of the aggregation operation named agg5. if err != nil { panic(err) } fmt.Println("(count) agg6: ", agg5.Value) // Display the number of values for the Col_Long field. }
Example 2
func MultipleAggregationsAndGroupBysSample(client *tablestore.TableStoreClient, tableName string, indexName string) { searchRequest := &tablestore.SearchRequest{} searchRequest. SetTableName(tableName). // Specify the name of the data table. SetIndexName(indexName). // Specify the name of the search index. SetSearchQuery(search.NewSearchQuery(). SetQuery(&search.MatchAllQuery{}). // Match all rows. SetLimit(0). Aggregation(search.NewAvgAggregation("agg1", "Col_Long")). // Calculate the average value of the Col_Long field. Aggregation(search.NewDistinctCountAggregation("agg2", "Col_Long")). // Calculate the number of distinct values of the Col_Long field. Aggregation(search.NewMaxAggregation("agg3", "Col_Long")). // Query the maximum value of the Col_Long field. GroupBy(search.NewGroupByField("group1", "Col_Keyword"). // Group the rows by field value based on the Col_Keyword field. GroupBySorters([]search.GroupBySorter{}). // Specify the method that is used to sort the groups that are returned. Size(2). // Specify that only the first two groups are returned. SubAggregation(search.NewAvgAggregation("sub_agg1", "Col_Long")). // Perform sub-aggregation operations on each group. SubGroupBy(search.NewGroupByField("sub_group1", "Col_Keyword2"))). // Perform sub-aggregation operations on each group. GroupBy(search.NewGroupByRange("group2", "Col_Long"). // Group the rows by range based on the Col_Long field. Range(search.NegInf, 3). // The first group contains rows whose values in the Col_Long field are within the (NegInf, 3) range. Range(3, 5). // The second group contains rows whose values in the Col_Long field are within the [3, 5) range. Range(5, search.Inf))) // The third group contains rows whose values in the Col_Long field are within the [5, Inf) range. // Return all columns. searchResponse, err := client.Search(searchRequest) if err != nil { fmt.Printf("%#v", err) return } aggResults := searchResponse.AggregationResults // Obtain the aggregation results. // Obtain the average value obtained from the aggregation operation. agg1, err := aggResults.Avg("agg1") // Obtain the results of the aggregation operation named agg1. if err != nil { panic(err) } if agg1.HasValue() { // Check whether the results of the aggregation operation named agg1 contain values. fmt.Println("(avg) agg1: ", agg1.Value) // Display the average value of the Col_Long field. } else { fmt.Println("(avg) agg1: no value") // Display the rows that contain no values for the Col_Long field. } // Obtain the number of distinct values from the results of the aggregation operation. agg2, err := aggResults.DistinctCount("agg2") // Obtain the results of the aggregation operation named agg2. if err != nil { panic(err) } fmt.Println("(distinct) agg2: ", agg2.Value) // Display the number of distinct values for the Col_Long field. // Display the maximum value obtained from the aggregation operation. agg3, err := aggResults.Max("agg3") // Obtain the results of the aggregation operation named agg3. if err != nil { panic(err) } if agg3.HasValue() { fmt.Println("(max) agg3: ", agg3.Value) // Display the maximum value for the Col_Long field. } else { fmt.Println("(max) agg3: no value") // Display the rows that contain no values for the Col_Long field. } groupByResults := searchResponse.GroupByResults // Obtain the aggregation results. // Obtain the results of the GroupByField aggregation operation. group1, err := groupByResults.GroupByField("group1") // Obtain the results of the aggregation operation named group1. if err != nil { panic(err) } fmt.Println("group1: ") for _, item := range group1.Items { // Traverse all groups that are returned. //item fmt.Println("\tkey: ", item.Key, ", rowCount: ", item.RowCount) // Display the number of rows in each group. // Obtain the average value obtained from the sub-aggregation operation. subAgg1, err := item.SubAggregations.Avg("sub_agg1") // Obtain the results of the sub-aggregation operation named sub_agg1. if err != nil { panic(err) } if subAgg1.HasValue() { // If sub_agg1 obtains the average value for the Col_Long field, HasValue() is set to true. fmt.Println("\t\tsub_agg1: ", subAgg1.Value) // Display the average value for the Col_Long field obtained from the sub-aggregation operation. } // Obtain the results of the GroupByField sub-aggregation operation. subGroup1, err := item.SubGroupBys.GroupByField("sub_group1") // Obtain the results of the sub-aggregation operation named sub_group1. if err != nil { panic(err) } fmt.Println("\t\tsub_group1") for _, subItem := range subGroup1.Items { // Traverse the results of the sub-aggregation operation named sub_group1. fmt.Println("\t\t\tkey: ", subItem.Key, ", rowCount: ", subItem.RowCount) // Display the number of rows in each group that is obtained from the results of the sub-aggregation operation named sub_group1. tablestore.Assert(subItem.SubAggregations.Empty(), "") tablestore.Assert(subItem.SubGroupBys.Empty(), "") } } // Obtain the results of the GroupByRange aggregation operation. group2, err := groupByResults.GroupByRange("group2") // Obtain the results of the aggregation operation named group2. if err != nil { panic(err) } fmt.Println("group2: ") for _, item := range group2.Items { // Traverse all groups that are returned. fmt.Println("\t[", item.From, ", ", item.To, "), rowCount: ", item.RowCount) // Display the number of rows in each group. } }
Appendix: different methods for multi-field grouping
If you want to group query results by multiple fields, you can use the groupBy parameter in nested mode or use the GroupByComposite parameter. The following table describes the difference between the groupBy parameter in nested mode and the groupByComposite parameter.
Feature | groupBy (nested) | groupByComposite |
size | 2000 | 2000 |
Limits on fields | Up to 5 layers supported | Up to 32 fields supported |
Pagination | Not supported | Supported by using the nextToken parameter |
Sort group items. |
| In alphabetical order or reverse alphabetical order |
Supports aggregation | Yes | Yes |
Compatibility | For fields of the DATE type, the query results are returned in the specified format. | For fields of the DATE type, the query results are returned as timestamp strings. |