Introduction to the Array and Nested data types in the search index - Tablestore

Search indexes support primitive data types such as Long, Double, Boolean, Keyword, Text, Date, Geopoint, and Vector, as well as the Array and Nested data types. The Array data type is suitable for storing a collection of the same type of data. The Nested data type is similar to the JSON data type and is suitable for storing data that has a hierarchical structure.

Array data type

Important

The Array data type can be used only in search indexes.
The Vector data type cannot be used in arrays.

The Array data type is a composite data type that can be combined with primitive data types, such as Long, Double, Boolean, Keyword, Text, Date, and Geopoint, to construct complex data structures. For example, combining the Long data type with the Array data type produces an array of long integers. A long array can contain multiple long integers. If any one of them is matched during a query, the corresponding row is returned.
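The matching rule for array fields can be modeled in a few lines of plain Java. This is only a conceptual sketch and not Tablestore code: the class name `ArrayMatchSketch` and the method `rowMatches` are hypothetical, and the real matching is performed inside the search index.

```java
import java.util.List;

// Conceptual model only: a row matches a term when ANY element
// of its array field equals the term.
class ArrayMatchSketch {
    static boolean rowMatches(List<Long> arrayFieldValues, long term) {
        for (long value : arrayFieldValues) {
            if (value == term) {
                return true; // one matching element is enough
            }
        }
        return false;
    }
}
```

Under this model, a row whose long array is [1000, 4, 5555] matches a query for 4 but not a query for 7.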

Array formats

The following table describes the combinations of the Array data type with primitive data types in search indexes.

Combination	Description
Long Array	An array of long integers. Format: `"[1000, 4, 5555]"`.
Double Array	An array of floating-point numbers. Format: `"[3.1415926, 0.99]"`.
Boolean Array	An array of Boolean values. Format: `"[true, false]"`.
Keyword Array	An array of strings. A keyword array is a JSON array. Example: `"[\"Hangzhou\", \"Xi'an\"]"`.
Text Array	An array of text. A text array is a JSON array. Example: `"[\"Hangzhou\", \"Xi'an\"]"`. Text arrays are not commonly used.
Date Array	An array of date data. Format of date data of the Integer type: `"[1218197720123, 1712850436000]"`. Format of date data of the String type: `"[\"2024-04-11 23:47:16.854775807\", \"2024-06-11 23:47:16.854775807\"]"`.
Geopoint Array	An array of latitude and longitude coordinate pairs. Format: `"[\"34.2, 43.0\", \"21.4, 45.2\"]"`.
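The formats in the table are ordinary strings, so they can be built with the Java standard library before the value is written to the data table. The following is a minimal sketch under that assumption. The class `ArrayValueSketch` and its methods are hypothetical helpers, not part of any Tablestore SDK, and the escaping covers only backslashes and double quotation marks.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper that renders Java lists in the array formats listed above.
class ArrayValueSketch {
    // Long, Double, and Boolean elements are written without quotation marks,
    // for example [1000, 4, 5555].
    static String plainArray(List<?> values) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (Object value : values) {
            joiner.add(String.valueOf(value));
        }
        return joiner.toString();
    }

    // Keyword, Text, string-formatted Date, and Geopoint elements are written
    // as JSON strings, for example ["Hangzhou", "Xi'an"].
    static String quotedArray(List<String> values) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (String value : values) {
            // Escape backslashes first, then double quotation marks.
            String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
            joiner.add("\"" + escaped + "\"");
        }
        return joiner.toString();
    }
}
```

Note that `String.valueOf` renders very large or very small doubles in scientific notation, so a production helper would probably format doubles explicitly or use a JSON library.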

Usage notes

If a field in a search index combines the Array data type with a primitive data type, such as Long or Double, the corresponding field in the data table must be of the String type, and the field in the search index must be declared with the primitive data type. For example, if the price field is of the Double Array type, the value of the price field in the data table must be of the String type, the price field in the search index must be of the Double type, and isArray=true must be set for the field.

Nested data type

Data of the Nested type consists of nested documents. Nested documents are used when a row of data (document) contains multiple child rows (child documents). The child rows are stored in a nested field. The Nested data type is suitable for storing data that has a hierarchical structure.

You must specify the schema of child rows in a nested field. The schema must include the fields of the child rows and the properties of each field. The Nested data type can be used to store multiple values, which is similar to the JSON data type.

Nested formats

Nested fields are classified into single-level and multi-level nested fields. The following table describes the two types.

Type

Description

Single-level nested field

A single-level nested field has a simple data structure that contains only one level. Single-level nested fields are suitable for scenarios in which data structures of multiple levels are not required but hierarchical structures are required. Example:

[
    {
        "tagName": "tag1",
        "score": 0.8
    },
    {
        "tagName": "tag2",
        "score": 0.2
    }
]

Multi-level nested field

A multi-level nested field has a complex data structure that contains multiple levels: a child row can itself contain a nested field. Multi-level nested fields are suitable for scenarios in which related data at several levels must be stored together in one field. Example:

[
    {
        "name": "John",
        "age": 20,
        "phone": "1390000****",
        "address": [
            {
                "province": "Zhejiang",
                "city": "Hangzhou",
                "street": "1201 Xingfu Community, Sunshine Avenue"
            }
        ]
    }
]

Usage notes

If a field in a search index is of the Nested type, the field in the data table for which the search index is created must be of the String type, and the field in the search index must be of the Nested type. You must perform nested queries to query fields of the Nested type.

When you write data to a field in a data table and the field corresponds to a nested field in the search index that is created for the data table, make sure that the value you write is a string in the JSON array format. Example: [{"tagName":"tag1", "score":0.8,"time": 1730690237000 }, {"tagName":"tag2", "score":0.2,"time": 1730691557000}].

Important

You must write a string in the JSON array format to a nested field even if the field contains only one child row.
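The rule that a nested value is always a JSON array, even for a single child row, can be illustrated with a small serializer. This is a sketch that uses only the Java standard library. `NestedValueSketch` and its methods are hypothetical, the helper handles only flat child rows with string, number, Boolean, or null values, and a real application would more likely use a JSON library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: serializes child rows to the JSON array string
// that is written to the String column of the data table.
class NestedValueSketch {
    static String toJsonArray(List<Map<String, Object>> childRows) {
        List<String> rows = new ArrayList<>();
        for (Map<String, Object> row : childRows) {
            List<String> members = new ArrayList<>();
            for (Map.Entry<String, Object> entry : row.entrySet()) {
                members.add(quote(entry.getKey()) + ":" + toJsonValue(entry.getValue()));
            }
            rows.add("{" + String.join(",", members) + "}");
        }
        // The brackets are always emitted, so one child row still yields [ {...} ].
        return "[" + String.join(",", rows) + "]";
    }

    static String toJsonValue(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof Number || value instanceof Boolean) {
            return String.valueOf(value);
        }
        return quote(String.valueOf(value));
    }

    // Escapes only backslashes and double quotation marks.
    static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Passing a list that contains a single `LinkedHashMap` with the keys tagName and score produces a string of the form [{"tagName":"tag1","score":0.8}], which keeps the enclosing brackets.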

Examples

Example of single-level nested fields

You can create a single-level nested field in the console or by using an SDK.

This section provides an example of how to create a single-level nested field by using Tablestore SDK for Java. In this example, a nested field named tags is used. Each child row contains the following three fields:

Field name: tagName. Field type: Keyword.
Field name: score. Field type: Double.
Field name: time. Field type: Date. Unit: milliseconds.

The following data is written to the data table: [{"tagName":"tag1", "score":0.8,"time": 1730690237000 }, {"tagName":"tag2", "score":0.2,"time": 1730691557000}].

// Create schemas for the fields in the child row. 
List<FieldSchema> subFieldSchemas = new ArrayList<FieldSchema>();
subFieldSchemas.add(new FieldSchema("tagName", FieldType.KEYWORD)
    .setIndex(true).setEnableSortAndAgg(true));
subFieldSchemas.add(new FieldSchema("score", FieldType.DOUBLE)
    .setIndex(true).setEnableSortAndAgg(true));
subFieldSchemas.add(new FieldSchema("time", FieldType.DATE)
    .setDateFormats(Arrays.asList("epoch_millis")));
    
// Use the schemas that are created for the child rows as the value of subfieldSchemas for the nested field. 
FieldSchema nestedFieldSchema = new FieldSchema("tags", FieldType.NESTED)
    .setSubFieldSchemas(subFieldSchemas);

Example of multi-level nested fields

You can create a multi-level nested field by using an SDK.

This section provides an example of how to create a multi-level nested field by using Tablestore SDK for Java. In this example, a nested field named user is used. Each child row contains four fields of primitive data types and one nested field.

Field name: name. Field type: Keyword.
Field name: age. Field type: Long.
Field name: birth. Field type: Date. The value of this field is in the yyyy-MM-dd HH:mm:ss.SSS date format.
Field name: phone. Field type: Keyword.
Nested field name: address. Names of the fields in each child row: province, city, and street. Field type of all fields in each child row: Keyword.

The following data is written to the data table: [ {"name":"John","age":20,"birth":"2014-10-10 12:00:00.000","phone":"1390000****","address":[{"province":"Zhejiang","city":"Hangzhou","street":"1201 Xingfu Community, Sunshine Avenue"}]}].

// Create schemas for the three fields in the child rows of the address nested field. The path specified by user.address can be used to query data of fields in a child row. 
List<FieldSchema> addressSubFieldSchemas = new ArrayList<>();
addressSubFieldSchemas.add(new FieldSchema("province", FieldType.KEYWORD));
addressSubFieldSchemas.add(new FieldSchema("city", FieldType.KEYWORD));
addressSubFieldSchemas.add(new FieldSchema("street", FieldType.KEYWORD));

// Create a schema for each child row of the user nested field. Each child row contains four fields of primitive data types and one nested field named address. The path specified by user can be used to query data of fields in a child row. 
List<FieldSchema> subFieldSchemas = new ArrayList<>();
subFieldSchemas.add(new FieldSchema("name", FieldType.KEYWORD));
subFieldSchemas.add(new FieldSchema("age", FieldType.LONG));
subFieldSchemas.add(new FieldSchema("birth", FieldType.DATE).setDateFormats(Arrays.asList("yyyy-MM-dd HH:mm:ss.SSS")));
subFieldSchemas.add(new FieldSchema("phone", FieldType.KEYWORD));
subFieldSchemas.add(new FieldSchema("address", FieldType.NESTED).setSubFieldSchemas(addressSubFieldSchemas));

// Use the schemas that are created for the child rows of the user nested field as the value of subfieldSchemas for the nested field. 
List<FieldSchema> fieldSchemas = new ArrayList<>();
fieldSchemas.add(new FieldSchema("user",FieldType.NESTED).setSubFieldSchemas(subFieldSchemas));
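The comments in the example mention the paths user and user.address. The sketch below shows how such dotted paths follow from the schema hierarchy. It is a standalone model written with the Java standard library: `FieldPathSketch` and its `Field` class are hypothetical stand-ins for the schema, not SDK classes, and addressing a leaf field by its full dotted name (for example user.address.city) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FieldPathSketch {
    // Hypothetical stand-in for a field schema: a name plus optional child fields.
    static final class Field {
        final String name;
        final List<Field> children;

        Field(String name, Field... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Collects the dotted path of every leaf field under the given field.
    static List<String> leafPaths(Field field, String prefix) {
        String path = prefix.isEmpty() ? field.name : prefix + "." + field.name;
        List<String> paths = new ArrayList<>();
        if (field.children.isEmpty()) {
            paths.add(path); // primitive field: emit its full path
        } else {
            for (Field child : field.children) {
                paths.addAll(leafPaths(child, path)); // nested field: descend
            }
        }
        return paths;
    }
}
```

For the user field in this example, the sketch yields paths such as user.name and user.address.city, with user and user.address as the paths of the two nested levels.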

Limits

Search indexes that contain nested fields do not support the IndexSort feature, which can be used to improve query performance in various scenarios.
If you use a search index that contains a nested field to query data and require pagination, you must specify a sorting method in the query conditions. Otherwise, Tablestore does not return nextToken when only part of the data that meets the query conditions is read.
Nested queries deliver lower performance than other types of queries.

The Nested data type can be used in all queries, sorting, and aggregation.

References

When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, fuzzy query, Boolean query, geo query, nested query, KNN vector query, and exists query. You can select query methods based on your business requirements to query data from multiple dimensions.
You can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Perform sorting and paging.
You can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.
If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.