Tablestore: Data types

Last Updated: Nov 07, 2024

Before you use a search index, make sure that you understand the data types that search indexes support and how those data types map to the data types of the data table for which the search index is created.

Supported data types

Search indexes support the Long, Double, Boolean, Keyword, Text, Date, Geopoint, and Vector basic data types, the Array and Nested special data types, and virtual columns.

Important

The Date data type in search indexes is supported only by Tablestore SDK for Java V5.13.9 or later. To use the Date data type, make sure that you use Tablestore SDK for Java V5.13.9 or later. For more information about the versions of Tablestore SDK for Java, see Version history of Tablestore SDK for Java.

Basic data types

Search indexes support the following basic data types: Long, Double, Boolean, Keyword, Text, Date, Geopoint, and Vector. The following table describes the basic data types.

| Basic data type | Description |
| --- | --- |
| Long | A 64-bit long integer. |
| Double | A 64-bit double-precision floating-point number. |
| Boolean | A Boolean value. |
| Keyword | A string that cannot be tokenized. |
| Text | A string or text that can be tokenized. For more information, see Tokenization. |
| Date | The Date data type. You can specify the format of data of the Date type. For more information, see Date data type. |
| Geopoint | A coordinate pair of a geographical location in the latitude,longitude format. Valid values of the latitude: [-90,+90]. Valid values of the longitude: [-180,+180]. Example: 35.8,-45.91. |
| Vector | The Vector type. The value of a field of the Vector type is a string in the Float32 array format. The length of the array is the same as the number of dimensions of the field. For example, the number of dimensions of the vector string [1, 5.1, 4.7, 0.08] is 4. |
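
The Geopoint and Vector formats described above can be produced on the client side before the values are written to String columns. The following Python sketch is only an illustration that uses the standard library: the helper names format_geopoint and format_vector are hypothetical and are not part of any Tablestore SDK.

```python
import json


def format_geopoint(lat: float, lon: float) -> str:
    """Return a 'latitude,longitude' string after range-checking both values."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range [-90, +90]: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range [-180, +180]: {lon}")
    return f"{lat},{lon}"


def format_vector(values, dimension: int) -> str:
    """Return an array string whose length equals the field's dimension count."""
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} dimensions, got {len(values)}")
    # Convert every element to float so that the string holds only numbers.
    return json.dumps([float(v) for v in values])


# Hypothetical helpers applied to the example values from the table above.
geopoint = format_geopoint(35.8, -45.91)
vector = format_vector([1, 5.1, 4.7, 0.08], 4)
```

Checking the ranges and the dimension count before a write makes a malformed value fail in the client instead of in the index.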

Array and Nested types

In addition to basic data types, such as Long, Double, Boolean, Keyword, Text, Date, Geopoint, and Vector, search indexes support two special data types: Array and Nested. The Array data type is suitable for storing a collection of the same type of data. The Nested data type is similar to the JSON data type and is suitable for storing data that has a hierarchical structure. For more information, see Array and Nested data types.

Array type

Important
  • The Array data type can be used only in search indexes.

  • The Vector data type cannot be used in arrays.

The Array data type is a composite data type and can be combined with primitive data types, such as Long, Double, Boolean, Keyword, Text, Date, and Geopoint, to construct complex data structures. For example, the combination of the Long data type with the Array data type is used to construct long arrays. A long array can contain multiple long integers. If one of the long integers is matched during a query, the corresponding row is returned. The Array data type is suitable for storing a collection of the same type of data.

If the data type of a field in a search index is a combination of the Array data type with a primitive data type, such as Long or Double, the field in the data table for which the search index is created must be of the String type, and the field in the search index must be of the corresponding primitive data type. For example, the price field is of the Double Array type. The value of the price field in the data table must be of the String type, the value of the price field in the search index must be of the Double type, and the isArray=true setting must be configured.

The following table describes the combinations of the Array data type with primitive data types in search indexes.

| Combination | Description |
| --- | --- |
| Long Array | An array of long integers. Format: "[1000, 4, 5555]". |
| Double Array | An array of floating-point numbers. Format: "[3.1415926, 0.99]". |
| Boolean Array | An array of Boolean values. Format: "[true, false]". |
| Keyword Array | An array of strings. A keyword array is a JSON array. Example: "[\"Hangzhou\", \"Xi'an\"]". |
| Text Array | An array of text. A text array is a JSON array. Example: "[\"Hangzhou\", \"Xi'an\"]". Text arrays are not commonly used. |
| Date Array | An array of date data. Format of date data of the Integer type: "[1218197720123, 1712850436000]". Format of date data of the String type: "[\"2024-04-11 23:47:16.854775807\", \"2024-06-11 23:47:16.854775807\"]". |
| Geopoint Array | An array of latitude and longitude coordinate pairs. Format: "[\"34.2, 43.0\", \"21.4, 45.2\"]". |
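
Because an array field is stored as a String column in the data table, the array must be serialized before it is written. The following Python sketch shows one way to build such strings with the standard json module. The helper name to_array_string is hypothetical, and the same-type check is an assumption derived from the statement that an array holds a collection of the same type of data.

```python
import json


def to_array_string(values) -> str:
    """Serialize a list of same-typed values into a JSON array string."""
    if not isinstance(values, list):
        raise TypeError("values must be a list")
    # Assumption: reject mixed element types, because an array is described
    # as a collection of the same type of data.
    if len({type(v) for v in values}) > 1:
        raise ValueError("all elements must have the same type")
    return json.dumps(values)


# The resulting strings are what would be written to the String column.
long_array = to_array_string([1000, 4, 5555])
boolean_array = to_array_string([True, False])
keyword_array = to_array_string(["Hangzhou", "Xi'an"])
geopoint_array = to_array_string(["34.2, 43.0", "21.4, 45.2"])
```

Using a JSON serializer instead of manual string concatenation takes care of quoting and escaping for Keyword, Text, and Geopoint elements.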

Nested type

A field of the Nested type stores nested documents. Nested documents are used when a row of data (document) contains multiple child rows (child documents). The child rows are stored in a nested field. The Nested data type is suitable for storing data that has a hierarchical structure.

You must specify the schema of child rows in a nested field. The schema must include the fields of the child rows and the properties of each field. The Nested data type can be used to store multiple values, which is similar to the JSON data type.

If a field in a search index is of the Nested type, the field in the data table for which the search index is created must be of the String type, and the field in the search index must be of the Nested type. You must perform nested queries to query fields of the Nested type.

When you write data to a field in a data table and the field corresponds to a nested field in the search index that is created for the data table, make sure that the value is a string in the JSON array format. Example: [{"tagName":"tag1", "score":0.8}, {"tagName":"tag2", "score":0.2}].

Important

You must write a string in the JSON array format to a nested field even if the field contains only one child row.
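
The requirement above can be enforced with a small wrapper that always emits a JSON array, including when there is only one child row. The following Python sketch is a minimal illustration that uses the standard library. The helper name to_nested_string is hypothetical and is not part of any Tablestore SDK.

```python
import json


def to_nested_string(child_rows) -> str:
    """Serialize child rows into a JSON array string for a nested field."""
    # A single child row passed as a dict is wrapped in a list so that the
    # output is always a JSON array, never a bare JSON object.
    if isinstance(child_rows, dict):
        child_rows = [child_rows]
    return json.dumps(child_rows)


single_child = to_nested_string({"tagName": "tag1", "score": 0.8})
two_children = to_nested_string(
    [{"tagName": "tag1", "score": 0.8}, {"tagName": "tag2", "score": 0.2}]
)
```

The resulting string is the value to write to the String column of the data table.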

Nested fields are classified into single-level and multi-level nested fields. The following table describes the two types.

Type

Description

Single-level nested field

A single-level nested field has only one level of nesting: its child rows do not contain nested fields themselves. Single-level nested fields are suitable when a row needs child rows but the child rows need no further hierarchy. Example:

[
    {
        "tagName": "tag1",
        "score": 0.8
    },
    {
        "tagName": "tag2",
        "score": 0.2
    }
]

Multi-level nested field

A multi-level nested field has multiple levels of nesting: its child rows contain nested fields of their own. Multi-level nested fields are suitable for data that has several levels of hierarchy, such as a person who has a list of addresses. Example:

[
    {
        "name": "John",
        "age": 20,
        "phone": "1390000****",
        "address": [
            {
                "province": "Zhejiang",
                "city": "Hangzhou",
                "street": "1201 Xingfu Community, Sunshine Avenue"
            }
        ]
    }
]

Virtual columns

You can use the virtual column feature of search indexes to query new fields and the data of new field types without the need to modify the storage schema and data in the Tablestore tables. For more information, see Virtual columns.

The virtual column feature allows you to map a column in a table to a virtual column in a search index when you create the search index. The type of the virtual column can be different from that of the column in the table. This allows you to create a column without modifying the table schema and data. The new column can be used to accelerate queries or can be configured with different tokenization methods.

  • You can configure different tokenization methods for Text fields that are mapped to the same field in a table.

    A single String column can be mapped to multiple Text columns of a search index. Different Text columns use different tokenization methods to meet various business requirements.

  • Query acceleration

    You do not need to cleanse data or re-create a table schema. You only need to map the required columns of a table to columns in a search index. The column types can be different between the table and the search index. For example, you can map a numeric column to the Keyword type to improve the performance of term queries, or map a String column to a numeric type to improve the performance of range queries.

Data type mappings

The value of a field in a search index is the value of the field that has the same name in the data table for which the search index is created. The data types of the two values must match. The following table describes the matching rules.

| Field data type in search indexes | Field data type in data tables |
| --- | --- |
| Long | Integer |
| Long Array | String |
| Double | Double |
| Double Array | String |
| Boolean | Boolean |
| Boolean Array | String |
| Keyword | String |
| Keyword Array | String |
| Text | String |
| Date | Integer and String |
| Date Array | String |
| Geopoint | String |
| Geopoint Array | String |
| Vector | String |
| Nested | String |
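
The matching rules above can be encoded as a lookup table and checked before a search index is created. The following Python sketch only mirrors the table. The names SEARCH_INDEX_TO_TABLE_TYPES and table_type_matches are hypothetical, and the check runs entirely on the client side without calling Tablestore.

```python
# Search index field type -> data table field types that match it,
# copied from the mapping table above.
SEARCH_INDEX_TO_TABLE_TYPES = {
    "Long": {"Integer"},
    "Long Array": {"String"},
    "Double": {"Double"},
    "Double Array": {"String"},
    "Boolean": {"Boolean"},
    "Boolean Array": {"String"},
    "Keyword": {"String"},
    "Keyword Array": {"String"},
    "Text": {"String"},
    "Date": {"Integer", "String"},
    "Date Array": {"String"},
    "Geopoint": {"String"},
    "Geopoint Array": {"String"},
    "Vector": {"String"},
    "Nested": {"String"},
}


def table_type_matches(index_type: str, table_type: str) -> bool:
    """Return True if a data table type matches a search index field type."""
    if index_type not in SEARCH_INDEX_TO_TABLE_TYPES:
        raise KeyError(f"unknown search index field type: {index_type}")
    return table_type in SEARCH_INDEX_TO_TABLE_TYPES[index_type]


# Example checks against the table above.
long_ok = table_type_matches("Long", "Integer")
long_bad = table_type_matches("Long", "String")
date_ok = table_type_matches("Date", "String")
```

Fields that use a virtual column are outside the scope of this sketch, because the type of a virtual column can differ from the type of the source column in the table.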