Application schema
The data that you upload to an instance of Industry Algorithm Edition is first stored in offline data tables. To facilitate data upload, Industry Algorithm Edition allows you to create multiple data tables based on your business requirements and provides data processing plug-ins. If you create multiple data tables, you must specify the fields that are associated with the tables. After the data in multiple data tables is processed, the tables are joined to form an index table. The index table defines search attributes and can be used by the search engine to build indexes and search for data.
Fields in a data table
You can use data tables to import data to OpenSearch. Each data processing plug-in can process only specific types of fields. For more information about the limits on field values, see the "Limits on fields" section of the Limits topic. If a field value is not in the specified value range, an overflow error occurs or the value is truncated. You must select correct field types.
Field type | Description |
INT | 64-bit integer. |
INT_ARRAY | 64-bit integer array. |
FLOAT | Floating-point number. |
FLOAT_ARRAY | Floating-point number array. |
DOUBLE | Floating-point number. |
DOUBLE_ARRAY | Floating-point number array. |
LITERAL | String constant. A string constant supports only exact match. |
LITERAL_ARRAY | String constant array. A single element in a string constant array supports only exact match. |
SHORT_TEXT | Short text. A value of this type cannot exceed 100 bytes in length. This type of field supports multiple analysis methods. |
TEXT | Long text. This type of field supports multiple analysis methods. |
TIMESTAMP | 64-bit unsigned integer. This type of field indicates a timestamp. |
GEO_POINT | String constant. This type of field indicates a pair of latitude and longitude in the "Latitude value Longitude value" format. |
Usage notes on reserved fields:
The following field names are reserved and cannot be used when you specify the filed names: 'service_id', 'ops_app_name', 'inter_timestamp', 'index_name', 'pk', 'ops_version', 'ha_reserved_timestamp', and 'summary'.
Usage notes on fields of array types:
After you create a field of an array type in an application, you can map the field to a field of the STRING type such as VARCHAR in a data source. In addition, you can use a data processing plug-in to process the field. For more information, see Use data processing plug-ins.
If you use the OpenSearch Industry Algorithm Edition API or SDKs to upload a field of an array type, upload the field as an array instead of a string. Example: String[] literal_array = {"Alibaba Cloud","OpenSearch"};
Usage notes on fields of timestamp types:
Fields of the INT and TIMESTAMP types can be mapped to a field of the DATETIME or TIMESTAMP type in a data source. The field values are automatically converted to the number of milliseconds. You can use the range function to retrieve search results by time range. For more information, see Searches by range.
Supported field types in a data source
Data source | Supported field type |
RDS | TINYINT,SMALLINT,INTEGER,BIGINT,FLOAT,REAL,DOUBLE,NUMERIC,DECIMAL,TIME,DATE,TIMESTAMP,VARCHAR |
PolarDB | TINYINT,SMALLINT,INTEGER,BIGINT,FLOAT,REAL,DOUBLE,NUMERIC,DECIMAL,TIME,DATE,TIMESTAMP,VARCHAR |
MaxCompute | BIGINT,DOUBLE,BOOLEAN,DATETIME,STRING,DECIMAL,MAP,ARRAY,TINYINT,SMALLINT,INT,FLOAT,CHAR,VARCHAR,DATE,TIMESTAMP,BINARY,INTERVAL_DAY_TIME,INTERVAL_YEAR_MONTH,STRUCT |
Mappings between field types of tables for Industry Algorithm Edition and those of tables in a database
Table for Industry Algorithm Edition | RDS table | PolarDB table | MaxCompute table |
INT | BIGINT,TINYINT,SMALLINT,INTEGER | BIGINT,TINYINT,SMALLINT,INTEGER | BIGINT,TINYINT,SMALLINT,INT |
INT_ARRAY | STRING types such as VARCHAR. This field type must be converted by using the data processing plug-in MultiValueSpliter. | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. |
FLOAT | FLOAT,NUMERIC,DECIMAL | FLOAT,NUMERIC,DECIMAL | FLOAT,DECIMAL |
FLOAT_ARRAY | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. |
DOUBLE | DOUBLE,NUMERIC,DECIMAL | DOUBLE,NUMERIC,DECIMAL | DOUBLE,DECIMAL |
DOUBLE_ARRAY | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. |
LITERAL | STRING types such as VARCHAR | STRING types such as VARCHAR | STRING types such as VARCHAR |
LITERAL_ARRAY | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. | STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter. |
SHORT_TEXT | STRING types such as VARCHAR | STRING types such as VARCHAR | STRING types such as VARCHAR |
TEXT | STRING types such as VARCHAR | STRING types such as VARCHAR | STRING types such as VARCHAR |
TIMESTAMP | DATETIME or TIMESTAMP | DATETIME or TIMESTAMP | DATETIME or TIMESTAMP |
GEO_POINT | STRING types such as VARCHAR | STRING types such as VARCHAR | STRING types such as VARCHAR in the format of "lon lat" lon indicates the longitude and lat indicates the latitude. Both longitude and latitude values must be of the DOUBLE type and be separated by a space. Range of the lon parameter: [-180, 180] and range of the lat parameter: [-90,90]. |
Note:
If fields in a data source are of the FLOAT or DOUBLE type, we recommend that you change the field types to DECIMAL. Otherwise, this may compromise the precision.
Create an application schema
Industry Algorithm Edition provides the following methods to create an application schema, namely, table schema for Industry Algorithm Edition:
Create an application schema by importing a table schema from an ApsaraDB RDS for MySQL, MaxCompute, or PolarDB data source.
Manually create an application schema. For more information, see Join multiple tables.
Join multiple tables
This section describes how to join multiple tables to manually create an application schema. In this example, the primary table main and the secondary table test_tb_1 are associated.
1. Log on to the OpenSearch console, find the desired application instance, and click Configure in the Actions column.
2. On the page that appears, specify the main table as the primary table and specify a field as the primary key.
3. Specify a field as the primary key for the secondary table.
4. In the External Tables column of the primary table, select the secondary table to join the primary table and the secondary table.
Note:
For more information about the association between primary and secondary tables supported by Industry Algorithm Edition, see .
Foreign key fields must be of the INT or LITERAL type.
When you join the primary table to one or more secondary tables, the fields that are used for the join operation must be of the same type in the primary and secondary tables. For example, if the field used for the join operation in the primary table is of the INT type, the fields used for the join operation in the secondary tables must also be of the INT type. If the field used for the join operation in the primary table is of the LITERAL type, the fields used for the join operation in the secondary tables must also be of the LITERAL type.
When you join a secondary table to the primary table, you must use the primary key of the secondary table to join the tables.