All Products
Search
Document Center

OpenSearch:Build an application that supports online multi-table joins

Last Updated:Nov 05, 2024

This topic describes how to build an application that supports online multi-table joins.

Prerequisites

  1. An Alibaba Cloud account is created and the real-name verification is complete.

  2. An AccessKey pair is created before you log on to the OpenSearch console by using a new Alibaba Cloud account for the first time.

  • You must specify an AccessKey pair within your Alibaba Cloud account. The AccessKey pair is required to create and use an OpenSearch application.

  • After you create an AccessKey pair within your Alibaba Cloud account, you can create an AccessKey pair for a RAM user. This way, you can access an application as the RAM user. For more information about how to grant permissions to RAM users, see Access authorization rules.

Procedure

To build an application that supports online multi-table joins, perform the following steps:

  1. Purchase and create an application instance.

  2. Configure the application instance. You need to define application schemas and index schemas, and configure data sources.

  3. Run a search test.

Purchase and create an application instance

  1. Log on to the OpenSearch console. In the upper-left corner, move the pointer over the place where the OpenSearch edition is displayed and select OpenSearch High-performance Search Edition. In the left-side navigation pane, click Instance Management. On the page that appears, click Create Instance.

image

  1. On the buy page, configure the Commodity Edition, Product Type, Region and Zone, Application Name, Cluster Preferences, Storage Capacity, and Computing Resources parameters. Then, click Buy Now.

image

Commodity Edition: Select High-performance Search Edition.

image

  • Product Type: Select Subscription or Pay-as-you-go.

  • Region and zone: Select the region and zone based on your business requirements.

  • Application Name: Enter a custom name.

  • Cluster Preferences: Select Dedicated Cluster - Storage-optimized.

  • Storage Capacity and Computing Resources: Use the default values or configure the parameters based on your business requirements.

  1. Confirm the order, select the check box for the Terms of Service, and then click Pay.

image

  1. After the application instance is purchased, view the application instance on the Instance Management page in the OpenSearch console. The instance is in the Pending state.

image

Configure the application instance

  1. On the Instance Management page of the OpenSearch console, find the instance that you want to configure and click Configure in the Actions column.

image

  1. Define an application schema. You can define an application schema by using one of the following methods:

  • Use a template to define an application schema. You can save an application schema that you created as a template. Then, you can use the template to define a new application schema.

  • Use a data file to define an application schema. You can upload a data file to the OpenSearch console. Then, OpenSearch parses the uploaded data file and generates an initial application schema. The data file must be in the JSON format. After the initial application schema is generated, you must configure specific attributes such as field types.

  • Use a data source to define an application schema. You can use this method if you want to synchronize data from data sources such as ApsaraDB RDS and MaxCompute data sources. The schema of a source table can be used to generate an initial application schema. This reduces workloads on the manual definition and decreases the error probability. For more information, see Data sources.

  • Manually define an application schema. You can use this method to perform a quick test. In this example, two tables are added and application schemas are manually defined.

image

For more information about how to select data types for fields, see Application schema in OpenSearch High-performance Search Edition.

Note

You can add up to eight tables.

  1. Define an index schema separately for each table.

image

  • For more information about how to specify index fields, see Index schema.

  • For more information about how to specify analysis methods, see Text analyzers.

  • If you need to use a field in SELECT, WHERE, or ORDER BY clauses, specify the field as an attribute field.

Note
  • Fields of the FLOAT, FLOAT_ARRAY, DOUBLE, or DOUBLE_ARRAY type cannot be specified as index fields.

  • Fields of the TEXT or SHORT_TEXT type cannot be specified as attribute fields.

  1. Specify the routing field.

image

An OpenSearch application instance uses distributed backend storage. In multi-table join scenarios, the data to be joined must be on the same server. The following figure shows how it works.

When the engine builds an index, the engine calculates the hash values of records based on the specified routing field. Records with the same hash value are routed to the same column. Data in each column is joined based on the SQL statement that is sent by a Query Result Searcher (QRS) node, and data is not joined across columns. Then, each column returns the join results to the QRS node. Finally, the QRS node aggregates the recall results and returns the aggregated results to the user.

Note
  • By default, the primary key field is used as the routing field.

  • You can set only one field as the routing field.

  • The values of the routing field must be globally unique.

  • The routing field can be of the INT or LITERAL type.

  • If you need to join tables based on a non-primary key field, you must specify the field as the routing field.

  1. Configure data sources of the tables. You can configure the data sources based on the data source types supported by OpenSearch High-performance Search Edition.

image

Click Add Data Source and configure the data source.

image

  1. Click Completed.

image

  1. On the details page of the instance, check the status of the instance on the Offline Application tab and wait until the status of the application changes to Normal.

image

Run a search test

  1. After the offline application is in the Normal state and goes online, you can run a search test on the Search Test page.

image

  1. An application for online multi-table joins supports only SQL queries.

image

For more information about SQL syntax, see SQL features.

Usage notes

  • Only the instances whose cluster preference is Dedicated Cluster - Storage-optimized support the online multi-table joins.

  • In multi-table join scenarios in which the JOIN operation is performed on multiple tables, only SQL queries are supported.

  • In multi-table join scenarios, custom analyzers are not supported.

  • In multi-table join scenarios, sorting configurations are not supported.

  • In multi-table join scenarios, search result display is not supported.

  • Instances whose cluster preference is Dedicated Cluster - Storage-optimized cannot be changed to other specifications.