All Products
Search
Document Center

Tablestore:Use MaxCompute to access Tablestore

Last Updated:Mar 22, 2024

This topic describes how to establish a seamless connection between Tablestore and MaxCompute that belong to the same Alibaba Cloud account.

Background information

MaxCompute is a cost-effective and fully managed platform for the warehousing of petabytes of data. You can use MaxCompute to process and analyze large amounts of data in a fast and efficient manner. You can execute a simple Data Definition Language (DDL) statement to create an external table in MaxCompute to associate the external table with an external data source. This allows you to access and export data in various formats. MaxCompute tables can contain only structured data. However, external tables can be used to process structured and non-structured data.

Tablestore and MaxCompute support different types of data. The following table describes the mappings between the data types supported in Tablestore and MaxCompute.

Tablestore

MaxCompute

STRING

STRING

INTEGER

BIGINT

DOUBLE

DOUBLE

BOOLEAN

BOOLEAN

BINARY

BINARY

Preparations

Before you use MaxCompute to access Tablestore, make sure that the following preparations are complete:

  1. MaxCompute is activated.

  2. A workspace is created.

  3. An AccessKey pair is created.

  4. MaxCompute is authorized to access Tablestore in the Resource Access Management (RAM) console. For more information, see Cross-account authorization.

  5. An instance and a table are created in the Tablestore console. For more information, see Create instances and Create tables.

    The following items describe the information about the created instance and table in this example:

    • Instance name: cap1

    • Data table name: vehicle_track

    • Primary key information: vid(integer) and gt(integer)

    • Endpoint: https://cap1.cn-hangzhou.ots-internal.aliyuncs.com

      Note

      When you use MaxCompute to access Tablestore, we recommend that you use the private endpoint of Tablestore.

    • The network access control settings for the instance. By default, you can access a Tablestore instance from the Tablestore console and over VPCs, the classic network, and the Internet. In this example, the default settings in the Network Access Control section are used. The following figure shows the default settings in the Network Access Control section.

      image

Step 1: Install and configure the client

  1. Download and decompress the MaxCompute client installation package.

    Note

    Make sure that JRE 1.7 or later is installed on your machine.

  2. Modify the conf/odps_config.ini file to configure the client. The following sample code provides an example on how to configure the client:

    # Specify the AccessKey pair of your Alibaba Cloud account. An AccessKey pair consists of an AccessKey ID and an AccessKey secret. To obtain the AccessKey pair of your Alibaba Cloud account, log on to the Alibaba Cloud official website, go to the Alibaba Cloud Management Console, move the pointer over the profile picture in the upper-right corner, and then click AccessKey Management. 
    access_id=*******************
    access_key=*********************
    # Specify the name of the project that you want to access. 
    project_name=my_project
    # Specify the endpoint of MaxCompute. 
    end_point=https://service.odps.aliyun.com/api
    # Specify the link to access the MaxCompute Tunnel service. 
    tunnel_endpoint=https://dt.odps.aliyun.com
    # Specify the LogView URL that the client returns after a job is run. After you access the LogView URL, you can view detailed information about the job. 
    log_view_host=http://logview.odps.aliyun.com
    # Specify whether to enable access over HTTPS. 
    https_check=true
    Note

    In the odps_config.ini file, a number sign (#) is used to add comments. The MaxCompute client uses -- to add comments.

  3. Run bin/odpscmd.bat. Enter show tables;.

    If the tables in the current MaxCompute project are displayed, the preceding configurations are valid.

    odps2

Step 2: Create an external table

Create a MaxCompute data table named ots_vehicle_track to associate the table with a Tablestore table named vehicle_track.

The following items describe information about the associated Tablestore table:

  • Instance name: cap1

  • Data table name: vehicle_track

  • Primary key information: vid(int) and gt(int)

  • Endpoint: https://cap1.cn-hangzhou.ots-internal.aliyuncs.com

CREATE EXTERNAL TABLE IF NOT EXISTS ots_vehicle_track
(
vid bigint,
gt bigint,
longitude double,
latitude double,
distance double ,
speed double,
oil_consumption double
)
STORED BY 'com.aliyun.odps.TableStoreStorageHandler' -- (1)
WITH SERDEPROPERTIES ( -- (2)
'tablestore.columns.mapping'=':vid, :gt, longitude, latitude, distance, speed, oil_consumption', -- (3)
'tablestore.table.name'='vehicle_track' -- (4)
)
LOCATION 'tablestore://cap1.cn-hangzhou.ots-internal.aliyuncs.com'; -- (5)

The following table describes the parameters.

Sequence number

Parameter

Description

(1)

com.aliyun.odps.TableStoreStorageHandler

The MaxCompute built-in storage handler that processes Tablestore data and defines the interaction between MaxCompute and Tablestore. MaxCompute implements the related logic.

(2)

SERDEPROPERITES

The operation that provides parameter options. When you use the storage handler that processes Tablestore data, you must configure the tablestore.columns.mapping and tablestore.table.name parameters.

(3)

tablestore.columns.mapping

Required. The columns of the Tablestore table that you want MaxCompute to access. The columns include primary key columns and attribute columns. If a colon (:) is included in the name of a column, the column is a primary key column of the Tablestore table. In this example, :vid and :gt are primary key columns of the Tablestore table. Other columns are attribute columns. You must specify all primary key columns of the Tablestore table when you specify the mapping. You need to specify only the attribute columns that you want MaxCompute to access. You do not need to specify all attribute columns.

(4)

tablestore.table.name

The name of the Tablestore table that you want MaxCompute to access. If the specified name of the Tablestore table does not exist, an error occurs. In this case, MaxCompute does not create the Tablestore table.

(5)

LOCATION

The information about the Tablestore instance to which the Tablestore table belongs. The information includes the name of the instance and the endpoint.

Step 3: Use the external table to access Tablestore data

After you create an external table, Tablestore data is imported to MaxCompute. Then, you can use MaxCompute SQL commands to access Tablestore data.

-- Collect statistics about the average speed and fuel consumption of vehicles whose VIDs are less than 4 before the timestamp 1469171387. 
select vid,count(*),avg(speed),avg(oil_consumption) from ots_vehicle_track where vid <4 and gt<1469171387  group by vid;

Sample command output:

odps1