Tablestore: Functions and features

Last Updated: Apr 28, 2024

Data storage

Category

Feature

Description

References

Metadata storage

Wide Column model

The Wide Column model is similar to the data model of Bigtable or HBase and is suitable for various scenarios such as metadata and big data. The Wide Column model stores data in data tables. A single data table supports petabyte-level data storage and tens of millions of queries per second (QPS). The data tables are schema-free and support wide columns, multi-version data, and time-to-live (TTL) management. The data tables also support auto-increment primary key columns, local transactions, atomic counters, filters, and conditional updates.
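
Conditional updates and atomic counters, two of the features listed above, can be pictured with a small in-memory model. The sketch below is not the Tablestore SDK: the `Row` class and its `update_if` and `increment` methods are hypothetical names chosen for illustration, and a lock stands in for whatever the service does internally to keep each operation atomic.

```python
import threading


class Row:
    """Hypothetical in-memory row; not part of any Tablestore SDK."""

    def __init__(self, columns):
        self._columns = dict(columns)
        self._lock = threading.Lock()

    def update_if(self, column, expected, new_value):
        """Conditional update: write only if the column still holds `expected`."""
        with self._lock:
            if self._columns.get(column) != expected:
                return False  # condition failed, nothing is written
            self._columns[column] = new_value
            return True

    def increment(self, column, delta=1):
        """Atomic counter: read, add, and write back as one step."""
        with self._lock:
            self._columns[column] = self._columns.get(column, 0) + delta
            return self._columns[column]


row = Row({"status": "new", "views": 0})
first_attempt = row.update_if("status", "new", "paid")       # condition holds
second_attempt = row.update_if("status", "new", "refunded")  # condition no longer holds
row.increment("views")
views = row.increment("views")
```

In this model the second `update_if` call is rejected because another writer already changed `status`, which is the kind of lost-update protection a conditional update is meant to give.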

Search index

Search indexes are used for multi-dimensional data queries and statistical analysis in big data scenarios based on inverted indexes and column stores. If your business requires multi-dimensional queries and data analysis, you can create a search index and specify the required attributes as the fields of the search index. Then, you can query and analyze data by using the search index. For example, you can use a search index to perform queries based on non-primary key columns, Boolean queries, and fuzzy queries. You can also use a search index to obtain maximum and minimum values, collect statistics about the number of rows, and group query results.
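
To make the idea of an inverted index more concrete, the following sketch builds one over a few hypothetical rows and uses it for a query on non-primary key columns, a Boolean AND query, and a simple aggregation. The row data, field names, and helper functions are all assumptions made for illustration; the sketch does not use or reproduce the Tablestore search index API.

```python
from collections import defaultdict

# Hypothetical rows: primary key -> attribute columns.
ROWS = {
    "order-1": {"city": "hangzhou", "status": "paid", "amount": 120},
    "order-2": {"city": "beijing", "status": "paid", "amount": 80},
    "order-3": {"city": "hangzhou", "status": "refunded", "amount": 45},
}


def build_inverted_index(rows, fields):
    """Map each (field, value) pair to the set of primary keys that hold it."""
    index = defaultdict(set)
    for pk, columns in rows.items():
        for field in fields:
            if field in columns:
                index[(field, columns[field])].add(pk)
    return index


def term_query(index, field, value):
    """Look up rows by a non-primary key column without scanning every row."""
    return set(index.get((field, value), set()))


def bool_and(*results):
    """Boolean AND: keep only the primary keys present in every sub-query result."""
    return set.intersection(*results) if results else set()


def aggregate(rows, pks, field):
    """Row count plus minimum and maximum of one field over the matched rows."""
    values = [rows[pk][field] for pk in pks]
    if not values:
        return {"count": 0, "min": None, "max": None}
    return {"count": len(values), "min": min(values), "max": max(values)}


INDEX = build_inverted_index(ROWS, ["city", "status"])
paid_in_hangzhou = bool_and(
    term_query(INDEX, "city", "hangzhou"),
    term_query(INDEX, "status", "paid"),
)
hangzhou_stats = aggregate(ROWS, term_query(INDEX, "city", "hangzhou"), "amount")
```

A real search index also relies on column stores for the statistical part; here a plain Python list plays that role.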

Timeline model

The Timeline model is designed to store message data. This model meets the requirements of messaging processes, such as message order preservation, storage of large numbers of messages, and real-time synchronization. This model also supports full-text search and Boolean queries. This model is suitable for storing message data that is generated by instant messaging (IM) applications and feed streams.

IoT storage

TimeSeries model

The TimeSeries model is designed based on the characteristics of time series data. This model is suitable for scenarios such as IoT device monitoring and can be used to store data collected by devices and the monitoring data of machines. The TimeSeries model supports automatic indexing of time series metadata and time series retrieval based on composite conditions. The TimeSeries model uses time series tables to store time series data. This allows applications to write and read petabytes of data concurrently and reduces the storage costs. You can execute SQL statements to query and analyze time series data.

SQL query

Data mapping

Before you use SQL, make sure that the field data types in SQL match the field data types in data tables.

Data type mappings

DDL

Tablestore supports Data Definition Language (DDL) operations including creating mapping tables for tables, creating mapping tables for search indexes, updating attribute columns of mapping tables, deleting mapping tables, and querying information about tables.

DDL statements

DQL

Tablestore supports Data Query Language (DQL) operations, including data queries, data aggregation, full-text search, queries on ARRAY fields in search indexes, nested queries, queries on virtual columns of search indexes, and join operations.

DQL statements

Database Administration

Tablestore supports database administration operations, including querying the index information about a table and listing table names.

Database Administration

Query optimization

You can use index selection policies and computing pushdown to accelerate SQL queries. You can perform index-based queries by using explicit access to a secondary index table. Tablestore provides the following methods to query data by using a search index: automatic selection of a search index and explicit access to a search index. Search indexes provide features such as conditional filtering, aggregation, and sorting. After you create a search index, the system makes full use of computing capabilities of the search index to push down some SQL computing tasks to the search index. This prevents full table scans and improves computing efficiency.

Optimized queries

Data management

Category

Feature

Description

References

Data management

Data version

Max versions specifies the maximum number of versions that can be retained for the data in an attribute column. If the number of versions of data in attribute columns exceeds the value of this parameter, the system automatically deletes the data of earlier versions in asynchronous mode. After you configure max versions for a data table, Tablestore generates a new version for a value each time you update a value in an attribute column. The version number is the timestamp in milliseconds.

Data versions and TTL
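
The retention rule for max versions can be sketched as follows. This is a simplified model, not SDK code: `put_version` is a hypothetical helper, versions are keyed by a millisecond timestamp as described above, and older versions are trimmed immediately, whereas Tablestore deletes them asynchronously.

```python
def put_version(versions, timestamp_ms, value, max_versions):
    """Record a new version and return only the newest `max_versions` entries.

    versions: dict that maps a millisecond timestamp (the version number)
    to the value written at that time.
    """
    updated = dict(versions)
    updated[timestamp_ms] = value
    newest_first = sorted(updated, reverse=True)
    kept = newest_first[:max_versions]
    return {ts: updated[ts] for ts in kept}


# Three writes to one attribute column with max versions set to 2.
column = {}
for ts, value in [(1000, "a"), (2000, "b"), (3000, "c")]:
    column = put_version(column, ts, value, max_versions=2)
```

After the third write, the version written at timestamp 1000 is no longer retained and only the two most recent versions remain.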

Lifecycle management

TTL specifies the validity period of data in a data table in seconds. When data of a specific version in an attribute column is retained for a period of time that exceeds the TTL value, Tablestore automatically deletes the data of the specific version in asynchronous mode from the attribute column. If data in all attribute columns of a row is retained for a period of time that exceeds the TTL value, Tablestore automatically deletes the row in asynchronous mode.

TTL
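
The TTL rule described above, in which expired versions are removed from a column and the whole row is removed once every column has expired, can be modeled in a few lines. The `expire` function and the row layout are assumptions for illustration only, and the sketch applies the rule at call time instead of asynchronously.

```python
def expire(row, now_ms, ttl_seconds):
    """Drop versions whose age exceeds the TTL; return None if nothing survives.

    row: dict that maps a column name to {timestamp_ms: value}.
    """
    cutoff_ms = now_ms - ttl_seconds * 1000
    surviving = {}
    for column, versions in row.items():
        # A version is kept while its age is at most the TTL.
        live = {ts: v for ts, v in versions.items() if ts >= cutoff_ms}
        if live:
            surviving[column] = live
    return surviving or None


row = {"temp": {1000: 20, 9000: 21}, "note": {500: "x"}}
after_short_wait = expire(row, now_ms=10_000, ttl_seconds=5)
after_long_wait = expire(row, now_ms=100_000, ttl_seconds=5)
```

In the first call only the version of `temp` written at 9000 ms is young enough to survive; in the second call every version has expired, so the row itself is gone.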

Security compliance

Category

Feature

Description

References

Access control

Identity management

To ensure the security of your Alibaba Cloud account and cloud resources, we recommend that you do not use your Alibaba Cloud account to access Tablestore unless necessary. We recommend that you use a RAM user or RAM role to access Tablestore.

-

STS management

Compared with the long-term permission management mechanism provided by RAM, Security Token Service (STS) provides temporary access authorization by using a temporary AccessKey pair and token to allow temporary access to Tablestore. STS grants strict access permissions that remain valid within a limited period of time. This way, your system is not severely affected even if access credentials are leaked.

RAM and STS

RAM Policy

The RAM Policy feature provided by RAM allows you to manage your users such as employees, systems, and applications in a centralized manner and control their access to cloud resources.

RAM Policy

Control Policy

The Control Policy feature of the Resource Directory service of Resource Management allows you to manage the permission boundaries of the folders or members in a resource directory in a centralized manner. If you want to manage the permissions of enterprise members in a centralized manner, you can enable the Control Policy feature. After you enable the Control Policy feature, you can configure custom access control policies to define the permission boundaries of enterprise members in a resource directory.

Control Policy

Network ACL

The Network ACL feature provided by Tablestore allows you to restrict the types of networks from which users can access a Tablestore instance.

-

Instance Policy

The Instance Policy feature provided by Tablestore allows you to restrict the access sources of a Tablestore instance.

-

Data security

Server-side encryption

Tablestore provides the table encryption feature, which encrypts table data when the data is written to disk. This ensures the security of table data. When you create a data table, you can configure encryption for the data table.

Data encryption

Transmission encryption

Tablestore supports Transport Layer Security (TLS) encryption for data transmission between servers and clients. Tablestore allows you to use custom RAM policies and control policies to restrict the TLS versions that users can use to access Tablestore. Encryption algorithms of higher TLS versions are more secure. We recommend that you use TLS 1.2 or later.

Security compliance
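
The recommendation to use TLS 1.2 or later can also be enforced on the client side. The sketch below uses only the Python standard library `ssl` module and is independent of any Tablestore SDK; how a given SDK accepts a custom TLS configuration is not covered here, and the server-side restriction through RAM policies and control policies is a separate mechanism.

```python
import ssl


def make_tls_context():
    """Build a client context that refuses protocol versions older than TLS 1.2."""
    context = ssl.create_default_context()  # certificate and hostname checks stay on
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


CONTEXT = make_tls_context()
```

A context built this way can be passed to standard library clients such as `http.client.HTTPSConnection(host, context=CONTEXT)`.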

Audit log

Tablestore integrates with Simple Log Service to provide log query, log analysis, log clustering, and statistical charts. You can log modifications to Tablestore instances, such as the creation of data tables, time series tables, and indexes. You can use the audit log feature in scenarios such as security audits, compliance audits, and troubleshooting.

Audit log

Disaster recovery and backup

Category

Feature

Description

References

Backup

Backup and restoration

To prevent operations such as accidental deletion and malicious tampering from affecting the availability of important data, you can use the data backup feature to back up table data in an instance.

Backup and restoration

High availability

ZRS

Tablestore provides the zone-redundant storage (ZRS) feature for disaster recovery of instance data in data centers. You can create instances for which the ZRS feature is enabled. This way, you can ensure strong consistency even if a data center becomes unavailable due to network interruptions, power outages, or disaster events. The ZRS feature ensures high data availability and disaster recovery.

ZRS

Product ecosystem

Category

Feature

Description

References

Data visualization

DataV

You can use DataV to display data in the data tables or secondary indexes of Tablestore. In most cases, DataV is used to build enterprise application systems for complex big data processing and analytics.

Connect Tablestore to DataV

Grafana

You can use Grafana to display data in the data tables or time series tables of Tablestore.

Connect Tablestore to Grafana

Ecosystem integration

MaxCompute

You can establish a seamless connection between Tablestore and MaxCompute within the same Alibaba Cloud account. If data stored in Tablestore has a custom structure and requires customized processing, such as parsing a specific JSON string, you can use a user-defined function (UDF) to implement the processing logic.

MaxCompute

Spark

You can use Spark to perform complex computing and analysis on Tablestore data that is accessed by using E-MapReduce (EMR) SQL or DataFrame.

Spark/SparkSQL

Hive or Hadoop MapReduce

You can use Hive or Hadoop MapReduce to access a Tablestore table.

Hive/HadoopMR

Function Compute

You can use Function Compute to perform real-time computing on incremental data in Tablestore.

Function Compute

Flink

You can use Realtime Compute for Apache Flink to access source tables, dimension tables, or result tables in Tablestore to compute or analyze the data of your big data applications. Data tables can be used as source tables, dimension tables, or result tables. Time series tables can be used only as result tables.

Flink

Presto

After you connect PrestoDB to Tablestore, you can execute SQL statements in PrestoDB to query and analyze data in Tablestore, write data to Tablestore, and import data to Tablestore.

Presto

Data migration and synchronization

Synchronize Kafka data to Tablestore

You can use Tablestore Sink Connector to batch import data in Apache Kafka to a data table or time series table in Tablestore.

Synchronize Kafka data to Tablestore

Synchronize data from Tablestore to MaxCompute

You can use the data integration feature of DataWorks to synchronize incremental and full data from Tablestore to MaxCompute.

Synchronize data from Tablestore to MaxCompute

Synchronize data from Tablestore to OSS

You can use the data integration feature of DataWorks to synchronize incremental and full data from Tablestore to Object Storage Service (OSS).

Synchronize data from Tablestore to OSS

Download data in Tablestore to a local file

Tablestore allows you to download data to a local file by using the Tablestore CLI or DataX. You can also use DataWorks to synchronize data from Tablestore to OSS, and then download the data from OSS to a local file.

Download data in Tablestore to a local file

Data delivery

Data delivery

Tablestore uses data delivery to deliver full or incremental data to OSS in real time. This feature is suitable for building data lakes as it enables Tablestore to store historical data in OSS at lower costs while Tablestore implements offline or quasi-real-time analysis of larger amounts of data.

Overview

Data lake-based computing and analysis

After you synchronize data from Tablestore to OSS, you can use EMR JindoFS in cache mode to connect to OSS.

Use EMR

Tunnel Service

Tunnel Service

Tunnel Service is a centralized service that uses the Tablestore API to allow you to consume full and incremental data. Tunnel Service provides tunnels that are used to export and consume full, incremental, and differential data. After you create a tunnel, you can use the tunnel to consume full and incremental data that is exported from a specific table.

Tunnel Service

Basic capabilities

Category

Feature

Description

References

Terms

region

A region is the physical location of an Alibaba Cloud data center. Tablestore is deployed across multiple Alibaba Cloud regions. You can select a region based on your business requirements.

Region

instance

An instance is a logical entity used in Tablestore to manage tables. Each instance is equivalent to a database. Tablestore implements application access control and resource measurement at the instance level.

Instance

endpoint

Each Tablestore instance has an endpoint. You must specify an endpoint before you can perform operations on data and tables in Tablestore.

Service address

Usage methods

console

You can use the Tablestore console to perform operations on instances, tables, and data in the Wide Column and TimeSeries models, and to run SQL queries.

-

SDK

Tablestore provides SDKs for mainstream programming languages such as Java, Go, Python, Node.js, .NET, and PHP.

-

Tablestore CLI

The Tablestore CLI provides simple and clear commands that you can run in Windows, Linux, and macOS. The Tablestore CLI allows you to perform operations on instances, data tables in the Wide Column model, data, secondary indexes, search indexes, time series tables, and Tunnel Service, and query data by executing SQL statements.

-

SQL

You can use the SQL query feature to perform complex queries and analytics on data in Tablestore in an efficient manner. The SQL query feature provides a unified access interface for multiple data engines.

-