Data storage
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Metadata storage | Wide Column model | The Wide Column model is similar to the data model of Bigtable or HBase and is suitable for scenarios such as metadata and big data storage. Data is stored in data tables. A single data table supports petabyte-level storage and tens of millions of queries per second (QPS). Data tables are schema-free and support wide columns, multiple data versions, and time-to-live (TTL) management, as well as auto-increment primary key columns, local transactions, atomic counters, filters, and conditional updates. | |
| | Search index | Search indexes provide multi-dimensional queries and statistical analysis for big data scenarios based on inverted indexes and column stores. If your business requires multi-dimensional queries and data analysis, create a search index and specify the required attributes as index fields. You can then use the search index to query data by non-primary key columns, run Boolean and fuzzy queries, obtain maximum and minimum values, count rows, and group query results. | |
| | Timeline model | The Timeline model is designed to store message data. It meets the requirements of messaging workloads, such as message ordering, storage of large numbers of messages, and real-time synchronization, and it supports full-text search and Boolean queries. The model is suitable for message data generated by instant messaging (IM) applications and feed streams. | |
| IoT storage | TimeSeries model | The TimeSeries model is designed around the characteristics of time series data. It is suitable for scenarios such as IoT device monitoring and can store data collected by devices and machine monitoring data. The model automatically indexes time series metadata and supports time series retrieval based on composite conditions. Time series data is stored in time series tables, which allow applications to write and read petabytes of data concurrently while reducing storage costs. You can execute SQL statements to query and analyze time series data. | |
| SQL query | Data mapping | Before you use SQL, make sure that the field data types in SQL match the field data types in data tables. | Data type mappings |
| | DDL | Tablestore supports Data Definition Language (DDL) operations, including creating mapping tables for data tables, creating mapping tables for search indexes, updating the attribute columns of mapping tables, deleting mapping tables, and querying table information. A combined sketch of DDL, DQL, and administration statements follows this table. | DDL statements |
| | DQL | Tablestore supports Data Query Language (DQL) operations, including data queries, aggregation, full-text search, ARRAY fields in search indexes, nested queries, virtual columns of search indexes, and join operations. | DQL statements |
| | Database Administration | Tablestore supports database administration operations, including querying the index information of a table and listing table names. | Database Administration |
| | Query optimization | You can accelerate SQL queries by using index selection policies and computing pushdown. You can run index-based queries by explicitly accessing a secondary index table. For search indexes, Tablestore supports both automatic index selection and explicit access. Search indexes provide conditional filtering, aggregation, and sorting. After you create a search index, the system pushes some SQL computing tasks down to the search index to make full use of its computing capabilities, which prevents full table scans and improves computing efficiency. | Optimized queries |
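
The DDL, DQL, and database administration rows above can be made concrete with a short example. The following is a minimal sketch only: the `orders` mapping table, its columns, and the endpoint are hypothetical, and `exe_sql_query` is assumed to be the SQL entry point of the Tablestore Python SDK (the exact method name and supported SQL syntax depend on the SDK and SQL engine version).

```python
# Illustrative only: SQL statements for the DDL, DQL, and administration rows above.
# The `orders` table, its columns, and the endpoint are hypothetical, and
# `exe_sql_query` is an assumed method name; check your SDK's SQL reference.
from tablestore import OTSClient

client = OTSClient(
    'https://my-instance.cn-hangzhou.ots.aliyuncs.com',  # instance endpoint
    '<access_key_id>', '<access_key_secret>', 'my-instance')

statements = [
    # DDL: create a mapping table for an existing data table so that it can be queried with SQL.
    "CREATE TABLE orders (order_id VARCHAR(1024), price DOUBLE, status MEDIUMTEXT, PRIMARY KEY (order_id))",
    # DQL: a multi-dimensional query with filtering and aggregation.
    "SELECT status, COUNT(*) AS cnt, MAX(price) AS max_price FROM orders WHERE price > 100 GROUP BY status",
    # Administration: list tables and inspect index information.
    "SHOW TABLES",
    "SHOW INDEX IN orders",
]

for sql in statements:
    result = client.exe_sql_query(sql)  # assumed SQL entry point
    print(sql, '=>', result)
```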
Data management
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Data management | Data version | Max versions specifies the maximum number of versions that can be retained for the data in an attribute column. Each time you update a value in an attribute column, Tablestore generates a new version for the value; the version number is a timestamp in milliseconds. If the number of versions in an attribute column exceeds the max versions value, the system automatically deletes the earliest versions in asynchronous mode. See the sketch after this table. | Data versions and TTL |
| | Lifecycle management | TTL specifies the validity period of data in a data table, in seconds. If data of a specific version in an attribute column is retained longer than the TTL value, Tablestore automatically deletes the data of that version from the attribute column in asynchronous mode. If the data in all attribute columns of a row exceeds the TTL value, Tablestore automatically deletes the row in asynchronous mode. | TTL |
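
Max versions and TTL are configured as table options when a data table is created or updated. The following is a minimal sketch that uses the Tablestore Python SDK, assuming a hypothetical `device_events` table; the `TableOptions` parameter names are assumptions and may differ between SDK versions.

```python
# A minimal sketch: create a data table that keeps at most 3 versions per attribute
# column and expires data after 30 days. Names and parameter spellings are assumptions.
from tablestore import (OTSClient, TableMeta, TableOptions,
                        ReservedThroughput, CapacityUnit)

client = OTSClient('https://my-instance.cn-hangzhou.ots.aliyuncs.com',
                   '<access_key_id>', '<access_key_secret>', 'my-instance')

# Primary key: device_id (string) + event_ts (integer).
table_meta = TableMeta('device_events', [('device_id', 'STRING'), ('event_ts', 'INTEGER')])

# time_to_live is in seconds; max_version is the number of versions retained per column.
table_options = TableOptions(time_to_live=30 * 86400, max_version=3)

client.create_table(table_meta, table_options, ReservedThroughput(CapacityUnit(0, 0)))
```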
Security compliance
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Access control | Identity management | To ensure the security of your Alibaba Cloud account and cloud resources, we recommend that you do not use your Alibaba Cloud account to access Tablestore unless necessary. Instead, use a RAM user or RAM role to access Tablestore. | - |
| | STS management | Compared with the long-term permissions managed by RAM, Security Token Service (STS) provides temporary access authorization: a temporary AccessKey pair and token allow short-lived access to Tablestore. STS grants strictly scoped permissions that are valid only for a limited period of time, so your system is not severely affected even if the credentials are leaked. See the sketch after this table. | RAM and STS |
| | RAM Policy | The RAM Policy feature provided by RAM allows you to centrally manage your users, such as employees, systems, and applications, and control their access to cloud resources. | RAM Policy |
| | Control Policy | The Control Policy feature of the Resource Directory service of Resource Management allows you to centrally manage the permission boundaries of the folders or members in a resource directory. To centrally manage the permissions of enterprise members, enable the Control Policy feature and then configure custom access control policies that define the permission boundaries of enterprise members in a resource directory. | Control Policy |
| | Network ACL | The Network ACL feature provided by Tablestore allows you to restrict the types of networks from which users can access a Tablestore instance. | - |
| | Instance Policy | The Instance Policy feature provided by Tablestore allows you to restrict the access sources of a Tablestore instance. | - |
| Data security | Server-side encryption | Tablestore provides the table encryption feature to encrypt a table when it is written to disk, which ensures the security of table data. You can configure encryption for a data table when you create it. | Data encryption |
| | Transmission encryption | Tablestore supports Transport Layer Security (TLS) encryption for data transmission between clients and servers. You can use custom RAM policies and control policies to restrict the TLS versions that users can use to access Tablestore. Higher TLS versions use more secure encryption algorithms. We recommend that you use TLS 1.2 or later. | |
| Security compliance | Audit log | Tablestore integrates with Simple Log Service to provide log query, analysis, clustering, and statistical charts. You can log modifications to Tablestore instances, such as the creation of data tables, time series tables, and indexes. The audit log feature is suitable for scenarios such as security audits, compliance audits, and troubleshooting. | Audit log |
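
The STS row above can be illustrated with a short sketch. The example assumes that temporary credentials have already been obtained elsewhere, for example by calling AssumeRole through the STS API; the `sts_token` keyword argument of the Tablestore Python SDK client is an assumption and may differ between SDK versions.

```python
# A minimal sketch: access Tablestore with temporary STS credentials instead of a
# long-term AccessKey pair. The credentials below are placeholders, and the
# `sts_token` keyword argument is an assumption about the Python SDK.
from tablestore import OTSClient

# Temporary credentials returned by STS (for example, from an AssumeRole call).
sts_access_key_id = '<sts_access_key_id>'
sts_access_key_secret = '<sts_access_key_secret>'
sts_security_token = '<sts_security_token>'

client = OTSClient(
    'https://my-instance.cn-hangzhou.ots.aliyuncs.com',
    sts_access_key_id,
    sts_access_key_secret,
    'my-instance',
    sts_token=sts_security_token)

# The client behaves like a normal client until the temporary credentials expire.
print(client.list_table())
```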
Disaster recovery and backup
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Backup | Backup and restoration | To prevent operations such as accidental deletion and malicious tampering from affecting the availability of important data, you can use the data backup feature to back up table data in an instance. | Backup and restoration |
| High availability | ZRS | Tablestore provides the zone-redundant storage (ZRS) feature for disaster recovery of instance data in data centers. You can create instances for which the ZRS feature is enabled. This way, you can ensure strong consistency even if a data center becomes unavailable due to network interruptions, power outages, or disaster events. The ZRS feature ensures high data availability and disaster recovery. | ZRS |
Product ecosystem
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Data visualization | DataV | You can use DataV to display data in the data tables or secondary indexes of Tablestore. DataV is typically used in enterprise application systems that involve complex big data processing and analytics. | Connect Tablestore to DataV |
| | Grafana | You can use Grafana to display data in the data tables or time series tables of Tablestore. | Connect Tablestore to Grafana |
| Ecosystem integration | MaxCompute | You can establish a seamless connection between Tablestore and MaxCompute within the same Alibaba Cloud account. If the data stored in Tablestore has a unique structure and requires custom processing, such as parsing a specific JSON string, you can use a user-defined function (UDF) to implement the processing logic. | MaxCompute |
| | Spark | You can use Spark to perform complex computing and analysis on Tablestore data that is accessed by using E-MapReduce (EMR) SQL or DataFrame. | Spark/SparkSQL |
| | Hive or Hadoop MapReduce | You can use Hive or Hadoop MapReduce to access a Tablestore table. | Hive/HadoopMR |
| | Function Compute | You can use Function Compute to perform real-time computing on incremental data in Tablestore. | Function Compute |
| | Flink | You can use Realtime Compute for Apache Flink to access source tables, dimension tables, or result tables in Tablestore to compute or analyze data for your big data applications. Data tables can be used as source tables, dimension tables, or result tables. Time series tables can be used only as result tables. | Flink |
| | Presto | After you connect PrestoDB to Tablestore, you can execute SQL statements in PrestoDB to query and analyze data in Tablestore, write data to Tablestore, and import data to Tablestore. | Presto |
| Data migration and synchronization | Synchronize Kafka data to Tablestore | You can use Tablestore Sink Connector to batch import data from Apache Kafka to a data table or time series table in Tablestore. | Synchronize Kafka data to Tablestore |
| | Synchronize data from Tablestore to MaxCompute | You can use the data integration feature of DataWorks to synchronize incremental and full data from Tablestore to MaxCompute. | Synchronize data from Tablestore to MaxCompute |
| | Synchronize data from Tablestore to OSS | You can use the data integration feature of DataWorks to synchronize incremental and full data from Tablestore to Object Storage Service (OSS). | Synchronize data from Tablestore to OSS |
| | Download data in Tablestore to a local file | Tablestore allows you to download data to a local file by using the Tablestore CLI or DataX. You can also use DataWorks to synchronize data from Tablestore to OSS and then download the data from OSS to a local file. | Download data in Tablestore to a local file |
| Data delivery | Data delivery | Tablestore uses data delivery to deliver full or incremental data to OSS in real time. This feature is suitable for building data lakes: historical data is stored in OSS at lower costs, while Tablestore performs offline or quasi-real-time analysis of larger amounts of data. | Overview |
| | Data lake-based computing and analysis | After you synchronize data from Tablestore to OSS, you can use EMR JindoFS in cache mode to connect to OSS. | Use EMR |
| Tunnel Service | Tunnel Service | Tunnel Service is a centralized service that uses the Tablestore API to allow you to consume full and incremental data. Tunnel Service provides tunnels for exporting and consuming full, incremental, and differential data. After you create a tunnel, you can use it to consume the full and incremental data that is exported from a specific table. See the sketch after this table. | Tunnel Service |
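
The Tunnel Service row above can be summarized with a sketch of the consumer shape: you create a tunnel for a table and register a callback that receives batches of full and incremental records. All names in the sketch are illustrative; the actual tunnel client classes and registration calls depend on the SDK language and version, so refer to the Tunnel Service documentation for the concrete API.

```python
# Illustrative sketch of a Tunnel Service consumer callback. The function names
# below are hypothetical; the real SDK provides worker and processor classes that
# call into equivalents of these functions.

def process_records(records):
    """Called with a batch of records exported through the tunnel.

    Full (existing) data is delivered first, followed by incremental changes,
    so the same callback handles both phases.
    """
    for record in records:
        # A record typically carries the primary key, the changed attribute
        # columns, and the type of change (put, update, or delete).
        print(record)

def shutdown():
    """Called when the consumer stops, to release local resources."""
    print('tunnel consumer stopped')

# With the real SDK, process_records and shutdown are registered on a tunnel worker
# bound to a tunnel created for a specific table; the worker keeps pulling data and
# invoking the callback until it is shut down.
```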
Basic capabilities
| Category | Feature | Description | References |
| --- | --- | --- | --- |
| Terms | region | A region is the physical location of an Alibaba Cloud data center. Tablestore is deployed across multiple Alibaba Cloud regions. You can select a region based on your business requirements. | Region |
| | instance | An instance is a logical entity used in Tablestore to manage tables. Each instance is equivalent to a database. Tablestore implements application access control and resource measurement at the instance level. | Instance |
| | endpoint | Each Tablestore instance has an endpoint. You must specify an endpoint before you can perform operations on data and tables in Tablestore. See the sketch after this table. | Service address |
| Usage methods | console | You can use the Tablestore console to perform operations on Wide Column and TimeSeries instances, tables, and data, and to run SQL queries. | - |
| | SDK | Tablestore provides SDKs for mainstream programming languages such as Java, Go, Python, Node.js, .NET, and PHP. | - |
| | Tablestore CLI | The Tablestore CLI provides simple and clear commands that you can run on Windows, Linux, and macOS. You can use the Tablestore CLI to perform operations on instances, Wide Column data tables, data, secondary indexes, search indexes, time series tables, and Tunnel Service, and to query data by executing SQL statements. | - |
| | SQL | You can use the SQL query feature to efficiently perform complex queries and analytics on Tablestore data. SQL query provides a unified access interface for multiple data engines. | - |
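
The terms above map directly onto how a client is initialized: a client is bound to one instance through that instance's endpoint, and then operates on tables in the instance. The following is a minimal sketch that uses the Tablestore Python SDK; the endpoint, instance name, table, and column names are hypothetical, and exact parameter names may differ between SDK versions.

```python
# A minimal sketch: bind a client to an instance through its endpoint, then write
# and read one row in a data table. All names are placeholders.
from tablestore import OTSClient, Row

client = OTSClient(
    'https://my-instance.cn-hangzhou.ots.aliyuncs.com',  # endpoint of the instance
    '<access_key_id>', '<access_key_secret>',
    'my-instance')                                        # instance name

# Write one row: the primary key identifies the row, attribute columns hold the data.
row = Row([('device_id', 'sensor-001'), ('event_ts', 1700000000000)],
          [('temperature', 21.5), ('status', 'ok')])
client.put_row('device_events', row)

# Read the row back by primary key.
_, return_row, _ = client.get_row(
    'device_events',
    [('device_id', 'sensor-001'), ('event_ts', 1700000000000)],
    max_version=1)
print(return_row.primary_key, return_row.attribute_columns)
```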