ApsaraDB for HBase is a highly optimized NoSQL database (enterprise edition is available as Lindorm) that is compatible with the community edition of HBase. Thanks to this compatibility and integration with Spark, Phoenix, and Solr, ApsaraDB for HBase is easy to use and offers superior stability and cost performance. ApsaraDB for HBase can easily support high throughput and high concurrency scenarios, providing support for real-time volumetric data storage, full-text indexing, lightweight SQL queries, as well as time series and time space queries.
Compared with its open-source counterpart, ApsaraDB for HBase is better optimized, all the way down to the kernel, with superior read/write performance, disaster recovery capabilities, storage efficiency, and response latency. In numbers, read/write performance is improved by 3 to 7 times, RPO is reduced to less than 1 minute, 99th percentile latency is reduced by 90%, MTTR is reduced by 90%, and the compression ratio is increased 13-fold. ApsaraDB for HBase guarantees a service uptime of 99.9%. ApsaraDB for HBase is suitable for high-demand industry applications like risk control, recommendation, advertising, IoT, VoT, feed streaming, and data visualization scenarios. Internally at Alibaba, ApsaraDB for HBase has already provided support for several of Alibaba Group’s core businesses, including Taobao, Alipay, and Cainiao.
Benefits
-
High Storage Reliability
Built on a distributed cluster architecture with six data backups and at least three replicas to guarantee data reliability of 99.99999999%.
-
High Availability
Real-time availability monitoring and single-point failover are supported to guarantee the continuity of your workloads. Service process monitoring and automatic recovery can help you recover processes within a few seconds.
-
High-Efficiency Operations and Maintenance
ApsaraDB for HBase is equipped with a unified platform for visualized database management, monitoring, and alerting. Its kernel is upgraded automatically for high-efficiency operations and maintenance, and the corresponding web console, API, and multi-language SDKs are designed for easy cluster management.
-
Significantly Improved Performance
ApsaraDB for HBase is developed as an improved version of the community edition of HBase. Alibaba Cloud has significantly optimized the kernel, increasing read and write performance by 3 to 7 times, reducing 99th percentile latency by 90%, and reducing MTTR by 90%.
Features
Optimized Kernel and Architecture
High availability architecture with unlimited cluster scaling and a deeply optimized kernel.
High-Availability Architecture
ApsaraDB for HBase adopts a high availability architecture where masters run as backups for each other. High reliability is guaranteed by real-time availability detection. Regions can be switched within a few seconds when a core node fails.
Reduced Storage Costs
High-compression ratios, cold storage, as well as hot and cold data separation are supported to reduce the storage costs by over 50%.
Cluster Scalability
Each core node can respond to up to 100 thousand queries per second and provide up to 8 TB of storage space. Disk and cluster sizes can be expanded as needed. A cluster can be scaled up to 1,000 nodes to handle 10 million queries per second and store petabytes of data.
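As a rough illustration of the per-node figures above, the following sketch estimates how many core nodes a workload might need. It treats the quoted upper bounds (100 thousand queries per second and 8 TB per core node) as simple ceilings. The function name and the linear-scaling assumption are hypothetical and are not part of the product; real sizing depends on the workload and instance type.

```python
import math

# Upper bounds quoted in the text above, used here as simple per-node ceilings.
MAX_QPS_PER_NODE = 100_000
MAX_STORAGE_TB_PER_NODE = 8


def estimate_core_nodes(target_qps: int, data_tb: float) -> int:
    """Return a naive core node count, assuming linear scaling (an assumption)."""
    nodes_for_qps = math.ceil(target_qps / MAX_QPS_PER_NODE)
    nodes_for_storage = math.ceil(data_tb / MAX_STORAGE_TB_PER_NODE)
    # The larger of the two requirements wins; always keep at least one node.
    return max(nodes_for_qps, nodes_for_storage, 1)


# Example: 450 thousand queries per second and 20 TB of data.
print(estimate_core_nodes(target_qps=450_000, data_tb=20))
```

In this example the query load, not the data volume, determines the node count.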
Low Read/Write Latency
SSDs are used to support high-speed reads and writes. An individual data entry smaller than 0.2 KB can be read or written with a 99.9th percentile latency of 3 milliseconds and an average latency of 1 millisecond.
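To make the difference between average latency and 99.9th percentile latency concrete, here is a minimal sketch that computes both from a list of latency samples. The samples are made up for illustration, and the nearest-rank definition used here is only one of several common percentile definitions. It is not a description of how ApsaraDB for HBase measures latency.

```python
def percentile_permille(samples, permille: int) -> float:
    """Nearest-rank percentile; permille=999 means the 99.9th percentile."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of permille * n / 1000 avoids floating point rounding.
    rank = max(1, (permille * len(ordered) + 999) // 1000)
    return ordered[rank - 1]


# Made-up samples in milliseconds: mostly fast requests plus two slow ones.
latencies_ms = [1.0] * 998 + [3.0, 40.0]

average_ms = sum(latencies_ms) / len(latencies_ms)
p999_ms = percentile_permille(latencies_ms, 999)
print(average_ms, p999_ms)
```

A single slow request barely moves the average, while the tail percentile shows what the slowest requests experience, which is why both figures are quoted.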
Data Backup
Cluster data backup and restore, and real-time incremental data backup are supported. The Recovery Point Objective (RPO) is reduced to less than 1 minute.
Dual-cluster Disaster Recovery
Primary and secondary clusters are used for automatic failover. Data between two clusters is synchronized in real time.
Support for Various Scenarios
SQL, time series, time space, and data retrieval are supported.
Support for SQL
The Phoenix SQL component supports secondary indexes and standard SQL syntax.
Support for Solr
The built-in Solr component supports full-text indexes for complex searches. This component is provided for synchronizing data from ApsaraDB for HBase to Solr.
Native Secondary Indexes
The native secondary indexes can be read and written six times faster than Phoenix indexes. No external component needs to be installed.
Support for Time Series
The OpenTSDB component supports time series data.
Support for Time Space
The GeoMesa component supports time space data.
Real-Time Medium-Sized Object (MOB) Storage
Real-time storage of and access to objects smaller than 10 MB are supported.
Fully-Managed HBase Analytics Engine
The Analytics Engine is designed to meet user needs in data streaming and data analytics.
Support for Data Streaming
Support for ingesting data from Kafka, Log Service, and Message Queue allows for alerting and Extract, Transform, Load (ETL) processing.
Support for Various Data Sources
Support for various data sources, including HBase, Object Storage Service (OSS), RDS, and MongoDB allows for complex data analytics.
High Performance
Support for operator pushdown and column pruning.
Efficient Operations and Maintenance
A visualized and easy-to-use O&M platform is provided that automatically upgrades the system to the latest version.
Cluster Monitoring on Cloud
Cluster information is monitored in real time, so that you can obtain up-to-date cluster information. Monitored metrics include CPU utilization, IOPS, connections, and disk space. Alerts are sent when anomalies are detected.
Visualized Management Platform
A visualized management platform is provided so that you can easily scale out clusters, modify configurations, and restart clusters.
Database Kernel Version Management
Automatic upgrades are supported to fix vulnerabilities at the earliest time, eliminating the need to manage kernel versions manually. HBase settings are optimized to maximize the utilization of system resources.
Scenarios
Financial Risk Control
Both Structured and Unstructured Data Are Supported
ApsaraDB for HBase can help you aggregate and analyze transaction data, enterprise data, and data crawled from the Internet. You can then build big data-based risk control services, such as anti-fraud and user profiling systems.
Benefits
-
Low Storage Costs
ApsaraDB for HBase uses sparse storage schemes to support petabytes of structured and unstructured data. You can search transaction records for detailed information.
-
Hybrid Transaction/Analytical Processing (HTAP) for Volumetric Data
With ApsaraDB for HBase, you can use Phoenix and secondary indexes for real-time Online Transaction Processing (OLTP), and use Spark for Online Analytical Processing (OLAP).
-
High Concurrency for Fast Writes
Unlike databases built on traditional B+ trees, ApsaraDB for HBase uses Log-Structured Merge (LSM)-based storage, which is designed for fast writes under high concurrency.
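The LSM write path mentioned above can be sketched with a toy store. This is an illustration of the general technique only, not the ApsaraDB for HBase implementation: the class and its memtable limit are hypothetical, and the write-ahead log and compaction that a real LSM store needs are omitted. In HBase terms, the in-memory table corresponds roughly to the MemStore and the sorted runs to HFiles.

```python
import bisect


class TinyLSM:
    """Toy LSM store: writes go to memory, then flush as sorted immutable runs."""

    def __init__(self, memtable_limit: int = 3):
        self.memtable = {}   # most recent writes, held in memory
        self.runs = []       # immutable sorted runs, newest last
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        # A write only touches memory; nothing is updated in place on disk.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        # Persist the memtable as one sorted run (a sequential write).
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key: str):
        if key in self.memtable:
            return self.memtable[key]
        # Search the newest run first so later writes shadow earlier ones.
        for run in reversed(self.runs):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


db = TinyLSM()
for k, v in [("a", "1"), ("b", "2"), ("c", "3"), ("a", "9")]:
    db.put(k, v)
print(db.get("a"), db.get("b"), db.get("z"))
```

Because every write is an in-memory insert followed by occasional sequential flushes, writers do not contend on in-place page updates the way they can in a B+ tree. The trade-off is that a read may have to check several runs.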
Volumetric Data Storage and Analytics
Support for Volumetric Data
ApsaraDB for HBase supports fast ingests of volumetric data and real-time synchronization of incremental data. It allows you to use Spark to analyze volumetric data offline.
Benefits
-
Low Storage Costs
Automatic hot and cold data separation is supported. Cold data is compressed by using Zstandard and stored on Object Storage Service (OSS), reducing storage costs by more than 60%.
-
Data Streaming
Spark Streaming for real-time workloads is supported.
-
Fast Loads
You can use Bulk Load to quickly load hundreds of terabytes of data to ApsaraDB for HBase.
Recommender Engine
Optimized User Profile Storage Reduces Real-time Recommendation Latency to Milliseconds
User behavior data is stored in real time. With user profiles created based on user data, your real-time recommendation system can respond to queries in as little as a few milliseconds.
Benefits
-
Low Storage Costs
Uses sparse storage schemes, which are suitable for storing highly compressed user profile data.
-
High Scalability
Each node can respond to more than 100 thousand queries per second. Nodes can be scaled out as needed. No database or table sharding is needed.
-
Low Latency
The kernel is deeply optimized. If your workloads are sensitive to response latency, you can opt for SSDs to reduce the average read/write latency to less than 2 milliseconds. The 99.9th percentile latency can be reduced to less than 80 milliseconds.
IoT Time Series and Space
High-Performance Distributed TSDB
A built-in OpenTSDB component is provided to process time series data with high efficiency and at low cost, making ApsaraDB for HBase suitable for IoT, monitoring, and K-line chart scenarios.
Benefits
-
Native OpenTSDB for Processing Floating Point Data at Low Costs
The native OpenTSDB is based on the HBase distributed architecture and is compatible with open-source database engines, allowing you to easily scale time series databases.
-
Support for Petabytes of Time Space Data and High Concurrency for Fast Writes
Storage and computing are decoupled, and high concurrency is supported for fast writes. Each node can respond to hundreds of thousands of queries per second.
-
Time-Space Indexes and Algorithms
The analytics engine is based on the Z-order and Hilbert space-filling curves, and supports 2D and 3D time-space indexes.
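To show the idea behind a Z-order index, the sketch below interleaves the bits of two grid coordinates into a single sortable key, so that points close together in 2D tend to be close together in key order. This is a generic illustration of the Z-order (Morton) curve only. The function names and the 16-bit grid resolution are assumptions, the Hilbert curve is not shown, and this does not describe the engine's actual index layout.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Z-order (Morton) code of two non-negative integers."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return code


def z_order_key(lon: float, lat: float, bits: int = 16) -> int:
    """Quantize a longitude/latitude pair to a grid cell and encode the cell."""
    scale = (1 << bits) - 1
    grid_x = int((lon + 180.0) / 360.0 * scale)
    grid_y = int((lat + 90.0) / 180.0 * scale)
    return interleave_bits(grid_x, grid_y, bits)


# Two nearby points usually share a long key prefix; a distant one does not.
for lon, lat in [(120.15, 30.28), (120.16, 30.29), (-73.98, 40.75)]:
    print(f"{z_order_key(lon, lat):032b}")
```

A range scan over a contiguous key interval then corresponds to a region of space, which is what makes such keys a natural fit for a sorted key-value store. A third dimension such as time can be interleaved in the same way.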
Social Feeds
Stores Volumetric Social Feeds
You can use ApsaraDB for HBase to store volumetric social feeds such as posts, articles, chat records, and comments.
Benefits
-
High Concurrency for Fast Writes
Unlike databases built on traditional B+ trees, ApsaraDB for HBase uses Log-Structured Merge (LSM)-based storage, which is designed for fast writes under high concurrency.
-
Cost Effectiveness
Hot and cold data separation is supported. Cold data is automatically stored on OSS, reducing storage costs by over 60%.
-
Low Read/Write Latency
The kernel is deeply optimized. If your workloads are sensitive to response latency, you can use SSDs to reduce the average read/write latency to less than 2 milliseconds. The 99.9th percentile (P999) latency can be reduced to less than 80 milliseconds.
Fast MOB Storage
Extremely Fast MOB Storage and Search
ApsaraDB for HBase uses the medium-sized object (MOB) technology. It supports objects sized from 1 KB to 10 MB, such as charts, short video files, and documents. For any request that relates to multiple queries, system latency can be as low as a few milliseconds.
Benefits
-
High Performance
The MOB technology can help you reduce system latency from 10 seconds to 20 milliseconds, a 500-fold improvement.
-
Powerful Search Capabilities
Powerful search capabilities are built in and come with filters and Phoenix indexes.
-
High Compression Ratio and Low Costs
Multiple compression algorithms allow for compression ratios from 3:1 to 10:1.
Volumetric Data Fuzzy Match
Fuzzy Match and Exact Match for Searching Large Amounts of Records
The built-in Solr component supports automatic full-text and index synchronization. Both exact match and fuzzy match are supported.
Game Log Data Processing
Log Data Collection, Storage, and Analytics
You can use ApsaraDB for HBase to collect, store, and search user behavior log data on game servers, and to verify raw data, check player account top-up records, and analyze data in real time. Through offline log data processing and analytics, you can also calculate the customer retention rate, customer lifetime value (LTV), average revenue per user (ARPU), and total top-up amount.
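The offline metrics named above can be sketched as follows. The record layout (player ID, day number, top-up amount), the sample data, and the choice of day 1 to day 2 retention are all assumptions made for illustration. In practice such a job would typically read from ApsaraDB for HBase through Spark rather than from an in-memory list.

```python
from collections import defaultdict

# Hypothetical log records: (player_id, day, top_up_amount).
events = [
    ("p1", 1, 5.0), ("p2", 1, 0.0), ("p3", 1, 0.0),
    ("p1", 2, 10.0), ("p3", 2, 0.0),
]


def game_metrics(records):
    """Compute total top-up, ARPU, and day 1 to day 2 retention."""
    players_by_day = defaultdict(set)
    total_top_up = 0.0
    for player, day, amount in records:
        players_by_day[day].add(player)
        total_top_up += amount

    all_players = set().union(*players_by_day.values())
    day1 = players_by_day[1]
    retained = day1 & players_by_day[2]   # active on day 1 and again on day 2
    return {
        "total_top_up": total_top_up,
        "arpu": total_top_up / len(all_players) if all_players else 0.0,
        "day2_retention": len(retained) / len(day1) if day1 else 0.0,
    }


metrics = game_metrics(events)
print(metrics)
```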
Benefits
-
Online and Offline Data Processing
ApsaraDB for HBase supports high-throughput and low-latency online data writes and reads. You can also use Spark to process and analyze ApsaraDB for HBase data offline.
-
Structured and Unstructured Data Schemas
Both structured and unstructured data schemas are supported, so that you can dynamically add columns to meet the requirements of game operations and maintenance.
-
Cost Effectiveness
ApsaraDB for HBase supports hot and cold data separation, which allows you to store cold data at low costs.
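The ability to add columns dynamically follows from the sparse row model, which the toy table below illustrates. Each row stores only the cells that were actually written, so a new column can appear on one row without any change to the others. The class, row keys, and column names are hypothetical; only the family:qualifier naming style is borrowed from HBase.

```python
from collections import defaultdict


class SparseTable:
    """Toy sparse table: each row holds only the columns written to it."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def put(self, row_key: str, column: str, value) -> None:
        # No schema change is needed; a new column is just a new cell.
        self.rows[row_key][column] = value

    def get(self, row_key: str, column: str, default=None):
        return self.rows.get(row_key, {}).get(column, default)

    def columns(self):
        # All columns seen so far across all rows.
        return sorted({c for cells in self.rows.values() for c in cells})


table = SparseTable()
table.put("player#1", "info:level", 12)
table.put("player#2", "pay:last_top_up", 4.99)   # column added later
print(table.columns())
print(table.get("player#1", "pay:last_top_up"))  # absent cell, costs no space
```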
Video Data Storage
Real-Time Storage and Analytics for Course, Lecture, and Surveillance Videos
ApsaraDB for HBase supports high-throughput reads and writes of video data with low latency and costs. It also supports converged storage of video indexes, characteristics, and video source data.
Real-time Data Visualization
Data Visualization in Real Time
ApsaraDB for HBase provides an all-in-one solution for dynamic data analytics that is cost effective and offers millisecond latency. This solution also includes Spark real-time computing. It was employed on several of Alibaba’s e-commerce platforms during the Double 11 shopping festival, the world’s largest online shopping event, for the real-time aggregation of page views, unique visitors, transaction counts, and commodity ranking information on large screens.