AHBench is a benchmark toolkit developed by the Lindorm team. You can use AHBench to benchmark Lindorm clusters or ApsaraDB for HBase clusters with a few clicks.
Introduction
The AHBench toolkit bundles the Yahoo! Cloud Serving Benchmark (YCSB) software, which provides test sets, test process control, and result aggregation. AHBench requires only minimal configuration before you run a benchmark test.
Preparations
- Deploy the stress testing client on an Elastic Compute Service (ECS) instance. To ensure network connectivity, make sure that your Lindorm instance and the ECS instance meet the following requirements. For information about how to view the details of an ECS instance, see View instance information.
  - Your Lindorm instance and the ECS instance are deployed in the same region. We recommend that you deploy the instances in the same zone to reduce the network latency.
  - The network types of your Lindorm instance and the ECS instance are the same.
  Note
    - We recommend that you select Virtual Private Cloud (VPC) because VPC provides higher security.
    - If you set the network types of the instances to VPC, make sure that you specify the same VPC ID for the instances.
- Add the IP address of the stress testing client to the whitelist of the Lindorm instance. For more information, see Configure a whitelist.
- Download AHBench, upload the package to your stress testing client, and then extract the package.
Usage notes
- The system that you benchmark may be overwhelmed and stop responding during the stress test. Do not use AHBench in a production environment.
- AHBench uses the ahbenchtest-read and ahbenchtest-write tables for testing. The tables may be deleted and then recreated during the test. Make sure that the tables can be deleted without causing issues.
- Make sure that the cluster that you want to test has sufficient storage space.
- Lindorm clusters run on ECS instances that provide virtual runtime environments. The performance of clusters that have the same specifications may vary by 5% to 10%.
Runtime environment
The runtime environment for the stress testing client must meet the following requirements:
- The client runs Linux.
- JDK 1.8 or later is installed.
- Python 2.7 is installed.
- The client has at least 16 exclusive CPU cores.
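The requirements above can be checked before a test run. The following Python sketch is not part of AHBench. The function name check_client and the interpreter names that it looks for are assumptions made for illustration, and the sketch assumes that a Python 3 interpreter is available on the client in addition to the Python 2.7 that AHBench requires. It does not verify the JDK version, the Python version, or whether the CPU cores are exclusive.

```python
import multiprocessing
import os
import platform
import shutil

# Interpreter names to look for. Which name AHBench invokes is an assumption,
# and finding one of them does not prove that it is Python 2.7.
PYTHON_CANDIDATES = ("python2.7", "python2", "python")


def check_client(min_cores=16):
    """Return a dict that maps each requirement to True or False."""
    java_home = os.environ.get("JAVA_HOME", "")
    java_in_home = bool(java_home) and os.path.isfile(
        os.path.join(java_home, "bin", "java")
    )
    return {
        "linux": platform.system() == "Linux",
        "java_found": java_in_home or shutil.which("java") is not None,
        "python_found": any(shutil.which(name) for name in PYTHON_CANDIDATES),
        "enough_cores": multiprocessing.cpu_count() >= min_cores,
    }


if __name__ == "__main__":
    # Print one line per requirement.
    for requirement, ok in sorted(check_client().items()):
        print("{:<14} {}".format(requirement, "OK" if ok else "MISSING"))
```

A requirement reported as MISSING is a hint for manual follow-up, not proof that the test will fail.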
Specify the cluster endpoint
Specify the endpoint of the Lindorm cluster or ApsaraDB for HBase cluster that you want to benchmark in the AHBench/conf/hbase-site.xml file.
For more information about how to access Lindorm, see Use the ApsaraDB for HBase API for Java to connect to and use the wide table engine LindormTable. Add the following settings to the hbase-site.xml file:
<property>
<name>hbase.client.connection.impl</name>
<value>org.apache.hadoop.hbase.client.AliHBaseUEClusterConnection</value>
</property>
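To confirm that the setting above was saved correctly, the file can be parsed and inspected. The following Python sketch is not part of AHBench. The function names are hypothetical, and the sketch assumes that hbase-site.xml uses the usual Hadoop layout of property elements inside a configuration root element.

```python
import xml.etree.ElementTree as ET

CONNECTION_IMPL_KEY = "hbase.client.connection.impl"
CONNECTION_IMPL_VALUE = "org.apache.hadoop.hbase.client.AliHBaseUEClusterConnection"


def read_site_properties(xml_text):
    """Parse Hadoop-style configuration XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            properties[name.strip()] = (prop.findtext("value") or "").strip()
    return properties


def has_connection_impl(xml_text):
    """Return True if the connection implementation from this section is set."""
    properties = read_site_properties(xml_text)
    return properties.get(CONNECTION_IMPL_KEY) == CONNECTION_IMPL_VALUE
```

For example, read AHBench/conf/hbase-site.xml into a string and pass it to has_connection_impl. A result of False means that the property is missing or has a different value.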
Configure environment variables
- Run the following command to open the ahbench-env.properties file:
vi AHBench/conf/ahbench-env.properties
- Specify the path where the Java Development Kit (JDK) is installed. If the JDK is already available in the system path, skip this step. Example:
JAVA_HOME=/usr/java/jdk1.8.0/
- If you want to benchmark an ApsaraDB for HBase cluster, specify the version of the cluster. If the cluster version is 1.x, set the parameter to 1. If the cluster version is 2.x, set the parameter to 2. Example:
HBASE_VERSION=2
Configure the parameters that are related to stress testing (optional)
# Specify the compression algorithm for the test tables.
# Valid values: NONE, LZO, ZSTD, SNAPPY, GZ, and LZ4.
# Some compression algorithms are not supported by specific test systems.
# We recommend that you select ZSTD for Lindorm.
ahbench.table.compression=SNAPPY
# Specify the encoding algorithm for the test tables.
# Valid values: NONE, DIFF, and INDEX.
# Some encoding algorithms are not supported by specific test systems.
# We recommend that you select the INDEX encoding algorithm for Lindorm.
ahbench.table.encoding=DIFF
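A typo in either value is easier to catch before the test starts than during it. The following Python sketch is not part of AHBench and is only a minimal illustration: it checks the two settings against the valid values listed in the comments above. The function names are hypothetical, and the case-sensitive comparison is an assumption. Whether the tested cluster actually supports a listed algorithm still has to be verified separately.

```python
VALID_COMPRESSION = {"NONE", "LZO", "ZSTD", "SNAPPY", "GZ", "LZ4"}
VALID_ENCODING = {"NONE", "DIFF", "INDEX"}


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def find_invalid_settings(props):
    """Return the settings whose values are not in the documented lists."""
    allowed = {
        "ahbench.table.compression": VALID_COMPRESSION,
        "ahbench.table.encoding": VALID_ENCODING,
    }
    return [
        "{}={}".format(key, props[key])
        for key in sorted(allowed)
        if key in props and props[key] not in allowed[key]
    ]
```

Pass the content of the properties file to parse_properties and the result to find_invalid_settings. An empty list means that both values are among the documented ones.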
Perform a stress test
- Quick test
The test dataset includes 10 million entries and occupies at least 20 GB of storage. The test takes about 40 minutes, depending on the test system.
cd AHBench
./fast_test
- Full test
The test dataset includes 2 billion entries and occupies at least 2 TB of storage. The test takes about 25 hours, depending on the test system.
cd AHBench
./full_test
If you want to repeat a test, you do not need to import data again. This reduces the duration of the test. If you skip the data import step, the test duration is about 3.5 hours and varies based on the test system.
cd AHBench
./full_test --skipload
Analyze the test results
- View the name of the CSV file.
ls -ltr
- View the content of the CSV file.
cat full_throughput.csv
The following figure shows the content of the CSV file.
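Raw CSV output can be hard to read in a terminal. The following Python sketch is not part of AHBench and only pads the columns so that they line up. The column layout of the results file is not described here, so the sketch treats every cell as plain text. The function names are hypothetical, and a Python 3 interpreter is assumed.

```python
import csv


def read_rows(lines):
    """Parse an iterable of CSV lines into lists of strings, skipping empty rows."""
    return [row for row in csv.reader(lines) if row]


def format_table(rows):
    """Pad each column to its widest cell so that the rows line up."""
    if not rows:
        return ""
    column_count = max(len(row) for row in rows)
    padded = [row + [""] * (column_count - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in padded) for i in range(column_count)]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in padded
    )
```

To apply the sketch to the results file, open full_throughput.csv with open("full_throughput.csv", newline=""), pass the file object to read_rows, and print the string that format_table returns.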
FAQ
If AHBench shuts down due to an error, check the following items:
- Check whether JAVA_HOME is set to a valid value and whether the Python runtime environment is installed.
- Check whether the endpoint of the tested cluster is valid.
- Check whether the HBase version of the tested cluster is valid.
- Check whether the tested cluster supports the specified compression algorithm.
- Check whether the tested cluster is running as expected.
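For the endpoint check, a plain TCP connection attempt from the stress testing client is a quick first test. The following Python sketch is not part of AHBench, and the function name can_connect is hypothetical. A successful connection shows only that the port is reachable from the client. It does not show that the cluster is healthy or that the endpoint is the correct one for the client configuration.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Call can_connect with the host and port of the endpoint that you specified in AHBench/conf/hbase-site.xml. If the call returns False, recheck the whitelist, the region, and the VPC settings described in the Preparations section.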