Performance Testing: Test analysis and tuning

Last Updated: Mar 11, 2026

Performance testing does not end when the test finishes running. The real value comes from analyzing results, pinpointing bottlenecks, and tuning the system to meet your performance goals. Many factors determine system performance. This topic does not cover all of them, but it provides a guide for analyzing system performance.

The following sections walk through performance analysis and tuning for systems running on Alibaba Cloud. The intended audience includes developers, test administrators, test operators, technical support engineers, project quality administrators, project administrators, and O&M engineers responsible for system performance.

Performance analysis

Prerequisites

Before you start, make sure you have:

  • Monitoring configured: Client-side monitoring in Performance Testing (PTS), infrastructure monitoring through Cloud Monitor, and application-level tracing through Application Real-Time Monitoring Service (ARMS).

  • Technical background: Working knowledge of operating systems, middleware, databases, and application development.

Analysis workflow

Work through the following layers top-down. Most performance issues originate at the network or server layer, so start there before looking at the client side.

Step 1: Check the network access layer

In cloud-based architectures, stress testing traffic may not fully reach the backend. Protection policies on Server Load Balancer (SLB), Web Application Firewall (WAF), Anti-DDoS IP addresses, Content Delivery Network (CDN) points of presence (POPs), or Edge Security Acceleration (ESA) POPs can intercept traffic when:

  • Bandwidth, maximum connection, or new connection limits are exceeded.

  • Traffic patterns resemble Challenge Collapsar (CC) or DDoS attacks.

If you see unexpected errors or timeouts despite low backend load, check these services first. For details, see Why do errors or timeouts occur even under the small backend load?

Step 2: Validate metrics

Check whether response time, throughput, and error rate meet your targets. If any metric falls outside the acceptable range, the issue almost always originates on the server side rather than the client side.
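As a rough illustration of this check, the following Python sketch compares measured values against targets. The field names and threshold values are hypothetical placeholders, not PTS defaults. Replace them with the goals defined for your own system.

```python
from dataclasses import dataclass


@dataclass
class Targets:
    """Hypothetical performance targets; replace with your own goals."""
    max_avg_rt_ms: float = 500.0   # assumed response-time ceiling
    min_tps: float = 1000.0        # assumed throughput floor
    max_error_rate: float = 0.01   # assumed error-rate ceiling (1%)


def failed_metrics(avg_rt_ms: float, tps: float, error_rate: float,
                   targets: Targets) -> list:
    """Return the names of the metrics that miss their targets."""
    failures = []
    if avg_rt_ms > targets.max_avg_rt_ms:
        failures.append("response_time")
    if tps < targets.min_tps:
        failures.append("throughput")
    if error_rate > targets.max_error_rate:
        failures.append("error_rate")
    return failures


# Example: response time misses its target, the other two metrics pass.
print(failed_metrics(avg_rt_ms=620.0, tps=1200.0, error_rate=0.002,
                     targets=Targets()))  # ['response_time']
```

An empty list means all three metrics are within range and no further server-side investigation is needed for this scenario.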

Step 3: Inspect hardware metrics

On the server, check CPU utilization, memory usage, disk I/O, and network I/O. If any of these are abnormal, perform a deeper investigation on the affected resource (see Analysis methods below).

Step 4: Inspect middleware metrics

If hardware metrics are normal, check middleware-level indicators: thread pool utilization, connection pool usage, and garbage collection (GC) frequency and duration.

Step 5: Inspect database metrics

If middleware metrics are normal, investigate database performance: slow SQL queries, cache hit rates, lock contention, and database parameter settings.

Step 6: Inspect application logic

If all infrastructure metrics are normal, the bottleneck is likely in application code. Investigate algorithms, buffering and caching strategies, and synchronous vs. asynchronous I/O patterns.
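The top-down order of Steps 1 through 6 can be sketched as a small decision function. This is only an illustration of the workflow: the layer names and the boolean inputs are assumptions, and deciding whether a layer is "normal" still requires the monitoring data described in the prerequisites.

```python
from typing import Dict, Optional

# Infrastructure layers in the order of Steps 3 to 5.
INFRA_LAYERS = ("hardware", "middleware", "database")


def next_layer_to_investigate(network_ok: bool, metrics_ok: bool,
                              layer_normal: Dict[str, bool]) -> Optional[str]:
    """Return the layer to inspect next, or None if no issue is visible.

    A layer missing from layer_normal is treated as not yet verified,
    so it is returned as the next layer to check.
    """
    if not network_ok:          # Step 1: traffic may not reach the backend
        return "network access"
    if metrics_ok:              # Step 2: targets are met, nothing to chase
        return None
    for layer in INFRA_LAYERS:  # Steps 3 to 5
        if not layer_normal.get(layer, False):
            return layer
    return "application logic"  # Step 6: all infrastructure looks normal


print(next_layer_to_investigate(True, False,
                                {"hardware": True, "middleware": False}))
# middleware
```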

Common bottleneck categories

Hardware and specifications

CPU, memory, and disk I/O are the most common hardware bottlenecks. Undersized instances or specification limits can cap throughput before any software-level issue becomes visible.

Middleware

Database systems and application servers (including web servers) often become bottlenecks due to misconfiguration. For example, improper Java Database Connectivity (JDBC) connection pool settings on a WebLogic platform can throttle concurrent request handling.

Application code

Common application-level bottlenecks include:

  • Suboptimal Java Virtual Machine (JVM) parameters or container settings

  • Slow SQL queries (identifiable through APM services such as ARMS)

  • Poorly designed database schemas or application architecture

  • Serial processing where parallel processing is possible

  • Missing buffer or cache layers

  • Insufficient request processing threads

  • Uncoordinated producer-consumer patterns

Operating system

On Windows, UNIX, or Linux systems, OS-level misconfigurations can degrade performance. For example, when physical memory is insufficient and virtual memory settings are not properly configured, excessive swap activity significantly increases response times.

Network devices

Firewalls, load balancers, switches, and cloud network services (SLB, WAF, Anti-DDoS IP addresses, CDN POPs, and ESA POPs) can introduce bottlenecks. For example, if a load balancer fails to distribute traffic across servers when one server reaches capacity, the load balancer configuration is the bottleneck.

Analysis methods

CPU

CPU utilization breaks down into three categories, each pointing to a different root cause:

High CPU User

An application-level process is consuming excessive CPU.

  1. Run top to identify the process with the highest CPU usage.

  2. Run top -H -p <pid> to narrow down to the specific thread.

  3. For Java applications, use jstack to capture the thread stack trace and identify the CPU-intensive method.

  4. For C++ applications, use gprof to profile execution.

  5. Review the source code at the identified location.
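One practical detail when moving from step 2 to step 3: top -H shows thread IDs in decimal, while jstack typically prints the native thread ID as a hexadecimal nid value. The following sketch converts the ID and looks it up in a thread dump. The dump excerpt, thread names, and the com.example class are made up for illustration, and the exact line format can vary between JDK versions.

```python
import re
from typing import Optional

# Shortened, made-up excerpt in the general format that jstack prints.
SAMPLE_DUMP = '''\
"http-nio-8080-exec-1" #31 daemon prio=5 os_prio=0 tid=0x00007f3c2c0a1000 nid=0x3039 runnable [0x00007f3c0e5f4000]
   java.lang.Thread.State: RUNNABLE
        at com.example.OrderService.calculate(OrderService.java:42)

"GC task thread#0" os_prio=0 tid=0x00007f3c2c01f800 nid=0x2f11 runnable
'''


def find_thread(dump: str, os_thread_id: int) -> Optional[str]:
    """Return the name of the thread whose nid matches a decimal thread ID."""
    nid = hex(os_thread_id)  # for example, 12345 -> '0x3039'
    for line in dump.splitlines():
        match = re.match(r'"([^"]+)".*\bnid=(0x[0-9a-f]+)', line)
        if match and match.group(2) == nid:
            return match.group(1)
    return None


# 12345 is the decimal thread ID as it would appear in top -H output.
print(find_thread(SAMPLE_DUMP, 12345))  # http-nio-8080-exec-1
```

The stack frames listed under the matching thread point to the method to review in step 5.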

High CPU Sys

The kernel is consuming excessive CPU, typically due to expensive system calls.

  • Use strace to trace system calls and identify which calls consume the most time.

High CPU Wait

The CPU is idle while waiting for I/O to complete. This is typically caused by heavy disk read/write activity.

  • Reduce log output volume.

  • Switch to asynchronous I/O.

  • Upgrade to disks with higher IOPS performance.
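On Linux, the three CPU categories above can be derived from the cumulative counters on the first line of /proc/stat. The sketch below computes the share of user, system, and I/O wait time between two samples. The sample lines are made-up values; on a real system you would read /proc/stat twice, a few seconds apart.

```python
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line: str) -> dict:
    """Parse the aggregate 'cpu' line of /proc/stat into named counters."""
    values = [int(v) for v in line.split()[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_breakdown(before: str, after: str) -> dict:
    """Return user/system/iowait percentages between two samples."""
    b, a = parse_cpu_line(before), parse_cpu_line(after)
    delta = {name: a[name] - b[name] for name in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {"user": 0.0, "system": 0.0, "iowait": 0.0}
    return {name: round(100.0 * delta[name] / total, 1)
            for name in ("user", "system", "iowait")}


# Made-up samples for illustration only.
BEFORE = "cpu  1000 0 500 8000 400 0 100 0 0 0"
AFTER = "cpu  1600 0 600 8200 500 0 100 0 0 0"

print(cpu_breakdown(BEFORE, AFTER))
# {'user': 60.0, 'system': 10.0, 'iowait': 10.0}
```

In this example the user share dominates, which would point to the High CPU User procedure.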

Memory

Operating systems use spare memory for disk caching, so memory utilization near 99% is normal. Instead, watch for:

  • A single process consuming a disproportionately large amount of memory.

  • High swap activity, which indicates the system is running out of physical memory.
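To see why raw utilization can be misleading, the following sketch parses text in the /proc/meminfo format and compares utilization based on MemFree with utilization based on MemAvailable, and also reports swap usage. The sample values are made up. Note that swap usage is a static snapshot, while swap activity over time is better observed with a tool such as vmstat.

```python
# Made-up values in the format of /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:       16000000 kB
MemFree:          160000 kB
MemAvailable:    6400000 kB
Buffers:          300000 kB
Cached:          5000000 kB
SwapTotal:       4000000 kB
SwapFree:        3000000 kB
"""


def parse_meminfo(text: str) -> dict:
    """Map each field name to its value in kB."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            info[key] = int(rest.split()[0])
    return info


def memory_summary(info: dict) -> dict:
    """Compare cache-inclusive utilization with what is really unavailable."""
    total = info["MemTotal"]
    swap_total = info["SwapTotal"]
    swap_used = swap_total - info["SwapFree"]
    return {
        "used_including_cache_pct": round(100 * (total - info["MemFree"]) / total, 1),
        "used_excluding_reclaimable_pct": round(100 * (total - info["MemAvailable"]) / total, 1),
        "swap_used_pct": round(100 * swap_used / swap_total, 1) if swap_total else 0.0,
    }


print(memory_summary(parse_meminfo(SAMPLE_MEMINFO)))
```

With these sample values, utilization appears to be 99% when cache is counted, but only 60% of memory is actually unavailable to applications.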

Disk I/O

The most important disk I/O metric is the busy percentage. To reduce it:

  • Decrease log write volume.

  • Use asynchronous I/O.

  • Upgrade to disks with higher IOPS.
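On Linux, the busy percentage can be approximated from /proc/diskstats, where the thirteenth column of each device line is the cumulative number of milliseconds the device spent doing I/O. The sketch below computes the busy percentage between two samples. The device name and counter values are made up for illustration.

```python
IO_TICKS_INDEX = 12  # 13th column: milliseconds spent doing I/O


def busy_percent(before_line: str, after_line: str, interval_ms: float) -> float:
    """Return the share of the interval during which the device was busy."""
    ticks_before = int(before_line.split()[IO_TICKS_INDEX])
    ticks_after = int(after_line.split()[IO_TICKS_INDEX])
    busy = 100.0 * (ticks_after - ticks_before) / interval_ms
    return min(100.0, busy)  # cap small timing overshoots


# Made-up samples for a device named vda, taken 1000 ms apart.
BEFORE = "253 0 vda 1000 10 80000 500 2000 20 160000 1500 0 1800 2000"
AFTER = "253 0 vda 1500 10 90000 800 3000 20 200000 2400 0 2650 3200"

print(busy_percent(BEFORE, AFTER, interval_ms=1000))  # 85.0
```

Tools such as iostat report a comparable value as %util.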

Network I/O

Network throughput depends on payload size. Keep utilization below 70% of the hardware's maximum capacity. To improve network I/O:

  • Compress response payloads.

  • Enable caching on compute nodes.

  • Batch smaller transmissions into fewer, larger transfers.
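The 70% guideline can be checked with simple arithmetic. The sketch below converts an observed byte rate to a fraction of link capacity. The traffic figures and the 1,000 Mbit/s capacity are example values only.

```python
def link_utilization(bytes_per_second: float, capacity_mbps: float) -> float:
    """Return utilization as a fraction of link capacity (0.0 to 1.0+)."""
    bits_per_second = bytes_per_second * 8
    return bits_per_second / (capacity_mbps * 1_000_000)


def exceeds_guideline(bytes_per_second: float, capacity_mbps: float,
                      threshold: float = 0.70) -> bool:
    """True if utilization is above the guideline threshold."""
    return link_utilization(bytes_per_second, capacity_mbps) > threshold


# 90 MB/s on a 1,000 Mbit/s link is 720 Mbit/s, or 72% utilization.
print(link_utilization(90_000_000, 1000))   # 0.72
print(exceeds_guideline(90_000_000, 1000))  # True
```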

Kernel parameters

Default kernel parameter values work for most workloads, but stress testing can exceed these defaults. Use sysctl to view and modify kernel parameters as needed.

JVM

Monitor GC and full GC frequency and duration:

  1. Run jstat to check GC statistics.

  2. If GCs are too frequent, use jmap to dump heap memory.

  3. Analyze the dump with HeapAnalyzer to identify high memory consumption and potential memory leaks.

Alternatively, use an APM tool such as ARMS for a visual, real-time view of JVM metrics.
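To turn jstat output into numbers you can compare across test runs, the following sketch parses one sample in the column layout of jstat -gcutil and computes the average pause per young GC and per full GC. The sample output is made up, and the set of columns varies by JDK version, so the parser matches values to whatever header is present rather than assuming fixed positions.

```python
# Made-up sample in the layout of `jstat -gcutil <pid>` output.
SAMPLE_GCUTIL = """\
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  45.12  67.30  82.45  95.10  90.20   1200   14.400    30   45.000   59.400
"""


def parse_gcutil(output: str) -> dict:
    """Map each column name to the value on the last data line."""
    lines = [line for line in output.splitlines() if line.strip()]
    header, values = lines[0].split(), lines[-1].split()
    return {name: float(value) for name, value in zip(header, values)}


def average_pause_seconds(stats: dict) -> dict:
    """Average time per young GC (YGCT/YGC) and per full GC (FGCT/FGC)."""
    young = stats["YGCT"] / stats["YGC"] if stats["YGC"] else 0.0
    full = stats["FGCT"] / stats["FGC"] if stats["FGC"] else 0.0
    return {"young_gc_avg_s": round(young, 3), "full_gc_avg_s": round(full, 3)}


stats = parse_gcutil(SAMPLE_GCUTIL)
print(average_pause_seconds(stats))
# {'young_gc_avg_s': 0.012, 'full_gc_avg_s': 1.5}
```

In this made-up sample, each full GC takes 1.5 seconds on average and old-generation usage (O) is above 80%, which would justify taking a heap dump as described in step 2.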

Thread pools

If thread pools are saturated, increase the pool size. If a larger pool still does not help, investigate deeper:

  • Threads blocked waiting for locks.

  • Methods with long execution times.

  • Database queries with long wait times.

JDBC connection pools

If connection pools are exhausted, increase the pool size. However, if the underlying database is slow, more connections will not help. Check for:

  • Slow queries that hold connections longer than necessary.

  • Code paths that fail to release connections back to the pool.
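The second point, connections that are never released, is usually a matter of missing cleanup on error paths. The sketch below uses a tiny illustrative pool built on Python's sqlite3 and queue modules to show the pattern. TinyPool is a hypothetical class written for this example, not a real JDBC pool. In Java, try-with-resources serves the same purpose.

```python
import queue
import sqlite3
from contextlib import contextmanager


class TinyPool:
    """Illustrative fixed-size pool; not a production implementation."""

    def __init__(self, size: int):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(sqlite3.connect(":memory:"))

    @contextmanager
    def connection(self, timeout: float = 1.0):
        # Raises queue.Empty if the pool stays exhausted for `timeout` seconds.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always return the connection, even if the query failed.
            self._idle.put(conn)

    def idle_count(self) -> int:
        return self._idle.qsize()


pool = TinyPool(size=2)
try:
    with pool.connection() as conn:
        conn.execute("SELECT * FROM missing_table")  # fails: no such table
except sqlite3.OperationalError:
    pass

print(pool.idle_count())  # 2: the failed request did not leak a connection
```

Without the finally clause, every failed query would permanently remove one connection from the pool, and the pool would eventually appear exhausted regardless of its size.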

SQL

Inefficient SQL is one of the most common causes of poor performance. Check the execution plan to understand why a query is slow. The following tables list common SQL performance issues.

Index issues

| Problem | Example | Impact |
| --- | --- | --- |
| No index | N/A | Full table scan |
| Function on indexed column | substring(card_no,1,4)='5378' | Full table scan |
| Expression on indexed column | amount/30<1000 | Full table scan |
| Type conversion on indexed column | convert(char(10),date,112)='19991201' | Full table scan |
| Inequality operator | where salary<>3000 | Full table scan |
| Leading wildcard | name like '%Tom' | Full table scan |
| Concatenation in WHERE clause | first_name + last_name ='bill clinton' | Full table scan |
| IN with small value list | id_no in('0','1') | Full table scan |
| Local variable in WHERE clause | select id from t where num=@num | Possible full table scan, because the value is unknown when the plan is compiled |
| Nonclustered index with ORDER BY | N/A | Poor index performance |
| String vs. integer index | username='Tom' and age>20 | String indexes are slower than integer indexes |
| Nullable columns | Columns with NULL values | Poor index performance |
| IS NULL / IS NOT NULL | N/A | Poor index performance |
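The effect of wrapping an indexed column in a function can be observed in an execution plan. The sketch below uses SQLite through Python's sqlite3 module as a convenient stand-in. The cards table and idx_card_no index are hypothetical, SQLite spells the function substr rather than substring, and other database engines format their plans differently and may optimize differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cards (id INTEGER PRIMARY KEY, card_no TEXT, amount INTEGER)"
)
conn.execute("CREATE INDEX idx_card_no ON cards (card_no)")


def plan(sql: str) -> list:
    """Return the detail text of each step in SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


# Function applied to the indexed column: the index cannot be used.
function_plan = plan(
    "SELECT * FROM cards WHERE substr(card_no, 1, 4) = '5378'"
)

# Equivalent prefix match written as a range on the bare column.
range_plan = plan(
    "SELECT * FROM cards WHERE card_no >= '5378' AND card_no < '5379'"
)

print(function_plan)  # a SCAN step over the whole table
print(range_plan)     # a SEARCH step that uses idx_card_no
```

The rewrite keeps the column bare on one side of the comparison, which lets the optimizer use the index for a range search instead of reading every row.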

Data volume issues

| Problem | Example | Impact |
| --- | --- | --- |
| SELECT * | select * | Retrieves all columns unnecessarily |
| Large table without filtering | select id,name on millions of rows | Large data volume |
| Nested query without early filtering | Filter after full data load | Processes unnecessary data |
| Multi-table join without selective predicates | Join then filter | Excessive join operations |
| Bulk insert | Insert all data at once | Generates excessive logs, high resource usage |

Lock and concurrency issues

| Problem | Example | Impact |
| --- | --- | --- |
| Row-level lock escalation | update account set balance=100 where id=10 | May lock the entire table |
| Deadlocks | A: update a; update b; B: update b; update a; | Mutual blocking |
| Cursors | Open cursor, fetch; close cursor | Low performance |
| Temporary tables (CREATE) | create tmp table | Generates excessive logs |
| Temporary tables (DROP) | DROP TABLE | Confirm that temporary tables are dropped explicitly to prevent prolonged locking |
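The deadlock row describes two sessions that acquire the same resources in opposite order. A common remedy is to always acquire them in one consistent order. The sketch below shows the idea with Python threads and in-memory locks as an application-level analogy. The account names and balances are made up, and in a database the equivalent is to update tables or rows in the same order in every transaction.

```python
import threading

accounts = {"a": 100, "b": 100}
locks = {"a": threading.Lock(), "b": threading.Lock()}


def transfer(src: str, dst: str, amount: int, rounds: int = 1000) -> None:
    """Move money between accounts, locking in a fixed global order."""
    # Sorting the names guarantees that every thread locks "a" before "b",
    # regardless of the direction of the transfer.
    first, second = sorted((src, dst))
    for _ in range(rounds):
        with locks[first], locks[second]:
            accounts[src] -= amount
            accounts[dst] += amount


# Two workers that touch the same accounts in opposite directions.
t1 = threading.Thread(target=transfer, args=("a", "b", 1))
t2 = threading.Thread(target=transfer, args=("b", "a", 1))
t1.start()
t2.start()
t1.join(timeout=5)
t2.join(timeout=5)

print(accounts)  # {'a': 100, 'b': 100}
```

If each worker instead locked its source account first, the two threads could each hold one lock while waiting for the other, which is the mutual blocking pattern shown in the table.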

Query optimization tips

| Instead of | Use | Why |
| --- | --- | --- |
| IN | EXISTS | EXISTS stops scanning once a match is found |
| select count(*) | EXISTS | EXISTS stops after finding one row |
| IN (sequential values) | BETWEEN | BETWEEN checks a range instead of individual values |
| NOT IN | LEFT JOIN ... IS NULL | NOT IN scans every row in the subquery |
| UNION | UNION ALL | UNION ALL skips deduplication and sorting |
| Hardcoded SQL | Parameterized (bound) SQL | Compile once, reuse the execution plan |
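A few of these rewrites are shown below using SQLite through Python's sqlite3 module. The orders table and its data are hypothetical. The example demonstrates that the rewritten queries return equivalent answers; actual performance gains depend on the database engine and data volume and should be confirmed with an execution plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO orders (customer_id) VALUES (1), (1), (2);
""")


def has_orders_count(customer_id: int) -> bool:
    """Existence check via COUNT(*): counts every matching row first."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return count > 0


def has_orders_exists(customer_id: int) -> bool:
    """Existence check via EXISTS: can stop at the first matching row."""
    (found,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM orders WHERE customer_id = ?)",
        (customer_id,),  # bound parameter instead of a hardcoded literal
    ).fetchone()
    return bool(found)


# UNION removes duplicates; UNION ALL returns every row without that work.
union_rows = conn.execute(
    "SELECT customer_id FROM orders UNION SELECT customer_id FROM orders"
).fetchall()
union_all_rows = conn.execute(
    "SELECT customer_id FROM orders UNION ALL SELECT customer_id FROM orders"
).fetchall()

print(has_orders_count(1), has_orders_exists(1))  # True True
print(len(union_rows), len(union_all_rows))       # 2 6
```

Use UNION ALL only when duplicates are acceptable or cannot occur, because it changes the result set as well as the cost.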

Tuning

Performance tuning is iterative. As applications evolve and user loads grow, regular testing and tuning keeps the system within acceptable performance boundaries.

Tuning workflow

1. Identify the issue

Narrow down the problem area:

  • Application code: Code-level issues are the most common source of performance problems. Check here first.

  • Database settings: Misconfigured databases can slow the entire system. For large databases, have a database administrator (DBA) review parameter settings before going to production.

  • Operating system settings: Misconfigured OS parameters can introduce system-level bottlenecks.

  • Hardware: Disk I/O and memory are the most common hardware constraints.

  • Network: Overloaded networks cause packet loss and latency spikes.

2. Analyze the issue

Once you identify the problem area, determine its scope:

  • Does the issue affect response time, throughput, or both?

  • Are all users affected, or only a subset? What distinguishes the affected users?

  • Are system resource metrics (CPU, memory, I/O) at or near their limits?

  • Is the issue concentrated in specific modules or endpoints?

  • Is the issue on the client side or the server side?

  • Does the actual load exceed the system's designed capacity?

3. Define goals and solutions

Typical goals are higher system throughput and shorter response times, so that the system better supports your workloads. Translate these goals into PTS stress testing scenarios with specific load levels, then select the appropriate mode: concurrency-based, transactions per second (TPS)-based, or a combination of automatic increment and manual regulation for traffic throttling.

4. Test the solution

Run benchmark tests after each change. Benchmark testing provides quantitative, comparable measurements of specific performance metrics, giving you an objective way to evaluate whether the change helped.
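A minimal sketch of such a comparison follows. It computes the change in average response time and the 95th percentile for a baseline run and a tuned run. The latency samples are made up, and the nearest-rank percentile used here is one of several common definitions, so the values may differ slightly from those in a PTS report.

```python
import math
from statistics import mean


def percentile(samples: list, pct: float):
    """Nearest-rank percentile of a list of measurements."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def compare_runs(baseline_ms: list, tuned_ms: list) -> dict:
    """Summarize how a tuned run differs from the baseline run."""
    base_avg, tuned_avg = mean(baseline_ms), mean(tuned_ms)
    return {
        "avg_change_pct": round(100 * (tuned_avg - base_avg) / base_avg, 1),
        "p95_baseline_ms": percentile(baseline_ms, 95),
        "p95_tuned_ms": percentile(tuned_ms, 95),
    }


# Made-up response times in milliseconds from two benchmark runs.
baseline = [120, 130, 125, 140, 400, 135, 128, 132, 138, 126]
tuned = [100, 105, 98, 110, 180, 102, 99, 104, 108, 101]

print(compare_runs(baseline, tuned))
# {'avg_change_pct': -29.7, 'p95_baseline_ms': 400, 'p95_tuned_ms': 180}
```

Run both benchmarks with the same scenario and load level so that the numbers are comparable.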

5. Evaluate results

After each tuning iteration, evaluate:

  • Did the change meet or exceed the performance goal?

  • Did it improve overall system performance, or only a specific component?

  • Are further tuning iterations needed?

If the goals are met, the tuning cycle is complete.

Best practices

  • Design for performance from the start. Tuning compensates for design gaps but cannot replace good architecture. Factor performance requirements into design and development early.

  • Define clear performance goals. Translate goals into PTS test scenarios with specific load levels, then select the appropriate mode: concurrency-based, TPS-based, or a combination of automatic increment and manual regulation for traffic throttling.

  • Validate after every change. Run regression tests after each tuning iteration to confirm the change works as expected and does not introduce regressions.

  • Integrate performance testing into your workflow. Run intranet performance tests regularly during development. Conduct business performance tests periodically in the production environment.

  • Keep tuning iterative. Feed results from each cycle back into later development. Performance work is never one-and-done.

  • Protect code quality. Do not sacrifice readability or maintainability for performance. Optimizations that make code harder to maintain create long-term costs that outweigh short-term gains.

Other test analysis

Success rate

Success rate is determined by server return values and any assertions you configure. Without assertions, a request is marked as failed when the backend returns an error code, the server throws an exception, or the request times out.
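The rule above can be sketched as follows. The Result type is hypothetical, treating HTTP status codes of 400 and above as error codes is an assumption, and so is the choice to let a configured assertion replace the status-code check. How PTS combines these conditions internally is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Result:
    """Hypothetical record of one request; not a PTS data structure."""
    status_code: Optional[int]        # None if no response was received
    body: str = ""
    timed_out: bool = False
    exception: Optional[str] = None


def is_success(result: Result,
               assertion: Optional[Callable[[Result], bool]] = None) -> bool:
    """Apply the failure conditions, then the assertion if one is given."""
    if result.timed_out or result.exception is not None:
        return False
    if assertion is not None:
        return assertion(result)  # assumption: replaces the status check
    # Assumption: a status code of 400 or above is an error code.
    return result.status_code is not None and result.status_code < 400


def success_rate(results: List[Result],
                 assertion: Optional[Callable[[Result], bool]] = None) -> float:
    """Fraction of requests that count as successful."""
    if not results:
        return 0.0
    return sum(is_success(r, assertion) for r in results) / len(results)


results = [
    Result(200, body='{"ok": true}'),
    Result(500),
    Result(None, timed_out=True),
    Result(200, body='{"ok": false}'),
]

print(success_rate(results))                                   # 0.5
print(success_rate(results, lambda r: '"ok": true' in r.body)) # 0.25
```

The second call shows why assertions matter: a response can return a success status code and still carry a business-level failure in its body.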

Logs

PTS logs record details for each sampled request. At a 10% sampling rate, PTS records 10 out of every 100 requests. At 100%, every request is recorded.

Trade-off: Higher sampling rates give you more diagnostic detail but increase the load on load generators, reducing their performance and raising costs. The sampling rate does not affect the server under test.

Connection establishment

Connection establishment is the process of setting up an HTTP connection between the load generator and the server. The request timeout covers the entire span from DNS resolution to response completion. If this duration exceeds the configured timeout threshold, the request is marked as timed out.