Elastic Compute Service: What do I do if CPU utilization is high on a Windows ECS instance?

Last Updated: Oct 21, 2024

This topic describes how to resolve a high-CPU-utilization issue on a Windows Elastic Compute Service (ECS) instance.

Problem description

The CPU utilization of a Windows ECS instance is higher than or equal to 80%.

Causes

The Windows ECS instance may experience high CPU utilization due to one of the following reasons:

  • The ECS instance is infected with viruses or trojans.

  • Third-party antivirus software runs on the ECS instance.

  • An exception occurs in an application or a driver on the ECS instance, or an application on the ECS instance has a high I/O usage or a high interrupt rate.

Solution

Step 1: Identify the issue

Use Microsoft tools, such as Task Manager and Resource Monitor, to identify the cause of the high CPU utilization and, if deeper analysis is required, to capture full memory dumps. In high-traffic scenarios, you can use Wireshark to capture network packets for a period of time and analyze traffic patterns.

This section describes how to use Resource Monitor of Windows Server 2022 to identify a high-CPU-utilization issue. For information about other commonly used tools, see the Common tools section of this topic.

  1. Connect to the ECS instance by using Virtual Network Computing (VNC).

    For more information, see Connect to an instance by using VNC.

  2. In the lower part of the desktop, click the Start icon and select Run.

  3. In the Run dialog box, enter perfmon -res and click OK.

  4. In the Resource Monitor window, check for processes that cause high CPU utilization.

  5. Find the IDs and names of the processes that consume a large amount of CPU resources.

  6. Open the Task Manager window, click the Details tab, and then find the processes that contribute to high CPU utilization based on the process names and process IDs (PIDs) that you obtained in the Resource Monitor window. Right-click the name of each process that contributes to high CPU utilization, select Open file location, and then check whether the process is a malicious process.
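The lookup of process names and PIDs in the preceding steps can also be approximated from a script. The following Python sketch parses the CSV output of the tasklist /V /FO CSV command and lists the processes with the largest cumulative CPU time. It assumes the English column names Image Name, PID, and CPU Time, and a CPU Time value in H:MM:SS format. The function names and the sample data are hypothetical. Cumulative CPU time is not the same as current CPU utilization, so treat the result only as a starting point and confirm it in Resource Monitor.

```python
import csv
import io
import subprocess


def cpu_time_to_seconds(value):
    """Convert a tasklist CPU Time string such as '1:02:03' (H:MM:SS) to seconds."""
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


def top_cpu_processes(tasklist_csv, limit=5):
    """Return (image name, PID, CPU seconds) tuples, largest CPU time first."""
    rows = csv.DictReader(io.StringIO(tasklist_csv.lstrip()))
    processes = []
    for row in rows:
        pid = int(row["PID"])
        if pid == 0:
            continue  # PID 0 is the System Idle Process, not a real consumer
        processes.append((row["Image Name"], pid, cpu_time_to_seconds(row["CPU Time"])))
    processes.sort(key=lambda item: item[2], reverse=True)
    return processes[:limit]


def run_tasklist():
    """Run tasklist on the instance (Windows only) and return its CSV output."""
    completed = subprocess.run(
        ["tasklist", "/V", "/FO", "CSV"],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout


# Illustrative sample in the assumed tasklist format, not output from a real instance.
SAMPLE = '''"Image Name","PID","Session Name","Session#","Mem Usage","Status","User Name","CPU Time","Window Title"
"System Idle Process","0","Services","0","8 K","Unknown","N/A","250:10:05","N/A"
"svchost.exe","1044","Services","0","25,300 K","Unknown","N/A","0:12:30","N/A"
"w3wp.exe","5120","Services","0","310,880 K","Unknown","N/A","3:45:10","N/A"
"notepad.exe","6600","RDP-Tcp#0","2","14,200 K","Running","N/A","0:00:01","N/A"
'''

if __name__ == "__main__":
    # On a Windows instance, replace SAMPLE with run_tasklist().
    for name, pid, seconds in top_cpu_processes(SAMPLE, limit=3):
        print(f"{name} (PID {pid}): {seconds} s of CPU time")
```

On the instance itself, pass the return value of run_tasklist() instead of SAMPLE, and then look up the reported PIDs on the Details tab of Task Manager.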

Step 2: Analyze and resolve the issue

Determine whether the processes that cause high CPU utilization are normal, and perform operations to resolve the issue. The following table describes the operations that you need to perform based on whether a process is normal or abnormal.

Normal processes

Possible cause: Services that are frequently accessed and Windows in-box services, such as update services, may cause high network traffic or high CPU load.

Note
  • For a Windows Server 2008 or Windows Server 2012 instance, we recommend that you configure at least 2 GiB of memory.

  • On a Windows Server 2012 instance that has one vCPU and 1 GiB of memory, the Windows Update service automatically checks for, downloads, and installs new Windows updates, which results in sudden spikes in CPU utilization. This is expected behavior.

Operation:

  • Check whether Windows Update operations are performed in the background.

  • We recommend that you install antivirus software on the instance to perform a virus scan. If antivirus software is already installed on the instance, check whether it runs in the background when the instance experiences high CPU utilization. If possible, upgrade the antivirus software to the latest version, or uninstall it.

  • Check whether applications that are hosted on the instance perform large numbers of disk read/write operations, initiate large numbers of network requests, or run compute-intensive workloads. If so, upgrade to an instance type that has more vCPUs or memory to resolve the resource bottleneck. For more information, see Overview of instance configuration changes.

  • If the instance already uses a high-specification instance type, an upgrade may not resolve the high CPU utilization, and higher specifications may not bring architectural benefits. In this case, move applications to other instances to free up resources on the Windows instance, and optimize the applications.

    For example, you can migrate databases to ApsaraDB RDS instances. To optimize applications, you can modify the application configurations, such as the number of connections, cache settings, web settings, and the parameters used to call databases.
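The first check in the preceding list, whether Windows Update is active in the background, can be approximated from a script. The following Python sketch reads the state of the Windows Update service (wuauserv) from the output of the sc query wuauserv command. It assumes the usual English STATE line format of sc query. The function names and the sample output are hypothetical. A running service is only a hint that updates may be in progress, so confirm the actual CPU consumers in Resource Monitor.

```python
import re
import subprocess

# Matches a line such as "        STATE              : 4  RUNNING".
STATE_PATTERN = re.compile(r"^\s*STATE\s*:\s*(\d+)\s+(\w+)", re.MULTILINE)


def parse_service_state(sc_output):
    """Return the state name (for example 'RUNNING') from sc query output, or None."""
    match = STATE_PATTERN.search(sc_output)
    return match.group(2) if match else None


def windows_update_state():
    """Query the Windows Update service on the instance (Windows only)."""
    completed = subprocess.run(
        ["sc", "query", "wuauserv"], capture_output=True, text=True
    )
    return parse_service_state(completed.stdout)


# Illustrative sample in the assumed sc query format, not output from a real instance.
SAMPLE = """
SERVICE_NAME: wuauserv
        TYPE               : 20  WIN32_SHARE_PROCESS
        STATE              : 4  RUNNING
                                (STOPPABLE, NOT_PAUSABLE, ACCEPTS_PRECISESHUTDOWN)
        WIN32_EXIT_CODE    : 0  (0x0)
        SERVICE_EXIT_CODE  : 0  (0x0)
        CHECKPOINT         : 0x0
        WAIT_HINT          : 0x0
"""

if __name__ == "__main__":
    # On a Windows instance, call windows_update_state() instead.
    print("Windows Update service state:", parse_service_state(SAMPLE))
```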

Abnormal processes

Possible cause: High CPU utilization may be caused by viruses or trojans. Malicious third-party applications may exploit operating system processes such as svchost.exe or tcpsvcs.exe to disguise themselves and consume excessive CPU resources. You must check for and terminate abnormal processes.

Note

If you cannot determine whether a process is a virus or a trojan, we recommend that you search for the process name on the Internet. Before you terminate abnormal processes, we recommend that you create snapshots for the instance to back up instance data. For more information, see Create a snapshot for a disk.

Operation:

  • Use a commercial version of antivirus software or the free scan tool Microsoft Safety Scanner to scan for and remove viruses in safe mode.

  • Run Windows Update to install the latest Microsoft security patches.

  • Use MSConfig to disable all drivers except Windows in-box drivers. For more information, see How to perform a clean boot in Windows.

  • If the server or website suffers a DDoS attack or an HTTP flood attack, it is overloaded by a large number of access requests. Log on to the Security Center console to check the anti-DDoS thresholds and whether HTTP flood protection is enabled. If the attack traffic does not reach the thresholds, Security Center does not perform traffic scrubbing. In this case, contact Alibaba Cloud technical support to scrub the traffic.
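Because malicious programs may use the names of system processes such as svchost.exe or tcpsvcs.exe, one quick check is whether the executable runs from the directory where Windows normally stores it. The following Python sketch compares a file location, as shown by Open file location in Task Manager, with a small allowlist. The allowlist and the function name are hypothetical, and the paths assume the default Windows directory C:\Windows. An unexpected location only means that the process deserves a closer look, and an expected location does not prove that the process is safe.

```python
import ntpath

# Hypothetical allowlist for illustration. It assumes that Windows is installed
# in C:\Windows. Extend or adjust it for your own environment.
EXPECTED_DIRECTORIES = {
    "svchost.exe": {r"c:\windows\system32", r"c:\windows\syswow64"},
    "tcpsvcs.exe": {r"c:\windows\system32"},
}


def is_expected_location(image_path):
    """Check the directory of a known system executable.

    Returns True or False for names in the allowlist, and None for other names.
    """
    # ntpath applies Windows path rules on any platform: lowercase, backslashes.
    normalized = ntpath.normcase(ntpath.normpath(image_path))
    directory, name = ntpath.split(normalized)
    expected = EXPECTED_DIRECTORIES.get(name)
    if expected is None:
        return None
    return directory in expected


if __name__ == "__main__":
    for path in (
        r"C:\Windows\System32\svchost.exe",
        r"C:\Users\Public\svchost.exe",
        r"C:\Program Files\App\app.exe",
    ):
        print(path, "->", is_expected_location(path))
```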

Common tools

This section describes common Windows in-box tools that are used to identify a high-CPU-utilization issue.

Task Manager

Task Manager allows you to view the lists of applications and processes and identify applications that cause high CPU utilization. The following figure shows the Task Manager window.

[Figure: Task Manager window]

When you check CPU utilization on the Performance tab, right-click the CPU graph and choose Change graph to > Logical processors.

One graph appears for each logical processor. The following figure shows an instance that has two logical processors.

[Figure: Task Manager Performance tab with one graph for each of two logical processors]

When the CPU utilization of one process spikes to nearly 100% while the CPU utilization of other processes changes little, a network I/O issue may be the cause.

Resource Monitor

Resource Monitor allows you to visually check CPU utilization and search for processes based on handles and modules.

Process Explorer

Process Explorer is part of the Microsoft Sysinternals suite. You can configure symbols to check the thread call stacks of applications and identify potentially anomalous drivers. You can download Process Explorer from the Process Explorer page on the Microsoft Sysinternals website.

The following figure shows the Process Explorer window.

[Figure: Process Explorer window]

Performance Monitor

Performance Monitor allows you to collect performance counters for various components. Multiple counters are used to monitor the consumption of CPU resources.

Take note of the following critical performance counters:

  • \Processor(_Total)\% Processor Time: This performance counter indicates the percentage of time that the processor spends executing non-idle threads. \Processor(_Total)\% Processor Time = \Processor(_Total)\% User Time + \Processor(_Total)\% Privileged Time.

  • \Processor(*)\% User Time: This performance counter indicates the percentage of time that the processor spends running code in user mode. It can help you identify the applications or functions on which the processor spends a significant amount of time.

  • \Processor(*)\% Privileged Time: This performance counter indicates the percentage of time that the processor spends executing code in kernel (privileged) mode, such as system calls, driver code, I/O request packet (IRP) processing, and context switching. If the value of the \Processor(*)\% Privileged Time counter exceeds 30%, the instance spends a significant amount of time processing I/O requests.

    If the value of \Processor(*)\% Privileged Time is large, check the % DPC Time, % Interrupt Time, and Context Switches/sec performance counters.
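The counters above can also be sampled on the command line with typeperf and evaluated in a script. The following Python sketch averages the samples in typeperf CSV output and applies the two thresholds from this topic: 80% for total CPU utilization and 30% for privileged time. It assumes the PDH-CSV layout that typeperf prints, with a timestamp in the first column. The function names, the host name ECS-HOST, and the sample values are hypothetical.

```python
import csv
import io

HIGH_CPU_THRESHOLD = 80.0    # percent, from the problem description
PRIVILEGED_THRESHOLD = 30.0  # percent, from the % Privileged Time description


def average_counters(typeperf_csv):
    """Average each counter column of typeperf CSV output, keyed by counter name."""
    rows = [row for row in csv.reader(io.StringIO(typeperf_csv)) if row]
    header, data = rows[0], rows[1:]
    # Keep only the text after the last backslash, for example '% User Time'.
    names = [column.rsplit("\\", 1)[-1] for column in header[1:]]
    totals = [0.0] * len(names)
    count = 0
    for row in data:
        if len(row) != len(header):
            continue  # skip trailing status lines such as 'Exiting, please wait...'
        try:
            values = [float(cell) for cell in row[1:]]
        except ValueError:
            continue  # skip samples with missing or non-numeric values
        totals = [total + value for total, value in zip(totals, values)]
        count += 1
    if count == 0:
        raise ValueError("no samples found in typeperf output")
    return {name: total / count for name, total in zip(names, totals)}


def diagnose(averages):
    """Return hints based on the thresholds described in this topic."""
    hints = []
    if averages["% Processor Time"] >= HIGH_CPU_THRESHOLD:
        hints.append("CPU utilization is high")
    if averages["% Privileged Time"] > PRIVILEGED_THRESHOLD:
        hints.append(
            "privileged time is high: check % DPC Time, % Interrupt Time, "
            "and Context Switches/sec"
        )
    return hints


# Illustrative sample in the assumed typeperf format, not data from a real instance.
SAMPLE = r'''"(PDH-CSV 4.0)","\\ECS-HOST\Processor(_Total)\% Processor Time","\\ECS-HOST\Processor(_Total)\% User Time","\\ECS-HOST\Processor(_Total)\% Privileged Time"
"10/21/2024 10:00:01.000","90.0","50.0","40.0"
"10/21/2024 10:00:02.000","86.0","50.0","36.0"
Exiting, please wait...
The command completed successfully.
'''

if __name__ == "__main__":
    averages = average_counters(SAMPLE)
    print(averages)
    for hint in diagnose(averages):
        print("-", hint)
```

To collect real samples, a command such as typeperf "\Processor(_Total)\% Processor Time" "\Processor(_Total)\% User Time" "\Processor(_Total)\% Privileged Time" -sc 10 can be run on the instance and its output passed to average_counters.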