System events record and report information about cloud resources, such as O&M task executions, resource exceptions, and resource status changes. You can use system events to learn about risks and anomalies that affect Elastic Compute Service (ECS) resources. For example, a system event is generated when an instance must be migrated due to underlying upgrades or when an instance is restarted due to system maintenance. Responding to a system event promptly helps prevent your business from being affected by ECS resource unavailability or performance degradation. This topic summarizes the system events supported by ECS, including scheduled O&M events, unexpected O&M events, instance billing events, and instance status change events, and provides suggestions on how to handle them.
If Undefined is displayed in the Event code column of a system event, the system event is not displayed in the ECS console and cannot be queried by calling API operations. Example: Undefined is displayed in the Event code column of the Instance:StateChange event.
Scheduled O&M events
If an instance encounters a system event and you restart the instance from within its operating system, the maintenance action that corresponds to the event does not take effect. All instance restart operations described in this topic are performed in the ECS console or by calling an API operation. For more information, see Restart an instance or RebootInstance.
Event code | Event name | Event severity level | CloudMonitor event | Event description and impact | Handling suggestion |
SystemMaintenance.Reboot | Instance Restart Due to System Maintenance | Serious |
| This system event is triggered 24 to 48 hours before the scheduled time of system maintenance when Alibaba Cloud detects a potential risk of hardware or software failure in the underlying host of an instance and the risk can cause instance restarts. | We recommend that you take one of the following actions in response to the event:
Note
|
SystemMaintenance.Stop | Instance Stopped Due to System Maintenance | Serious |
| This system event is triggered 24 to 48 hours before the scheduled time of system maintenance when Alibaba Cloud detects a potential risk of hardware or software failure in the underlying host of an instance and the risk can cause the instance to stop. | We recommend that you take one of the following actions in response to the event:
Note You can modify the maintenance attributes of the instance to specify the default action that takes effect when the instance encounters a maintenance event. For more information, see Modify instance maintenance attributes. |
SystemMaintenance.Redeploy | Instance Redeployment Due to System Maintenance | Serious |
| This system event is triggered 24 to 48 hours before the scheduled time of system maintenance when Alibaba Cloud detects a potential risk of hardware or software failure in the underlying host of an instance and the risk can cause instance redeployment. Important If the instance is equipped with local SSDs or local HDDs, the data disks are re-initialized and the data is cleared. | We recommend that you make preparations such as modifying the /etc/fstab configuration file and backing up data, and then take one of the following actions in response to the event:
Note
|
SystemMaintenance.IsolateErrorDisk | Isolation of Damaged Local Disks Due to System Maintenance | Serious |
| This system event is immediately triggered when Alibaba Cloud detects hardware damage or software damage on a local disk of an instance. Important The procedure for handling a damaged local disk of an instance varies based on the instance type. For specific instance types, the instance must be restarted and the damaged local disk must be isolated. For other instance types, the damaged local disk can be isolated online and then repaired. | We recommend that you make preparations such as modifying the /etc/fstab configuration file and backing up data, and then select an appropriate point in time to authorize the damaged disk to be isolated. Then, the local disk is isolated online without the need to restart its associated instance. Note For more information, see the "Scenario ③" section of the O&M scenarios and system events for instances equipped with local disks topic. |
SystemMaintenance.ReInitErrorDisk | Re-initialization of Damaged Local Disks Due to System Maintenance | Serious |
| This system event is immediately triggered when Alibaba Cloud isolates and replaces a local disk on the host of an instance after Alibaba Cloud detects hardware damage or software damage in the local disk. In most cases, Alibaba Cloud isolates and replaces a damaged local disk within five business days after you authorize Alibaba Cloud to isolate the local disk. Important The procedure for handling a damaged local disk of an instance varies based on the instance type. For specific instance types, the instance must be restarted and the damaged local disk must be isolated. For other instance types, the damaged local disk can be isolated online and then repaired. | We recommend that you select an appropriate point in time to authorize the local disk to be restored. Then, the local disk is restored online without the need to restart its associated instance. Note For more information, see the "Scenario ③" section of the O&M scenarios and system events for instances equipped with local disks topic. |
SystemMaintenance.RebootAndIsolateErrorDisk | Isolation of Damaged Local Disks and Instance Restart Due to System Maintenance | Serious |
| This system event is immediately triggered when Alibaba Cloud detects hardware damage or software damage on a local disk of an instance and fails to isolate the local disk online. Important The procedure for handling a damaged local disk of an instance varies based on the instance type. For specific instance types, the instance must be restarted and the damaged local disk must be isolated. For other instance types, the damaged local disk can be isolated online and then repaired. | We recommend that you select an appropriate point in time to authorize the damaged disk to be isolated and restart the associated instance after the disk is isolated. In this case, the local disk is isolated offline, so you must restart its associated instance for the isolation operation to take effect. Note For more information, see the "Scenario ③" section of the O&M scenarios and system events for instances equipped with local disks topic. |
SystemMaintenance.RebootAndReInitErrorDisk | Re-initialization of Damaged Local Disks and Instance Restart Due to System Maintenance | Serious |
| This system event is immediately triggered when Alibaba Cloud detects hardware damage or software damage on a local disk of an instance and fails to restore the local disk online. Important The procedure for handling a damaged local disk of an instance varies based on the instance type. For specific instance types, the instance must be restarted and the damaged local disk must be isolated. For other instance types, the damaged local disk can be isolated online and then repaired. | We recommend that you select an appropriate point in time to authorize the local disk to be restored and restart the associated instance after the disk is restored. In this case, the local disk is restored offline, so you must restart its associated instance for the restoration operation to take effect. Note For more information, see the "Scenario ③" section of the O&M scenarios and system events for instances equipped with local disks topic. |
SystemMaintenance.StopAndRepair | In-place Repair of Instance Equipped With Local Disks | Serious |
| This system event is triggered 48 to 168 hours before the scheduled time of system maintenance when Alibaba Cloud detects a risk of hardware failure in the underlying host of an instance. | We recommend that you select an appropriate point in time to authorize Alibaba Cloud to repair or redeploy the instance that is equipped with local disks. Note For more information, see O&M scenarios and system events for instances equipped with local disks. |
SystemMaintenance.CleanReleasedDisks | Disk Cleanup after EBS Disk Hot Swapping Failure | Warn |
| This system event is triggered when Alibaba Cloud detects the configurations of one or more disks that were released due to overdue payments in the operating system of an instance. | We recommend that you select an appropriate point in time to authorize Alibaba Cloud to clear the configurations of the released disks. Important Alibaba Cloud stops the instance at the specified point in time and then clears the configurations of the disks. After the disk cleanup, the instance is started again. |
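The lead times in the table above can be turned into a response window. The following is a minimal sketch, not an official tool: the function name `expected_maintenance_window` and the `LEAD_TIME_HOURS` mapping are hypothetical, and the sketch assumes that you already know when the event was published. It covers only the events for which the table states a lead time. Events described as immediately triggered are left out.

```python
from datetime import datetime, timedelta, timezone

# Lead times in hours before the scheduled maintenance time, copied from the
# table above. Events that are "immediately triggered" have no entry here.
LEAD_TIME_HOURS = {
    "SystemMaintenance.Reboot": (24, 48),
    "SystemMaintenance.Stop": (24, 48),
    "SystemMaintenance.Redeploy": (24, 48),
    "SystemMaintenance.StopAndRepair": (48, 168),
}


def expected_maintenance_window(event_code, published_at):
    """Return (earliest, latest) scheduled maintenance times implied by the
    documented lead time, or None if no lead time is documented."""
    lead = LEAD_TIME_HOURS.get(event_code)
    if lead is None:
        return None
    min_hours, max_hours = lead
    return (
        published_at + timedelta(hours=min_hours),
        published_at + timedelta(hours=max_hours),
    )


# Example: an event published at midnight UTC on January 1, 2024.
published = datetime(2024, 1, 1, tzinfo=timezone.utc)
window = expected_maintenance_window("SystemMaintenance.StopAndRepair", published)
```

The actual scheduled time is reported with the event itself. A window like this is only useful as a rough planning aid, for example to decide how soon a backup must be finished.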
Unexpected O&M events
Event code | Event name | Event severity level | CloudMonitor event | Event description and impact | Handling suggestion |
SystemFailure.Reboot | Instance Restart Due to System Error | Serious |
| This system event is immediately triggered when Alibaba Cloud detects that an instance is restarted due to hardware or software failure in the underlying host, such as CPU or memory hardware damage. | We recommend that you wait until the instance is automatically restarted and then check whether the instance and applications work as expected. When the instance is being restarted, Alibaba Cloud migrates the instance to a healthy host. Note You can modify the maintenance attributes of the instance to specify the default action that takes effect when the instance encounters a maintenance event. For more information, see Modify instance maintenance attributes. |
InstanceFailure.Reboot | Instance Restart Due to OS Error | Serious |
| This system event is immediately triggered when Alibaba Cloud detects that the operating system of an instance is down due to issues such as an out-of-memory (OOM) error, a blue screen, a freeze, continuous printing of serial port logs, or a kernel panic. | We recommend that you wait until the instance is automatically restarted and then check whether the instance and applications work as expected. To troubleshoot the issue and prevent it from recurring, you can enable the kdump service on a Linux instance or the Kernel Memory Dump feature on a Windows instance. For more information, see How to enable the Kdump service for Linux instances and Enable the Kernel Memory Dump feature for a Windows instance. |
SystemFailure.Stop | Instance Stop Due to System Error | Serious |
| This system event is immediately triggered when Alibaba Cloud detects that an instance is stopped due to hardware or software failure in the underlying host, such as CPU or memory hardware damage. | We recommend that you wait until the instance is stopped and then start the instance. When the instance is being started, Alibaba Cloud migrates the instance to a healthy host. Note You can modify the maintenance attributes of the instance to specify the default action that takes effect when the instance encounters a maintenance event. For more information, see Modify instance maintenance attributes. |
SystemFailure.Redeploy | Instance Redeployment Due to System Error | Serious |
| This system event is immediately triggered when Alibaba Cloud detects hardware or software failure in the underlying host of an instance equipped with local disks and the instance must be redeployed. Note Only instances that depend on host hardware support this type of event, such as instances that are equipped with local disks or support Software Guard Extensions (SGX) confidential computing. | We recommend that you make preparations such as modifying the /etc/fstab configuration file and backing up data, and then take one of the following actions in response to the event:
Note You can modify the maintenance attributes of the instance to specify the default action that takes effect when the instance encounters a maintenance event. For more information, see Modify instance maintenance attributes. |
SystemFailure.Delete | Automatic Cancellation of Bills Due to Instance Creation Failures | Serious |
| This system event is immediately triggered when Alibaba Cloud detects that an instance creation order is placed but the instance fails to be created. | We recommend that you wait for the instance to be automatically released. In most cases, an instance is automatically released within 5 minutes after the instance fails to be created. Note If you already paid for the order, the payment is refunded after the instance is released. To ensure that instances can be created, we recommend that you take the following actions:
|
ErrorDetected | Local Disk Fault Alarm | Serious |
| This system event is immediately triggered when Alibaba Cloud detects hardware or software failure in the local disk of an instance and data cannot be read from the disk or written to the disk. | We recommend that you make preparations such as modifying the /etc/fstab configuration file and backing up data. Then, select an appropriate point in time to isolate and restore the damaged local disk. The supported operations vary based on the instance type.
Note For more information, see the "Scenario ③" section of the O&M scenarios and system events for instances equipped with local disks topic. |
Stalled | Severe Impacts on Disk Performance | Serious |
| This system event is immediately triggered when Alibaba Cloud detects that an I/O hang occurs on a disk of the instance, which significantly affects the disk performance and prevents the disk from handling read and write requests. | We recommend that you isolate reads and writes on the disk at the application layer or disassociate the ECS instance from the associated Server Load Balancer (SLB) instance. |
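The event codes in the two O&M tables above follow a naming pattern that can be used to route events to different handlers. The following is a minimal sketch under that assumption. The function name `om_event_category` and the category labels are hypothetical, and the prefixes are taken only from the codes listed in this topic.

```python
# Prefixes and codes taken from the two tables above.
SCHEDULED_PREFIXES = ("SystemMaintenance.",)
UNEXPECTED_PREFIXES = ("SystemFailure.", "InstanceFailure.")
UNEXPECTED_CODES = {"ErrorDetected", "Stalled"}


def om_event_category(event_code):
    """Classify an event code as 'scheduled', 'unexpected', or 'other'."""
    if event_code.startswith(SCHEDULED_PREFIXES):
        return "scheduled"
    if event_code.startswith(UNEXPECTED_PREFIXES) or event_code in UNEXPECTED_CODES:
        return "unexpected"
    return "other"


category = om_event_category("SystemFailure.Reboot")
```

Scheduled events leave time to plan a response, whereas unexpected events have already occurred, so a handler might open a change ticket for the first category and page an on-call engineer for the second.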
Instance migration events due to upgrades at the underlying layer
Event code | Event name | Event severity level | CloudMonitor event | Event description and impact | Handling suggestion |
SystemUpgrade.Migrate | Instance Migration Events Due to Upgrades at Underlying Layer | Serious | Undefined | This system event is triggered when instances are affected by the upgrades and improvements of physical infrastructure in regions and zones where these instances reside. | We recommend that you view event details in the ECS console and migrate affected instances as prompted. For more information, see Instance migration due to upgrades at the underlying layer. |
Performance-limited events of burstable instances
Event code | Event name | Event severity level | CloudMonitor event | Event description and impact | Handling suggestion |
Instance:BurstablePerformanceRestricted | Limited Performance of Burstable Instance | Warn | Instance:BurstablePerformanceRestricted: Burstable Instance Performance Limited | This system event is triggered when all accrued CPU credits of a burstable instance are consumed. | We recommend that you take one of the following actions in response to the event:
To specify a threshold for notifications about this event, for example to receive a notification when accrued CPU credits stay below 10 for 10 consecutive minutes, configure event-triggered alert rules for the event in the CloudMonitor console. For more information, see Monitor burstable instances. |
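The threshold condition in the example above (accrued CPU credits below 10 for 10 consecutive minutes) can be illustrated with a small check. This is a sketch of the logic only, not how CloudMonitor evaluates alert rules. The function name `credits_low_for` is hypothetical, and the sketch assumes one credit reading per minute, ordered from oldest to newest.

```python
def credits_low_for(samples, threshold=10.0, minutes=10):
    """Return True if the most recent `minutes` readings are all below
    `threshold`. Assumes one reading per minute, oldest first."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    if len(samples) < minutes:
        # Not enough history to cover the full period.
        return False
    return all(value < threshold for value in samples[-minutes:])


# Eleven readings: one healthy minute followed by ten low minutes.
readings = [50.0] + [5.0] * 10
alert = credits_low_for(readings)
```

Requiring the whole period to be below the threshold avoids a notification for a single brief dip in credits.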
State change events
Event code | Event name | Event severity level | CloudMonitor event | Event description and impact | Handling suggestion |
Instance:PreemptibleInstanceInterruption | Preemptible Instance Interruption | Warn | Instance:PreemptibleInstanceInterruption: Notification on Preemptible Instance Interruption | This system event is triggered 5 minutes before a preemptible instance is reclaimed. | We recommend that you take one of the following actions:
|
Instance:ModifyInstanceSpec.Reboot | Instance Restart Due to Instance Type Change | Serious |
| After you change the instance type of an instance, you must restart the instance for the new instance type to take effect. If you do not restart the instance within seven days after the new order takes effect, the system forcibly restarts the instance to apply the new instance type. | We recommend that you take one of the following actions:
|
Instance:PerformanceModeChange | Performance Mode Switchover of Burstable Instance | Warn | Instance:PerformanceModeChange: Performance Mode Switchover of Burstable Instance | This system event is triggered when a burstable instance switches between the unlimited mode and the standard mode. | We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications. |
Instance:StateChange | Instance Status Change | Notification | Instance:StateChange: Notification on Instance Status Change | This system event is triggered when the state of an instance changes, such as from Running to Stopping or from Stopping to Stopped. | We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications. |
Instance:AutoReactivateCompleted | Automatic Restart Completed | Notification | Instance:AutoReactivateCompleted: Automatic Restart Completed | This system event is triggered when you settle overdue payments in your account and an instance is automatically restarted. | We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications. |
Instance:LiveMigrationAcrossDDH | Instance Hot Migration Between Dedicated Hosts | Notification | Instance:LiveMigrationAcrossDDH: Instance Hot Migration Between Dedicated Hosts | This system event is triggered when an instance is hot migrated between dedicated hosts. | We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications. |
Disk:DiskOperationCompleted | Disk Operations Completed | Notification | Disk:DiskOperationCompleted: Disk Operations Completed | This system event is triggered when a pay-as-you-go disk is manually attached or detached. | We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications. |
Disk:ConvertToPostpaidCompleted | Billing Method of Disks Switched to Pay-as-you-go | Notification | Disk:ConvertToPostpaidCompleted: Billing Method of Disks Switched to Pay-as-you-go | This system event is triggered when a subscription disk is changed to a pay-as-you-go disk. | We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications. |
Snapshot:CreateSnapshotCompleted | Disk Snapshot Created | Notification | Snapshot:CreateSnapshotCompleted: Disk Snapshot Created | This system event is triggered when a snapshot is created for a disk. | We recommend that you determine whether to monitor the event. If you want to monitor the event, you can configure notifications for the event in the CloudMonitor console. For more information, see Subscribe to ECS system event notifications. |
Snapshot:SnapshotDeleted | Snapshot Deletion Completed | Notification | Snapshot:SnapshotDeleted: Snapshot Deleted | This system event is triggered when a manual or automatic snapshot is deleted. | None |
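Two events in the table above carry an implicit deadline: a preemptible instance is reclaimed 5 minutes after Instance:PreemptibleInstanceInterruption is triggered, and an instance whose type was changed is forcibly restarted seven days after the new order takes effect. The following sketch computes both deadlines. The function names are hypothetical, and the input timestamps are assumed to come from your own event records.

```python
from datetime import datetime, timedelta, timezone


def reclaim_deadline(interruption_event_time):
    """Approximate time at which a preemptible instance is reclaimed,
    5 minutes after the interruption event is triggered."""
    return interruption_event_time + timedelta(minutes=5)


def forced_restart_deadline(order_effective_time):
    """Time after which the system forcibly restarts an instance whose
    instance type was changed, seven days after the order takes effect."""
    return order_effective_time + timedelta(days=7)


event_time = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
deadline = reclaim_deadline(event_time)
```

A handler can compare the current time against these deadlines to decide whether there is still time to drain traffic or to restart the instance during a quiet period.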