Subscribe to Cloud Assistant events to build automated O&M response flows. For example, you can receive immediate alerts when automated tasks, such as installing software or running inspection scripts, fail. This replaces costly, high-latency manual polling.
Procedure
The following procedure uses the subscription to Cloud Assistant task status events as an example. For more information, see Cloud Assistant event descriptions.
Use EventBridge to subscribe to Cloud Assistant events
Before you begin, make sure that you have activated EventBridge and granted the required permissions.
Log on to the EventBridge console. In the navigation pane on the left, click Event Buses.
In the top menu bar, select a region.
On the Event Buses page, click default.
Click Event Rules in the navigation pane on the left, and then click Create Rule.
On the Configure Basic Info tab, enter a name in the Name text box and a description in the Description text box. Then, click Next Step.
On the Configure Event Pattern tab, configure the following parameters and then click Next Step.
From the Event Source drop-down list, select acs.ecs.
From the Event Type drop-down list, select the Cloud Assistant event type to which you want to subscribe.
Cloud Assistant task status event:
ecs:CloudAssistant:TaskCompleted
In the Event Pattern Debugging section, view a sample of the subscribed event type.
{
  "id": "45ef4dewdwe1-7c35-447a-bd93-fab****",
  "source": "acs.ecs",
  "specversion": "1.0",
  "subject": "acs.ecs:cn-hangzhou:123456789098****:215672",
  "time": "2020-11-19T21:04:41+08:00",
  "type": "ecs:CloudAssistant:TaskCompleted",
  "aliyunaccountid": "123456789098****",
  "aliyunpublishtime": "2020-11-19T21:04:42Z",
  "aliyuneventbusname": "default",
  "aliyunregionid": "cn-hangzhou",
  "aliyunpublishaddr": "172.25.XX.XX",
  "data": {
    "commandId": "c-hz045**********",
    "commandName": "hello-linux.sh",
    "exitCode": "0",
    "finishTime": "2023-12-14T07:39:48Z",
    "instanceId": "i-bp114***************",
    "invocationStatus": "Success",
    "invokeId": "t-hz045**********",
    "ownerId": "158*************",
    "playerUid": "256***************",
    "repeatMode": "Once",
    "repeats": "1",
    "startTime": "2023-12-14T07:39:48Z",
    "errorCode": "0",
    "errorDesc": ""
  }
}
Below the sample, click Test to simulate an event trigger. If the message "Match Succeeded. The event can be triggered as expected" appears, the event is configured to trigger as expected.
Configure an event target. Select a Service Type and configure a push scenario.
For more information, see Set push scenarios.
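As an illustration of what an event target (for example, a webhook or a function) might do with the delivered event, the following minimal Python sketch parses a trimmed copy of the sample event and extracts the fields that an alerting flow typically needs. The helper name `summarize_task_event` and the shape of the returned dictionary are assumptions made for this example and are not part of EventBridge or Cloud Assistant.

```python
import json

# Trimmed copy of the sample event from the Event Pattern Debugging step.
# Only the fields used below are kept.
SAMPLE_EVENT = """
{
  "id": "45ef4dewdwe1-7c35-447a-bd93-fab****",
  "source": "acs.ecs",
  "type": "ecs:CloudAssistant:TaskCompleted",
  "data": {
    "commandName": "hello-linux.sh",
    "exitCode": "0",
    "instanceId": "i-bp114***************",
    "invocationStatus": "Success",
    "invokeId": "t-hz045**********"
  }
}
"""

EXPECTED_SOURCE = "acs.ecs"
EXPECTED_TYPE = "ecs:CloudAssistant:TaskCompleted"


def summarize_task_event(raw: str) -> dict:
    """Hypothetical helper: validate the envelope and pull out key fields."""
    event = json.loads(raw)
    if event.get("source") != EXPECTED_SOURCE or event.get("type") != EXPECTED_TYPE:
        raise ValueError("not a Cloud Assistant task status event")
    data = event["data"]
    return {
        "event_id": event["id"],
        "instance_id": data["instanceId"],
        "invoke_id": data["invokeId"],
        "command_name": data["commandName"],
        "status": data["invocationStatus"],
        "exit_code": data["exitCode"],  # delivered as a string in the sample
    }


summary = summarize_task_event(SAMPLE_EVENT)
print(summary["command_name"], summary["status"])
```

Note that numeric-looking fields such as `exitCode` and `repeats` arrive as strings in the sample, so convert them explicitly if you need to compare them as numbers.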
Use CloudMonitor to subscribe to Cloud Assistant events
Log on to the CloudMonitor console.
In the navigation pane on the left, choose .
On the Subscription Policy tab, click Create Subscription Policy.
On the Create Subscription Policy page, configure the parameters to subscribe to Cloud Assistant events.
This example shows only the parameters related to Cloud Assistant events. For more information, see Subscription policy parameters.
Subscription Type: Select System Events.
Subscription Scope:
Product: Select Elastic Compute Service (ECS).
Event Type: Select Notifications.
Event Name: Select Cloud Assistant task status event.
After you complete the configuration, click Submit.
When a relevant event is triggered, you will receive a notification. You can also call the DescribeSystemEventAttribute operation to query the details of system events.
Cloud Assistant event descriptions
Cloud Assistant task status events
Event description
Commands and scripts take time to run. Cloud Assistant task status events help you track task completion. You can use these events for the following purposes:
Receive notifications when Cloud Assistant tasks fail or complete. You can use the notifications for alerts or to trigger subsequent operations.
Avoid polling, which consumes your API call quota. Event subscription delivers the result without repeated API calls.
Avoid the interruptions that can affect a long-running polling process, for example, when your application is released or restarted while it is still polling. Using events simplifies your workflow.
Triggering conditions and limits
Triggering conditions: When you call the RunCommand or InvokeCommand operation to run a task, Cloud Assistant monitors the task status. It sends a task status event when the task is complete.
Limits:
A Cloud Assistant task status event is sent only when a task on an ECS instance enters one of the following final states (InvocationStatus):
Aborted: The task failed to send.
Success: The task was successful.
Failed: The task failed.
Invalid: The task content is invalid.
Timeout: The task timed out.
Cancelled: The task was canceled.
Terminated: The task was terminated.
The DescribeInvocations and DescribeInvocationResults operations return data in the array<object> format. However, a task status event reports the status of a single task on a single instance, not multiple tasks.
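The final states listed above can be turned into a simple alerting policy. The following Python sketch is one possible policy, not a prescribed one: it assumes that every final state other than `Success` deserves attention, which you may want to relax for states such as `Cancelled` or `Terminated` that usually result from a deliberate action. The function name `needs_alert` is hypothetical.

```python
# Final states (invocationStatus) for which a task status event is sent.
FINAL_STATES = {
    "Aborted",
    "Success",
    "Failed",
    "Invalid",
    "Timeout",
    "Cancelled",
    "Terminated",
}


def needs_alert(invocation_status: str) -> bool:
    """Return True if the final state should raise an alert.

    Assumption for this sketch: anything other than Success is alert-worthy.
    """
    if invocation_status not in FINAL_STATES:
        # The event is only sent for final states, so an unknown value
        # suggests a malformed event or a state this sketch does not know.
        raise ValueError(f"unexpected invocationStatus: {invocation_status!r}")
    return invocation_status != "Success"


for status in ("Success", "Failed", "Timeout"):
    print(status, needs_alert(status))
```

Because each event covers a single task on a single instance, a command sent to many instances produces one event per instance, and the policy is applied to each event independently.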
Event fields
Field | Description | Example |
instanceId | The instance ID. | i-bp114*************** |
invokeId | The command execution ID. | t-hz045********** |
commandId | The command ID. | c-hz045********** |
commandName | The command name. | ACS-ECS-ResetPassword-for-linux.sh |
ownerUid | The account that owns the instance on which the command is run. | 158************* |
playerUid | The ID of the account that assumes a role to run the command. | 256*************** |
repeatMode | The execution mode of the command. This parameter is ignored if `InstanceId` is also specified. | Once |
repeats | The number of times the command has been run on the instance. | 0 |
invocationStatus | The execution status of the command. | Success |
exitCode | The exit code of the command process. | 0 |
startTime | The start time of the task. | 2023-12-20T06:15:55Z |
finishTime | The end time of the task. | 2023-12-20T06:15:59Z |
errorCode | The error code returned if the command failed to be sent or run. | 0 |
errorDesc | The details of the failure to send or run the command. | - |
Cloud Assistant first heartbeat events
Event description
Cloud Assistant heartbeats are one way to determine the status of an instance's operating system. The first heartbeat indicates when the operating system has started. You can use this information to check instance health or decide when to send Cloud Assistant commands.
Using first heartbeat events instead of polling the DescribeCloudAssistantStatus operation solves the following problems:
Polling DescribeCloudAssistantStatus to check if the status has changed to `true` is complex. An improper polling interval can generate too many requests, which can trigger throttling or place a strain on the system.
The startup time for an instance's operating system can vary greatly. Some Windows instances can take up to 5 minutes to start, which makes it difficult to control the total polling duration.
The status returned by DescribeCloudAssistantStatus can have a delay. There is a 2-minute lag between when a heartbeat stops and when the status changes. This makes it difficult for DescribeCloudAssistantStatus to detect instance restarts.
Triggering conditions and limits
Triggering conditions: When Cloud Assistant reports a heartbeat, it sends a first heartbeat event if it detects that this is the first heartbeat after the Cloud Assistant client has started.
Cloud Assistant version limits:
Windows instances: The Cloud Assistant Agent version must be later than 1.0.0.149.
Linux instances: The Cloud Assistant Agent version must be later than 1.0.2.569.
Older versions of Cloud Assistant do not report heartbeats every minute or do not report the index field. As a result, they cannot accurately identify the first heartbeat after startup. These older versions are not supported.
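When checking these version limits in a script, compare the dotted version numbers numerically rather than as strings, because a plain string comparison would rank `1.0.10.1` below `1.0.2.569`. The following Python sketch shows one way to do this. The function names and the `os_type` keys are assumptions for this example; the thresholds are the ones listed above, and "later than" is treated as a strict comparison.

```python
from typing import Tuple

# Versions must be strictly later than these values (taken from the limits above).
MIN_EXCLUSIVE_VERSION = {
    "windows": "1.0.0.149",
    "linux": "1.0.2.569",
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '2.2.3.529' into (2, 2, 3, 529) so components compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supports_first_heartbeat(os_type: str, agent_version: str) -> bool:
    """Hypothetical helper: True if the agent can emit first heartbeat events."""
    threshold = MIN_EXCLUSIVE_VERSION[os_type.lower()]
    return parse_version(agent_version) > parse_version(threshold)


print(supports_first_heartbeat("linux", "2.2.3.529"))
print(supports_first_heartbeat("windows", "1.0.0.149"))
```

The same comparison can be reused for other Cloud Assistant features that have their own minimum Agent versions by swapping in the relevant thresholds.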
Event fields
Field | Description | Example |
bizEventId | The event ID. | ea33c3e2-aaf0-****-****-5d49b1ecce99 |
vmName | The ID of the instance associated with the event. | i-bp19**************** |
extensions | Additional business information. | - |
azone | The zone. | cn-shenzhen-e |
region | The region. | cn-shenzhen |
agentVersion | Cloud Assistant Agent version. | 2.2.3.529 |
uptime | The amount of time the operating system has been running, in milliseconds. | 19000 |
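Because `uptime` is reported in milliseconds, a consumer can estimate when the operating system booted by subtracting it from the time of the heartbeat. The following Python sketch assumes that you already have the heartbeat timestamp as a timezone-aware `datetime` (for example, parsed from the event envelope) and that `uptime` was measured at roughly that moment. The function name `estimate_boot_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def estimate_boot_time(heartbeat_time: datetime, uptime_ms: int) -> datetime:
    """Estimate OS boot time from the heartbeat time and uptime in milliseconds."""
    if uptime_ms < 0:
        raise ValueError("uptime must not be negative")
    return heartbeat_time - timedelta(milliseconds=uptime_ms)


# Example values: the timestamp is illustrative; 19000 ms is the table's example uptime.
heartbeat = datetime(2023, 12, 20, 6, 15, 55, tzinfo=timezone.utc)
boot_time = estimate_boot_time(heartbeat, 19000)
print(boot_time.isoformat())  # 2023-12-20T06:15:36+00:00
```

Treat the result as an approximation, since event delivery and clock differences add some delay between the measurement and the timestamp you use.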
Cloud Assistant task execution output delivery result events
Event description
When you run a command, a maximum of 24 KB of the command output is retained. Any output that exceeds this limit is truncated.
If you want to obtain the complete output or persist the output, you can configure the output to be delivered to an OSS path when the command execution reaches a final state.
You can use this event for the following purposes:
Receive notifications and details about the output delivery. When you receive a success notification, you can download the output file from the corresponding OSS bucket. This avoids polling the DescribeInvocations operation to obtain results and improves efficiency.
When the delivery fails, you can obtain the detailed reason for the failure from the event.
Triggering conditions and limits
Triggering conditions: When you use RunCommand or InvokeCommand to run a task and specify a valid OssOutputDelivery parameter, this event is sent when the task reaches a final state.
Limits:
An event is sent only when a task on an instance enters one of the following final states (InvocationStatus):
Aborted: The task failed to send.
Success: The task was successful.
Failed: The task failed.
Invalid: The task content is invalid.
Timeout: The task timed out.
Cancelled: The task was canceled.
Terminated: The task was terminated.
Cloud Assistant version limits:
Windows instances: The Cloud Assistant Agent version must be later than 2.1.4.1007.
Linux instances: The Cloud Assistant Agent version must be later than 2.2.4.1007.
Event fields
Field | Description |
instanceId | The instance ID. |
invokeId | The command execution ID. |
ownerUid | The account that owns the instance on which the command is run. |
playerUid | The ID of the account that assumes a role to run the command. |
repeatMode | The execution mode of the command. |
repeats | The number of times the command has been run on the instance. |
ossOutputDelivery | The OSS configuration for command output delivery. |
ossOutputUri | The URI of the OSS file to which the command output is delivered. |
status | The delivery status. |
statusCode | The delivery status code. This parameter is returned only when the status is `Failed`. |
errorCode | The error code for the delivery failure. This parameter is returned only when the status is `Failed`. |
errorInfo | The error details for the delivery failure. This parameter is returned only when the status is `Failed`. |
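A consumer of this event typically branches on `status`: on failure it reports `errorCode` and `errorInfo`, and otherwise it reads the output from `ossOutputUri`. The following Python sketch illustrates that branching. It makes two assumptions that you should verify against real events: that any status other than `Failed` means the output was delivered, and that the placeholder values in the sample dictionaries (including the status value `Success`, the error code, and the URI) are purely illustrative. The function name `plan_delivery_followup` is hypothetical.

```python
def plan_delivery_followup(data: dict) -> dict:
    """Hypothetical helper: decide what to do with a delivery result event."""
    status = data.get("status")
    if status == "Failed":
        # statusCode, errorCode, and errorInfo are returned only on failure.
        return {
            "action": "alert",
            "invoke_id": data["invokeId"],
            "reason": f"{data.get('errorCode')}: {data.get('errorInfo')}",
        }
    # Assumption: any other status means the output is available in OSS.
    return {
        "action": "download",
        "invoke_id": data["invokeId"],
        "uri": data.get("ossOutputUri"),
    }


# Illustrative payloads; every value here is a placeholder, not real output.
failed = {
    "invokeId": "t-example",
    "status": "Failed",
    "errorCode": "ExampleErrorCode",
    "errorInfo": "example failure details",
}
delivered = {
    "invokeId": "t-example",
    "status": "Success",
    "ossOutputUri": "<oss-output-uri>",
}

print(plan_delivery_followup(failed)["action"])
print(plan_delivery_followup(delivered)["action"])
```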
Recommendations for production environments
Idempotence: Event systems may deliver the same event multiple times due to network issues or retries. Your processing logic must be idempotent. This means processing the same event multiple times produces the same result as processing it once. You can use the `id` or `data.bizEventId` from the event as a unique identifier. Before processing an event, check if this ID has already been processed.
Retry and dead-letter queue: When you configure an EventBridge event target, we strongly recommend that you configure a Retry Policy and a Dead-Letter Queue. If the processing function temporarily fails, EventBridge automatically retries. If the retry attempts fail, the event is sent to a dead-letter queue, such as an MNS queue. You can then manually investigate and recover the event to prevent data loss.
Monitoring and alerting: Monitor the event processing function itself. Monitor its execution success rate, duration, and error logs, and set up alerts. This lets you intervene immediately if the processing logic consistently fails.
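The idempotence recommendation can be sketched as a small deduplication wrapper. The following Python example is a minimal illustration under stated assumptions: it keeps processed IDs in an in-memory set, which only works for a single process, whereas a production handler would more likely use a shared store with an atomic check-and-set and an expiry. The names `event_key` and `handle_once` and the sample event are hypothetical.

```python
from typing import Callable, Set


def event_key(event: dict) -> str:
    """Use the envelope id when present, otherwise data.bizEventId."""
    key = event.get("id") or event.get("data", {}).get("bizEventId")
    if not key:
        raise ValueError("event has no usable identifier")
    return key


def handle_once(event: dict, seen: Set[str], process: Callable[[dict], None]) -> bool:
    """Run process(event) only if this event ID has not been handled before.

    Returns True if the event was processed, False if it was a duplicate.
    """
    key = event_key(event)
    if key in seen:
        return False
    process(event)
    # Record the ID only after processing succeeds, so that a failed
    # attempt can still be retried by the event system.
    seen.add(key)
    return True


seen_ids: Set[str] = set()
processed = []
sample = {"id": "evt-0001", "data": {"invokeId": "t-example"}}  # hypothetical event

first = handle_once(sample, seen_ids, processed.append)
second = handle_once(sample, seen_ids, processed.append)  # simulated redelivery
print(first, second, len(processed))  # True False 1
```

Recording the ID after processing means a crash between the two steps can still cause one repeat, so the processing step itself should also be safe to run more than once.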
FAQ
Why do I not receive event notifications after I subscribe to Cloud Assistant events using EventBridge?
Check the prerequisites: Confirm that the Cloud Assistant Agent version meets the requirements.
Check the EventBridge rule:
Log on to the EventBridge console. Confirm that the Event Pattern of the rule is correct. The `source` must be `acs.ecs`, and the `type` must be the correct event type.
Use the Event Pattern Debugging feature to test whether the rule matches a real event JSON sample.
Check the health of the event target:
On the event rule details page in the EventBridge console, view the invocation records and error logs of the event target.
Confirm that the target service, such as Function Compute or a webhook, is running normally and is reachable over the network.