This topic describes how to automatically remediate non-compliant resources across accounts in an enterprise. In this topic, the cross-account resource management capability of a resource directory and the account group feature of Cloud Config are used.
Prerequisites
A resource directory is enabled. For more information, see Enable a resource directory.
Function Compute is activated. For more information, see Step 1: Activate Function Compute.
Important: When you use Function Compute functions, you are charged for the number of function calls, resource usage, and outbound Internet traffic. For more information, see Billing overview.
Background information
Cloud Config detects non-compliant resources based on rules, and you can configure custom remediation for those resources. To remediate non-compliant resources across accounts, an enterprise can use the organizational structure of a resource directory to manage accounts and resources. For more information about resource directories, see Resource Directory overview.
In this example, the ecs-instance-monitor-enabled rule is used to detect and automatically remediate non-compliant resources across accounts. Account A (ID: 100931896542****) is the management account or a delegated administrator account of a resource directory. Account A and Account B (ID: 178366182654****) belong to the same resource directory, and Account B has non-compliant resources. The following steps describe how to log on to Cloud Config as Account A and detect and remediate non-compliant resources in Account B.
Step 1: Create a role for the management account and attach policies to the role
Log on to the Resource Access Management (RAM) console.
Create a RAM role.
In the left-side navigation pane, choose Identities > Roles.
On the Roles page, click Create Role. On the Create Role page, configure the parameters.
Set the Select Trusted Entity parameter to Alibaba Cloud Account and click Next.
Enter a RAM role name in the RAM Role Name field. In this example, enter ConfigCustomRemediationRole.
Set the Select Trusted Alibaba Cloud Account parameter to Current Alibaba Cloud Account.
Click OK.
Click Close.
Create a permissions policy.
In the left-side navigation pane, choose Permissions > Policies.
Click Create Policy. The Create Policy page appears.
On the JSON tab, enter the following policy script.
This policy allows the entity that assumes the role to install the CloudMonitor agent.
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "cms:InstallMonitoringAgent",
      "Resource": "*"
    },
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
Click Next to edit policy information. On the page that appears, enter a policy name. In this example, enter ConfigCustomRemediationPolicy.
Click Save.
Grant permissions to the role.
In the left-side navigation pane, choose Permissions > Grants.
On the Permission page, click Grant Permission. In the Grant Permission panel, grant permissions to the role that you created.
Set the Resource Scope parameter to Account.
In the Principal section, enter ConfigCustomRemediationRole in the search box and click the displayed role to select the role.
Select Custom Policy from the drop-down list in the Policy section. Enter ConfigCustomRemediationPolicy in the search box and click the displayed policy to select the policy.
Click Grant permissions.
Attach a trust policy to the role.
In the left-side navigation pane, choose Identities > Roles.
On the Roles page, search for the ConfigCustomRemediationRole role and click the name of the role to go to the role configuration page.
On the Trust Policy tab, click Edit Trust Policy. On the page that appears, enter the following policy script.
This trust policy allows the Function Compute service to assume the role.
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "RAM": [
          "acs:ram::100931896542****:root"
        ]
      }
    },
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "fc.aliyuncs.com"
        ]
      }
    }
  ],
  "Version": "1"
}
Click Save trust policy document.
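As a sanity check before pasting, the two Step 1 documents can be assembled in Python and serialized to JSON. This is a minimal sketch that uses only the standard library. The helper names permissions_policy, management_trust_policy, and to_console_json are hypothetical, and the account ID is the masked placeholder from this example, which you would replace with your own.

```python
import json

# Masked placeholder from this example; replace it with the real ID of Account A.
MANAGEMENT_ACCOUNT_ID = "100931896542****"


def permissions_policy():
    """Build the ConfigCustomRemediationPolicy document as a dict (hypothetical helper)."""
    return {
        "Version": "1",
        "Statement": [
            {"Effect": "Allow", "Action": "cms:InstallMonitoringAgent", "Resource": "*"},
            {"Effect": "Allow", "Action": "sts:AssumeRole", "Resource": "*"},
        ],
    }


def management_trust_policy(account_id):
    """Build the trust policy that lets the account and Function Compute assume the role."""
    return {
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"RAM": [f"acs:ram::{account_id}:root"]},
            },
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"Service": ["fc.aliyuncs.com"]},
            },
        ],
        "Version": "1",
    }


def to_console_json(document):
    """Serialize a policy dict to indented JSON text that can be pasted into the editor."""
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    print(to_console_json(permissions_policy()))
    print(to_console_json(management_trust_policy(MANAGEMENT_ACCOUNT_ID)))
```

Generating the text from a dict guarantees that the output parses as JSON, which catches a stray comma or bracket before the console does.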
Step 2: Create a role for the member and attach policies to the role
Create a RAM role and grant permissions to the role.
For more information, see Substeps 1 to 4 in Step 1.
Attach a trust policy to the role.
For more information, see Substep 5 in Step 1. Replace the policy script with the following sample script.
This trust policy allows Account A (ID: 100931896542****) to assume the role.
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "RAM": [
          "acs:ram::178366182654****:root"
        ]
      }
    },
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "RAM": [
          "acs:ram::100931896542****:role/configcustomremediationrole"
        ]
      }
    }
  ],
  "Version": "1"
}
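If a resource directory has many members, the member trust policy can be generated per account instead of edited by hand. The following is a minimal sketch under that assumption. The helper names role_arn, member_trust_policy, and trusted_ram_principals are hypothetical, and the lowercase role name follows the ARN used in the sample policy above.

```python
import json


def role_arn(account_id, role_name="configcustomremediationrole"):
    """Return the ARN of the remediation role in the given account."""
    return f"acs:ram::{account_id}:role/{role_name}"


def member_trust_policy(member_id, management_id):
    """Build the trust policy for the role in a member account.

    The first statement trusts the member account itself. The second statement
    trusts the remediation role in the management account.
    """
    return {
        "Statement": [
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"RAM": [f"acs:ram::{member_id}:root"]},
            },
            {
                "Action": "sts:AssumeRole",
                "Effect": "Allow",
                "Principal": {"RAM": [role_arn(management_id)]},
            },
        ],
        "Version": "1",
    }


def trusted_ram_principals(policy):
    """List every RAM principal that the policy allows to call sts:AssumeRole."""
    principals = []
    for statement in policy["Statement"]:
        if statement.get("Effect") == "Allow" and statement.get("Action") == "sts:AssumeRole":
            principals.extend(statement.get("Principal", {}).get("RAM", []))
    return principals


if __name__ == "__main__":
    # Masked placeholder IDs from this example: Account B, then Account A.
    policy = member_trust_policy("178366182654****", "100931896542****")
    print(json.dumps(policy, indent=2))
```

Checking that role_arn of the management account appears in trusted_ram_principals is a quick way to confirm that the function in Account A will be allowed to assume the role in Account B.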
Step 3: Create a custom remediation function
Log on to the Function Compute console.
Create a service.
In the left-side navigation pane, click Services & Functions.
On the Services page, click Create Service. The Create Service panel appears.
Enter a service name in the Name field. In this example, enter ConfigRemediationService.
Click Show Advanced Options and set the Service Role parameter to ConfigCustomRemediationRole.
Click OK.
Create a custom remediation function.
In the left-side navigation pane of the service details page, click Functions.
Click Create Function. The Create Function page appears.
Select Use Built-in Runtime.
In the Basic Settings section, enter the function name. In this example, the Function Name parameter is set to ConfigRemediationFunction. Set the Handler Type parameter to Event Handler.
In the Code section, set the Runtime parameter to Python 3.9 and the Code Upload Method parameter to Use Sample Code.
Click Create to go to the function details page.
On the Code tab, enter the following sample code for the resource remediation function.
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
import json
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.acs_exception.exceptions import ClientException
from aliyunsdkcore.acs_exception.exceptions import ServerException
from aliyunsdkcore.request import CommonRequest
from aliyunsdkcore.auth.credentials import StsTokenCredential
from aliyunsdksts.request.v20150401.AssumeRoleRequest import AssumeRoleRequest
import logging

logger = logging.getLogger()


# The sample code is used to remediate non-compliant resources based on the
# ecs-instance-monitor-enabled rule. You can modify the remediation logic
# based on your business requirements.
def handler(event, context):
    get_resources_non_compliant(event, context)


def get_resources_non_compliant(event, context):
    # parse_json returns None if the event cannot be parsed. Fall back to an
    # empty list so that the loop does not fail on None.
    resources = parse_json(event) or []
    for resource in resources:
        remediation(resource, context)


def parse_json(content):
    """
    Parse string to json object
    :param content: json string content
    :return: Json object
    """
    try:
        return json.loads(content)
    except Exception as e:
        logger.error('Parse content:{} to json error:{}.'.format(content, e))
        return None


def remediation(resource, context):
    logger.info(resource)
    region_id = resource['regionId']
    account_id = resource['accountId']
    resource_id = resource['resourceId']
    resource_type = resource['resourceType']
    config_rule_id = resource['configRuleId']
    if resource_type == 'ACS::ECS::Instance':
        logger.info("process account_id: {}, resource_id: {}, config_rule_id: {}".format(
            account_id, resource_id, config_rule_id))
        install_monitoring_agent(context, account_id, region_id, resource_id)


def install_monitoring_agent(context, account_id, resource_region_id, resource_id):
    logger.info("start install agent {}: {}".format(resource_region_id, resource_id))
    token = assume_role_and_get_token(context, account_id, resource_region_id)
    client = AcsClient(token['Credentials']['AccessKeyId'],
                       token['Credentials']['AccessKeySecret'],
                       region_id=resource_region_id)
    request = CommonRequest()
    request.set_accept_format('json')
    request.set_domain(f'metrics.{resource_region_id}.aliyuncs.com')
    request.set_method('POST')
    request.set_protocol_type('https')  # https | http
    request.set_version('2019-01-01')
    request.set_action_name('InstallMonitoringAgent')
    request.add_query_param('InstanceIds.1', resource_id)
    request.add_query_param('Force', "true")
    request.add_query_param('SecurityToken', token['Credentials']['SecurityToken'])
    response = client.do_action_with_exception(request)
    logger.info(response)


# Assume the role to obtain a temporary Security Token Service (STS) token.
# Replace the role name in the sample code with the actual role that you use.
def assume_role_and_get_token(context, account_id, region_id):
    creds = context.credentials
    logger.info('assume_role_and_get_token begin.')
    credentials = StsTokenCredential(creds.access_key_id, creds.access_key_secret, creds.security_token)
    client = AcsClient(credential=credentials)
    request = AssumeRoleRequest()
    request.set_domain(f'sts-vpc.{region_id}.aliyuncs.com')
    request.set_accept_format('json')
    request.set_RoleArn(f'acs:ram::{account_id}:role/configcustomremediationrole')
    request.set_RoleSessionName("ConfigCustomRemediationRole")
    response = client.do_action_with_exception(request)
    logger.info('assume_role_and_get_token response : {}.'.format(response))
    token = json.loads(response)
    logger.info('assume_role_and_get_token: {}, assume role: {}.'.format(context.credentials, token))
    return token
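The sample function treats the event as a JSON array of resource objects and reads the regionId, accountId, resourceId, resourceType, and configRuleId fields from each one. The selection logic can be exercised locally without the Alibaba Cloud SDK, as in the following minimal sketch. The plan_remediation helper and the SAMPLE_EVENT payload are hypothetical: the field names come from the sample code, the values are invented for illustration, and the payload that Cloud Config actually sends may contain more fields.

```python
import json

# Hypothetical payload: field names match the sample function, values are invented.
SAMPLE_EVENT = json.dumps([
    {
        "accountId": "178366182654****",
        "regionId": "cn-hangzhou",
        "resourceId": "i-example0001",
        "resourceType": "ACS::ECS::Instance",
        "configRuleId": "cr-example0001",
    },
    {
        "accountId": "178366182654****",
        "regionId": "cn-hangzhou",
        "resourceId": "d-example0001",
        "resourceType": "ACS::ECS::Disk",
        "configRuleId": "cr-example0001",
    },
])


def plan_remediation(event):
    """Return one entry per ECS instance in the event, without calling any API.

    Each entry lists the role ARN and the endpoints that the sample function
    would use for that resource.
    """
    try:
        resources = json.loads(event)
    except (TypeError, ValueError):
        return []
    if not isinstance(resources, list):
        return []

    plan = []
    for resource in resources:
        # The sample function only remediates ECS instances.
        if resource.get("resourceType") != "ACS::ECS::Instance":
            continue
        region_id = resource["regionId"]
        plan.append({
            "instance_id": resource["resourceId"],
            "role_arn": f"acs:ram::{resource['accountId']}:role/configcustomremediationrole",
            "sts_endpoint": f"sts-vpc.{region_id}.aliyuncs.com",
            "cms_endpoint": f"metrics.{region_id}.aliyuncs.com",
        })
    return plan


if __name__ == "__main__":
    for entry in plan_remediation(SAMPLE_EVENT):
        print(entry)
```

Because the role ARN is derived from the accountId of each resource, the role in Step 2 must exist under the same name in every member account that the rule covers.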
Step 4: Create a rule and configure custom remediation
Log on to the Cloud Config console.
Create an account group and add Account A and Account B to the account group.
For more information, see Create an account group.
In the upper-left corner of the Cloud Config console, switch to the account group that you created in the previous step.
Create a rule. For more information, see Create a rule based on a managed rule.
In the Select Create Method step, select Based on managed rule, search for and select the ecs-instance-monitor-enabled rule, and then click Next.
In the Set Basic Properties step, configure the Rule Name, Risk Level, Trigger, and Description parameters and click Next.
In the Set Effective Scope step, retain the default settings and click Next.
In the Set Remediation step, turn on Set Remediation, set the Invoke Type parameter to Automatic Remediation, and then select the Function Compute function that you created in Step 3 in the Function ARN section. Click Submit.
Note: While you test the custom remediation function, you can set the Invoke Type parameter to Manual Remediation to observe and debug the function. After the function passes the test, set the Invoke Type parameter to Automatic Remediation.
Step 5: Implement automatic remediation and verify the remediation result
On the Rules page, find the rule that you want to manage, and click Remediation Detail in the Remediation Template column.
On the Remediation Detail tab, click Perform Manual Remediation next to Remediation Detail.
In the Execution Result List section, you can view the remediation results. You can also view the reason why a resource fails to be remediated.
Note: On the Remediation Detail tab, click the function ARN next to Remediation Template to go to the Code tab of the remediation function in the Function Compute console.