Alibaba Cloud Resource Management service is a collection of enterprise IT governance products and services, including Resource Directory, Resource Group, Resource Sharing, and Tags. Enterprises can use Resource Directory to build organizational relationships in the cloud, manage cloud resources hierarchically using Resource Groups and Tags, and share cloud resources among enterprise members using Resource Sharing. It is a good practice to manage resources' lifecycle and cost allocation through Resource Management service.
Resource Directory and Resource Group Management
Resource Directory lets an enterprise quickly build a directory structure that reflects its business or ecosystem, and place each of its accounts at the corresponding position in that structure, forming a multi-level relationship between resources. Enterprises can rely on the established organizational relationships to centrally manage resources and meet the needs of resource control in finance, security, auditing, and compliance.
A best practice is to set up a dedicated financial account at the enterprise level to review and manage the cost of cloud resources. The enterprise should pay for and manage all accounts under the organization or unit to achieve unified financial management at the enterprise level. Enterprises can split bills by resource folder, account, or resource group.
The multi-account solution provided by Resource Directory, the tagged resource solution under a single account, and the resource group management solution can all meet the requirements of per-department split billing and solve the problem of financial cost allocation. When choosing among these schemes, enterprises should also take security and auditing requirements into account.
Tags Management
Tags are mainly used in the following scenarios:
Use tags to find resources: Tags are used to mark resources, and you can quickly find resources that are bound to specific tags through the Tag console or API. For more information, see Use tags to query resources.
Use tags for cost allocation: Plan tags for resources based on organization or business dimensions (such as region, department, environment, or project), then use Alibaba Cloud's cost analysis and cost allocation features to manage cost allocation.
Use tags for automated operation and maintenance: Bind different tags to different environments (such as production and testing), operating systems (such as Windows Server and Linux), or client platforms (such as iOS and Android). Then create templates in orchestration services and execute them to run automated maintenance operations in batches. For more information, see Tag-based Automated Operation and Maintenance Overview.
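As an illustration of tag-based cost allocation, the following minimal Python sketch sums cost per tag value. The record layout (resource_id, cost, tags) and the function name allocate_cost_by_tag are assumptions made for this example, not the schema of any Alibaba Cloud billing export.

```python
from collections import defaultdict


def allocate_cost_by_tag(bill_items, tag_key, untagged="untagged"):
    """Sum cost per value of `tag_key`; items without the tag go to `untagged`."""
    totals = defaultdict(float)
    for item in bill_items:
        value = item.get("tags", {}).get(tag_key, untagged)
        totals[value] += item["cost"]
    return dict(totals)


# Hypothetical billing records; the field names are illustrative only.
bill_items = [
    {"resource_id": "i-001", "cost": 120.0,
     "tags": {"department": "retail", "environment": "production"}},
    {"resource_id": "i-002", "cost": 30.0,
     "tags": {"department": "retail", "environment": "testing"}},
    {"resource_id": "rm-001", "cost": 80.0,
     "tags": {"department": "finance", "environment": "production"}},
    {"resource_id": "eip-001", "cost": 5.0, "tags": {}},
]

by_department = allocate_cost_by_tag(bill_items, "department")
by_environment = allocate_cost_by_tag(bill_items, "environment")
```

Grouping untagged items into their own bucket makes gaps in the tagging plan visible instead of silently dropping their cost.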
Resource Instances Management
Resource Management provides enterprises with a unified view of resources across accounts, products, and regions, as well as the ability to search for resources. Enterprises can view the global resource list to have a clear idea of the cloud resources. Resources can be searched based on filtering conditions such as region, product, resource type, resource ID, resource name, and resource tag. For search results, a one-click jump to the corresponding cloud service console is supported for further operations, making it more convenient to find and manage cloud resources.
Computing resources are usually divided into permanent resources and elastic resources. Permanent resources carry routine business, which usually means internal enterprise management platforms, online businesses, and persistent operations. Capacity planning for this part usually needs to be estimated from the utilization level (the "water level") of the actual business scenarios. Using CloudMonitor to monitor the running status of production system resources reflects their true utilization. By building a cloud resource monitoring system and continuously monitoring the metrics of the system and its resources, enterprises can optimize resource capacity.
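One way to turn monitored utilization into a capacity estimate is sketched below in Python. It is a simplified example under stated assumptions: the 95th-percentile rule, the 60% target utilization, and the function names are hypothetical choices for illustration, not a method prescribed by CloudMonitor.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recommended_instances(current_instances, cpu_samples,
                          target_pct=60.0, pct=95):
    """Estimate how many instances keep peak CPU near `target_pct`.

    Assumes load spreads evenly across instances, which is a simplification.
    """
    peak = percentile(cpu_samples, pct)
    return max(1, math.ceil(current_instances * peak / target_pct))


# Hypothetical CPU utilization samples (percent) from a 10-instance fleet.
cpu_samples = [20, 25, 30, 22, 28, 35, 24, 26, 27, 30]
suggestion = recommended_instances(10, cpu_samples)
```

With these sample values the fleet peaks at 35% CPU, so the sketch suggests that 6 instances would be enough to run near the 60% target.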
Unexpected traffic peaks and temporary tasks can be served by elastic resources. Use Auto Scaling to consume resources on demand and release them automatically, so that redundant resources that are not released in time do not cause cost waste. Auto Scaling adjusts computing capacity in a timely manner, improves resource utilization, and reduces the cost of ownership of resources. In addition, enterprises do not need to invest a large amount of manpower in adjusting computing resources, which saves labor and time costs. Auto Scaling can dynamically allocate resources based on business needs. For predictable changes in cloud resources, such as predictable business peak hours, scheduled scaling can provide the correct amount of resources in a timely manner. Also, based on workload analysis, use an elastic resource group to configure planned scale-out and scale-in.
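The idea of adjusting capacity to demand can be sketched with a simple proportional rule in Python. This is an illustrative assumption, not the algorithm that Auto Scaling itself uses; the function name, the 50% CPU target, and the size bounds are hypothetical.

```python
import math


def desired_instances(current, avg_cpu_pct, target_cpu_pct=50.0,
                      min_size=2, max_size=20):
    """Scale the group so that average CPU moves toward the target.

    The result is clamped to [min_size, max_size] so that a metric spike
    cannot grow the group without bound and a quiet period cannot shrink
    it below the safe minimum.
    """
    if current <= 0:
        return min_size
    raw = math.ceil(current * avg_cpu_pct / target_cpu_pct)
    return max(min_size, min(max_size, raw))


scale_out = desired_instances(4, 90.0)    # load above target
scale_in = desired_instances(4, 10.0)     # load far below target
capped = desired_instances(10, 150.0)     # would exceed max_size
```

Here the group grows from 4 to 8 instances at 90% CPU, shrinks to the minimum of 2 at 10% CPU, and is capped at 20 when the raw result would be 30.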
It is also recommended to use Alibaba Cloud APIs and automation capabilities to change the number of cloud resources in the architecture programmatically. It is a good practice to automatically increase the number of resources during peak demand periods to keep the business running smoothly, and to reduce capacity during periods of reduced demand to reduce costs. CloudOps Orchestration Service (OOS) is a cloud-based automated operation and maintenance service provided by Alibaba Cloud that automates the management and execution of tasks. Enterprises can define the tasks to execute, their execution order, and their inputs and outputs through templates, and then run the tasks automatically by executing these templates. For example, if a business system has specific traffic peak periods every day and needs a large number of instances only during those periods, OOS can be used to start and stop ECS instances on a schedule to save costs.
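The scheduled startup and shutdown logic can be sketched in plain Python as follows. This only models the decision; it does not call OOS or the ECS API, and the function name, instance IDs, and state strings are illustrative assumptions.

```python
from datetime import time


def plan_actions(instances, now, start_at, stop_at):
    """Decide which instances to start or stop for the current time.

    `instances` maps an instance ID to its state ("Running" or "Stopped").
    Instances should run only inside the window [start_at, stop_at).
    """
    want_running = start_at <= now < stop_at
    actions = {}
    for instance_id, state in instances.items():
        if want_running and state == "Stopped":
            actions[instance_id] = "start"
        elif not want_running and state == "Running":
            actions[instance_id] = "stop"
    return actions


# Hypothetical fleet: one instance running, one stopped.
fleet = {"i-a": "Running", "i-b": "Stopped"}
before_peak = plan_actions(fleet, time(7, 0), time(8, 0), time(20, 0))
during_peak = plan_actions(fleet, time(9, 0), time(8, 0), time(20, 0))
```

Returning only the instances whose state needs to change keeps the plan idempotent: running it twice in the same window produces no extra actions once the first plan has been applied.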
Database instances can also adapt to changes in business peaks through elastic settings. Upgrade instance specifications during business peak periods to absorb the peak and keep online businesses stable. Reduce instance resources during periods of reduced business load to save costs. You can use the automatic scaling and rollback features of Database Autonomy Service (DAS) to optimize the use of database instance resources. For example, after an average CPU utilization threshold is set for a database instance, the instance is automatically upgraded when the threshold is reached. Likewise, when average CPU utilization falls to the rollback threshold, an automatic rollback of the specifications is triggered and continues until the instance is back at its original specifications.
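The threshold-driven upgrade and rollback behavior described above can be modeled with a short Python sketch. The specification names, the 80% and 30% thresholds, and the one-step-at-a-time policy are assumptions for illustration and do not describe the actual DAS implementation.

```python
# Hypothetical ordered list of instance specifications, smallest first.
SPECS = ["small", "medium", "large"]


def next_spec(current, original, cpu_pct, specs=SPECS,
              scale_up_at=80.0, roll_back_at=30.0):
    """Return the specification to use after observing average CPU.

    Upgrades one step when CPU reaches `scale_up_at`; rolls back one step
    when CPU drops to `roll_back_at`, but never below `original`.
    """
    idx = specs.index(current)
    if cpu_pct >= scale_up_at and idx < len(specs) - 1:
        return specs[idx + 1]
    if cpu_pct <= roll_back_at and idx > specs.index(original):
        return specs[idx - 1]
    return current


upgraded = next_spec("small", "small", 85.0)
rolled_back = next_spec("medium", "small", 20.0)
at_floor = next_spec("small", "small", 20.0)
```

Keeping a gap between the two thresholds avoids flapping, where an instance would be upgraded and rolled back repeatedly around a single threshold value.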
Reasonable use of discount resources and discount plans can reduce the cost of resource instances. Discount resources refer to storage capacity unit packages, storage resource packages, PolarDB, and other resource packages. After purchasing these resources, they belong to the user's assets and can be used to offset the pay-as-you-go usage of specified resources, thereby achieving cost savings. Enterprises can use the Resource Management Tool to manage deduction resource instances by viewing the summary view and usage view of deduction resources to allocate deduction quotas reasonably. It should be noted that different discount resources have different capacity types, and the rules for deducting resources may vary depending on the capacity type.
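The offsetting arithmetic can be sketched in Python as below. It is a simplified example that assumes a single package with one capacity type and a flat unit price; as noted above, real deduction rules vary by capacity type, and the function and field names are hypothetical.

```python
def apply_resource_package(usage_gb, package_remaining_gb, unit_price):
    """Offset pay-as-you-go usage with a prepaid resource package.

    Usage is deducted from the package first; only the remainder is
    billed at the pay-as-you-go `unit_price` (price per GB).
    """
    deducted = min(usage_gb, package_remaining_gb)
    billable = usage_gb - deducted
    return {
        "deducted_gb": deducted,
        "package_remaining_gb": package_remaining_gb - deducted,
        "pay_as_you_go_cost": billable * unit_price,
    }


# Hypothetical month: 500 GB used, 300 GB left in the package, 0.5 per GB.
result = apply_resource_package(500, 300, 0.5)
```

In this example 300 GB is covered by the package and the remaining 200 GB is billed, giving a pay-as-you-go cost of 100.0.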
Resource Quotas Management
Alibaba Cloud optimizes global resource allocation through quantified limits on the number of resources and operations, such as API throttling limits. The maximum value of cloud resources that users can use or the maximum number of operations they can perform is the resource quota limit. From the perspective of cloud resource operation and maintenance, enterprises can query the quota limits for each Alibaba Cloud product through the Quota Center, and adjust quotas online according to business needs. Through the Quota Center, quota warnings can be set to reduce the impact of insufficient resources on business. From the perspective of resource supply, Alibaba Cloud can provide a certain level of instance resource supply certainty based on user rights. Enterprises can use the quota management service of the Alibaba Cloud steward service to view and increase instance quotas.
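A quota warning is essentially a comparison of usage against a limit. The Python sketch below illustrates that check with hypothetical quota names and an assumed 80% warning ratio; it does not call the Quota Center API.

```python
def quota_alerts(quotas, warn_ratio=0.8):
    """Return the names of quotas whose usage has reached `warn_ratio`.

    `quotas` maps a quota name to a (used, limit) pair. Quotas with a
    non-positive limit are skipped to avoid dividing by zero.
    """
    alerts = []
    for name, (used, limit) in sorted(quotas.items()):
        if limit > 0 and used / limit >= warn_ratio:
            alerts.append(name)
    return alerts


# Hypothetical usage snapshot: (used, limit) per quota.
snapshot = {
    "ecs_instances": (85, 100),
    "vpcs": (3, 10),
    "eips": (18, 20),
}
near_limit = quota_alerts(snapshot)
```

Raising the warning before the limit is reached leaves time to request a quota increase before resource creation starts to fail.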
Resource Reduction and Stopping
Reduce idle resources automatically and promptly through resource lifecycle planning, and stop idle resources as soon as they are identified. Tags and resource groups can be used to mark the lifecycle of resources when they are created. For example, tags such as "machines used in the testing phase" or "instances created for a task that expires in 3 months" help teams understand how resources are used and take appropriate measures.
When resources need to be reduced according to business needs, a system whose architecture makes full use of Auto Scaling scales in automatically, following the predefined scaling rules, as business volume falls.
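Lifecycle tags become actionable when they carry a machine-readable value. The Python sketch below assumes a hypothetical tag key, expires-on, holding an ISO date, and lists the resources whose date has passed; the key name and record layout are illustrative and not an Alibaba Cloud convention.

```python
from datetime import date


def expired_resources(resources, today, tag_key="expires-on"):
    """Return IDs of resources whose expiry tag is on or before `today`.

    Resources without the tag are ignored, so they need a separate review.
    """
    expired = []
    for resource in resources:
        value = resource.get("tags", {}).get(tag_key)
        if value and date.fromisoformat(value) <= today:
            expired.append(resource["id"])
    return expired


# Hypothetical inventory with lifecycle tags set at creation time.
inventory = [
    {"id": "i-test-01", "tags": {"phase": "testing", "expires-on": "2024-05-31"}},
    {"id": "i-task-02", "tags": {"expires-on": "2024-12-31"}},
    {"id": "i-prod-03", "tags": {"phase": "production"}},
]
to_review = expired_resources(inventory, date(2024, 6, 1))
```

The resulting list can feed a review or an automated stop task, so that resources created for a fixed period do not keep running after it ends.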
For resources that do not have automatic elastic capabilities, when business volume decreases or the service is stopped, resources should be reduced and stopped in a timely manner. One example is to reduce the number of VPN client connections and lower the ECS and RDS instance specifications after completing the entire research and development process.
Resource utilization levels can be viewed in resource monitoring tools, and monitoring or alarms on workload throughput can be set up. When workload throughput decreases, act promptly to avoid resource waste. Enterprises can use OOS to stop these machines automatically.
It is recommended to use serverless computing resources such as Function Compute to build serverless and event-driven architectures on Alibaba Cloud. Enterprises do not need to manage infrastructure such as servers; they only need to write and upload code, and Function Compute prepares computing resources and runs the code elastically and reliably. Serverless services optimize resource utilization automatically and provide automatic shutdown and scaling (both down and up). With serverless applications, resource utilization is optimized automatically, with no need to prepay for over-provisioned capacity.
Resources Release
Timely release of resources that are no longer needed is a common practice in resource management to save resource costs. This is not only for the purpose of saving cloud resource costs but also considering the management costs and security risks involved in retaining unused resources.
For example, if a temporary Elastic IP (EIP) is requested for the convenience of the development process and it is no longer needed, the EIP should be released in a timely manner.
For example, during the process of system migration and cutover, temporary business links may be established to help complete the cutover process and improve the rollback plan. After completing the system migration and cutover, these resources need to be released in a timely manner.
It is recommended that enterprises use the pay-as-you-go billing method for temporary resources during resource planning. Once such a resource is released, it no longer incurs charges. If an enterprise needs to release a subscription resource partway through a billing cycle and first needs to confirm that the release will not affect the business, it can convert the resource to pay-as-you-go. After confirming that releasing the resource has no impact on the business, the enterprise can release it.
Managing Data Transmission and Traffic
Manage the location of ongoing data transmission, the cost of transmission, and related business goals. Monitor data transmission bandwidth and quality.
Idle resources on data transmission links that are no longer needed should be released in a timely manner. For example, during the process of migrating data to the cloud, temporary data links are often established. This may also involve gateway-type products. After the data migration is completed, the temporary data links need to be removed and the resources in the links need to be released.
Data transmission management can be combined with the enterprise's security design. Implementing a security scheme reduces unnecessary and dangerous data transmission. For data transmission billed by traffic, a good security scheme not only mitigates transmission security and data security risks but also reduces unnecessary transmission costs. By enabling Cloud Firewall and setting up a Protection Whitelist for inbound and outbound traffic of Elastic IP addresses, unnecessary traffic and costs can be reduced. By enabling access logs for Cloud Firewall, you can view client access and promptly rate-limit clients whose access exceeds reasonable usage. More security products can be found in the Security Protection section of the Security Pillar.
Storage Resource Management
Adopt a storage and computing separation architecture design to manage computing resources and storage resources separately. Manage the storage scheme of resources according to different storage product types. For example, Object Storage Service (OSS) provides storage planning based on standard storage capacity, infrequent access storage capacity, archival storage capacity, and cold archival storage capacity. Define data retention policies and lifecycle policies to perform automatic storage class migration and deletion, which will reduce overall storage costs throughout the lifecycle.
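A lifecycle policy of this kind maps object age to a storage class or to deletion. The Python sketch below illustrates the decision; the age thresholds (30, 180, 365, and 730 days) and the class labels are assumptions chosen for the example, and real lifecycle rules are configured in OSS itself rather than in application code.

```python
# Hypothetical transition rules: (minimum age in days, target storage class),
# ordered from the oldest threshold to the newest.
TRANSITIONS = [
    (365, "ColdArchive"),
    (180, "Archive"),
    (30, "IA"),
]


def storage_action(age_days, expire_after_days=730, transitions=TRANSITIONS):
    """Return the storage class for an object of this age, or "Delete"."""
    if age_days >= expire_after_days:
        return "Delete"
    for min_age, storage_class in transitions:
        if age_days >= min_age:
            return storage_class
    return "Standard"


plan = {age: storage_action(age) for age in (10, 45, 200, 400, 800)}
```

Because each colder class typically trades a lower storage price for higher retrieval cost or latency, the thresholds should follow how often the data is actually read.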
Resource Utilization Tracking
Enterprises need to track the resource utilization of each resource in workloads and adjust resources and instance specifications based on utilization changes. Operations and maintenance personnel need to monitor not only high resource utilization and high workload but also low resource utilization and low workload. For elastic business scenarios and uncertain business scenarios, it is a good practice to start with lower resource configurations and then expand resources as needed.
Enterprises can use the CloudMonitor service provided by Alibaba Cloud to monitor the metrics of various cloud resources in real time and understand how cloud resources are used. Alarms can be generated based on baseline thresholds for each metric. Users can adjust alarm thresholds as business traffic changes to track the actual business utilization level and avoid excessive responses.
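Threshold alarms that watch both high and low utilization can be sketched in Python as follows. The thresholds, the requirement of three consecutive samples, and the function name are illustrative assumptions and not CloudMonitor's alarm rule syntax.

```python
def evaluate_alarm(samples, high=80.0, low=10.0, consecutive=3):
    """Classify recent utilization samples (percent) against two thresholds.

    An alarm state is reported only when the last `consecutive` samples all
    cross the same threshold, which filters out short spikes and dips.
    """
    recent = samples[-consecutive:]
    if len(recent) < consecutive:
        return "insufficient_data"
    if all(sample >= high for sample in recent):
        return "high"
    if all(sample <= low for sample in recent):
        return "low"
    return "ok"


sustained_high = evaluate_alarm([50, 85, 90, 95])
single_spike = evaluate_alarm([50, 85, 60, 95])
sustained_low = evaluate_alarm([5, 4, 3])
```

The low state matters for cost as much as the high state matters for stability: a resource that stays under the low threshold is a candidate for a smaller specification or for release.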
Resource Billing Types Management
By managing the billing types of resources, users can optimize resource usage and cost structure. Refer to the Billing Optimization module in the Cost Optimization stage for optimizing resource billing methods.