Monitor and maintain Internet NAT gateways - NAT Gateway

You can use CloudMonitor to monitor Internet NAT gateways, collect information about inbound and outbound traffic, collect data on various metrics in real time, and generate time series curves in the NAT Gateway console. This helps you troubleshoot issues.

Enhanced NAT gateway

Metric monitoring data

Internet NAT gateway monitoring

Log on to the NAT Gateway console.
In the top navigation bar, select the region where the NAT gateway is deployed.

On the Internet NAT Gateway page, find the Internet NAT gateway and click the icon in the Monitor column.

The following table describes the metrics.

Category	Metric	Description
Session Monitor	SessionActiveConnection/ErrorPortAllocationCount(count)	SessionActiveConnection: the maximum number of concurrent TCP and UDP connections per minute that are supported by the NAT gateway. ErrorPortAllocationCount: the number of times that the NAT gateway fails to allocate a TCP or UDP port when the number of concurrent connections to the destination address exceeds the upper limit. Note Each elastic IP address (EIP) provides a limited number of ports for SNAT. If the number of user sessions that access the same destination address is excessively large and the number of EIPs specified in SNAT entries is insufficient, port allocation may fail. If the number of port allocation failures keeps increasing, we recommend that you specify more EIPs in SNAT entries. For more information, see Create an SNAT IP address pool.
	SessionLimitDropConnection(countS)	The rate of concurrent connections that are dropped due to the limit of concurrent connections to the NAT gateway.
	SessionNewConnection/SessionNewLimitDropConnection(countS)	SessionNewConnection: the number of new TCP and UDP connections that are established to the NAT gateway per second. SessionNewLimitDropConnection: the number of new connections that are dropped per second due to the limit of new connections that can be established to the NAT gateway per second.
	SessionNewConnectionWater/SessionNewLimitDropConnectionWater(%)	SessionNewConnectionWater: the percentage of established connections to the upper limit of connections. SessionNewLimitDropConnectionWater: the percentage of established new connections to the upper limit of new connections. Note Each NAT gateway supports 100,000 new connections per second and 2,000,000 concurrent connections per minute. If your service triggers a scale-up, the adjustment typically takes effect within 10 minutes.
Incoming Flow Statistics	BWRateToInside(bps)	The amount of inbound traffic per second, including the following two metrics: BWRateInFromOutside: the amount of traffic per second from the Internet to the NAT gateway. BWRateOutToInside: the amount of traffic per second from the NAT gateway to the VPC.
	BytesToInside(bytes)	The total amount of inbound traffic, including the following two metrics: BytesInFromOutside: the amount of traffic from the Internet to the NAT gateway. BytesOutToInside: the amount of traffic from the NAT gateway to the VPC.
	PacketsPerSecond(countS)	The number of inbound packets per second, including the following two metrics: PPSRateInFromOutside: the number of packets per second from the Internet to the NAT gateway. PPSRateOutToInside: the number of packets per second from the NAT gateway to the VPC.
	Packets(count)	The total number of inbound packets, including the following two metrics: PacketsInFromOutside: the number of packets from the Internet to the NAT gateway. PacketsOutToInside: the number of packets from the NAT gateway to the VPC.
Outlet Flow Statistics	BWRateToOutside(bps)	The amount of outbound traffic per second, including the following two metrics: BWRateOutToOutside: the amount of traffic per second from the NAT gateway to the Internet. BWRateInFromInside: the amount of traffic per second from the VPC to the NAT gateway.
	BytesToOutside(bytes)	The total amount of outbound traffic, including the following two metrics: BytesOutToOutside: the amount of traffic from the NAT gateway to the Internet. BytesInFromInside: the amount of traffic from the VPC to the NAT gateway.
	PacketsPerSecond(countS)	The number of outbound packets per second, including the following two metrics: PPSRateOutToOutside: the number of packets per second from the NAT gateway to the Internet. PPSRateInFromInside: the number of packets per second from the VPC to the NAT gateway.
	Packets(count)	The number of outbound packets, including the following two metrics: PacketsOutToOutside: the number of packets from the NAT gateway to the Internet. PacketsInFromInside: the number of packets from the VPC to the NAT gateway.
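
The two percentage metrics can be read as a ratio of an observed value to a limit. The following minimal sketch shows that arithmetic, assuming the limits quoted in the note above (100,000 new connections per second and 2,000,000 concurrent connections). The function names and the 80% warning level are hypothetical and are not part of the NAT Gateway console or API.

```python
# Limits quoted in the metrics table; treat them as assumptions for this sketch.
NEW_CONNECTIONS_PER_SECOND_LIMIT = 100_000
CONCURRENT_CONNECTIONS_LIMIT = 2_000_000


def watermark_percent(observed: float, limit: float) -> float:
    """Return the observed value as a percentage of the limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return observed * 100.0 / limit


def needs_attention(observed: float, limit: float, warn_at_percent: float = 80.0) -> bool:
    """Return True when usage reaches the (hypothetical) warning level."""
    return watermark_percent(observed, limit) >= warn_at_percent


if __name__ == "__main__":
    print(watermark_percent(25_000, NEW_CONNECTIONS_PER_SECOND_LIMIT))
    print(needs_attention(1_700_000, CONCURRENT_CONNECTIONS_LIMIT))
```

A check like this is only a local reading aid. The values displayed in the console remain the reference.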

View traffic monitoring data of Internet NAT gateways

If your Elastic Compute Service (ECS) instances access the Internet through SNAT, abnormal traffic on some ECS instances can affect other ECS instances. After you enable the traffic monitoring feature, you can view the traffic monitoring data of ECS instances that access the Internet through SNAT. This allows you to find the ECS instances with the highest data transfer. You can manage data transfer rules of these ECS instances to identify and handle issues and improve service stability. Before you view traffic monitoring data, make sure that the following requirements are met:

An Internet NAT gateway is created. For more information, see Create and manage Internet NAT gateways.
A ticket is submitted to apply for the required permissions to view traffic monitoring data.

Log on to the NAT Gateway console.
In the top navigation bar, select the region where the NAT gateway is deployed.
On the Internet NAT Gateway page, find the NAT gateway that you want to manage and click Manage in the Actions column.
On the Basic Information page, click the Monitor tab.

Click the Traffic Details tab to view the traffic monitoring data.

You can view traffic monitoring data at one-minute granularity. For example, if you set the time to 18:30:00 on July 18, 2024, you can view traffic monitoring data from 18:30:00 to 18:31:00 on July 18, 2024.
Note
- After you enable traffic monitoring, you must wait about 15 minutes before you can view the traffic monitoring data.
- The monitoring data is delayed by 3 to 5 minutes. For example, at 18:30 on July 18, 2024, you can view the traffic monitoring data generated before 18:25 on July 18, 2024, but not the data generated after that time.
- The traffic monitoring feature can display the top 100 ECS instances with the largest amount of data transfer.
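
The timing rules above can be sketched as simple date arithmetic: the one-minute window that a selected time covers, the most recent time for which data is viewable, and the earliest time at which data appears after the feature is enabled. This is an illustration only. The function names are hypothetical, and the 5-minute and 15-minute values are the upper bounds stated in the notes.

```python
from datetime import datetime, timedelta


def minute_window(selected: datetime) -> tuple[datetime, datetime]:
    """Return the one-minute window that starts at the selected minute."""
    start = selected.replace(second=0, microsecond=0)
    return start, start + timedelta(minutes=1)


def latest_viewable(now: datetime, delay: timedelta = timedelta(minutes=5)) -> datetime:
    """Return the most recent time for which data can be expected, given the delay."""
    return now - delay


def first_viewable(enabled_at: datetime, wait: timedelta = timedelta(minutes=15)) -> datetime:
    """Return the approximate time at which data becomes viewable after enabling."""
    return enabled_at + wait


if __name__ == "__main__":
    print(minute_window(datetime(2024, 7, 18, 18, 30, 0)))
    print(latest_viewable(datetime(2024, 7, 18, 18, 30)))
```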

The following table describes the metrics of the traffic monitoring feature.

Metric	Unit	Description
Inbound Bandwidth	bps. Note: The unit displayed in the console takes precedence.	The bandwidth that is used to access ECS instances over the Internet.
Outbound Bandwidth	bps. Note: The unit displayed in the console takes precedence.	The bandwidth that is used to access the Internet from ECS instances.
Inbound Packets Per Second	Packets/second	The number of packets from the Internet to ECS instances per second.
Outbound Packets Per Second	Packets/second	The number of packets from an ECS instance to the Internet per second.
Concurrent Connections	Connections	The number of concurrent connections established by an ECS instance that accesses the Internet through the NAT gateway.
New Connections per Second	Connections/second	The number of new connections established per second by an ECS instance that accesses the Internet through the NAT gateway.
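
If you copy per-instance figures from the Traffic Details tab into a script, ranking them mirrors what the console does when it shows the instances with the largest data transfer. The following minimal sketch assumes the data is a mapping from instance ID to outbound bandwidth in bps. The instance IDs, the function name, and the input format are hypothetical, and the sketch does not call any NAT Gateway API.

```python
import heapq


def top_instances(traffic_by_instance: dict[str, float], limit: int = 100) -> list[tuple[str, float]]:
    """Return up to `limit` (instance_id, value) pairs, largest value first."""
    return heapq.nlargest(limit, traffic_by_instance.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical sample: outbound bandwidth in bps per ECS instance.
    sample = {"i-aaa": 120.0, "i-bbb": 900.0, "i-ccc": 450.0}
    for instance_id, bps in top_instances(sample, limit=2):
        print(instance_id, bps)
```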

View monitoring data of EIPs that are associated with Internet NAT gateways

Log on to the NAT Gateway console.
In the top navigation bar, select the region where the NAT gateway is deployed.
On the Internet NAT Gateway page, find the NAT gateway that you want to manage and click Manage in the Actions column.

Click the Monitoring and Logging tab, and then click the EIP Monitoring Associated with NAT Service tab to view the monitoring metrics.

The following table describes the metrics.

Metric	Description
InboundBandwidth	The bandwidth used to access ECS instances over the Internet. Unit: bit/s.
OutboundBandwidth	The bandwidth used to access the Internet from ECS instances. Unit: bit/s.
InboundPacketRate	The number of packets from the Internet to ECS instances. Unit: packets per second (PPS).
OutboundPacketRate	The number of packets from ECS instances to the Internet. Unit: PPS.
OutRatelimitDropSpeed	The number of outbound packets dropped per second due to throttling. Unit: PPS.
InRatelimitDropSpeed	The number of inbound packets dropped per second due to throttling. Unit: PPS.
InternetInRatePercentage	The bandwidth usage of inbound traffic from the Internet to ECS instances.
InternetOutRatePercentage	The bandwidth usage of outbound traffic from ECS instances to the Internet.

Create a threshold-triggered alert rule

You can create alert rules to monitor the usage and status of Internet NAT gateways in real time. This ensures the stability of your workloads.

Log on to the CloudMonitor console.
In the left-side navigation pane, choose Alerts > Alert Rules.
On the Alert Rules page, click Create Alert Rule.

In the Create Alert Rule panel, set the following parameters and click Confirm:

This topic describes only the key parameters. For more information about the other parameters, see Create an alert rule.

Parameter	Description
Product	The name of the service that you want to monitor by using CloudMonitor. Example: enhanced_nat_gateway.
Resource Range	The resources to which the alert rule is applied. Valid values: All Resources: The alert rule is applied to all your instances of the specified type. For example, if you set the Resource Range parameter to All Resources and the alert threshold for CPU utilization to 80% for ApsaraDB for MongoDB, CloudMonitor sends an alert notification when the CPU utilization of an ApsaraDB for MongoDB instance exceeds 80%. If you set the Resource Range parameter to All Resources, the alert rule is applied to up to 1,000 instances. If the specified service has more than 1,000 instances, you may not receive an alert notification when the value of the specified metric reaches the threshold. We recommend that you add resources to application groups before you create alert rules. Instances: The alert rule is applied to a specific instance. For example, if you set the Resource Range parameter to Instances and the alert threshold of CPU utilization to 80% for an ECS instance, CloudMonitor sends an alert notification when the CPU utilization of the ECS instance exceeds 80%.
Rule Name	Enter a name for the alert rule.
Rule Description	The content of the alert rule. This parameter specifies the conditions that trigger the alert rule. For example, if the condition states that the average CPU utilization over 5 minutes is greater than or equal to 90% for three consecutive cycles, CloudMonitor checks the condition once every 5 minutes and triggers an alert only when the condition is met three times in a row.
Mute For	The interval at which CloudMonitor resends an alert notification if the issue that triggers the alert persists.
Effective date	Set the period during which the alert rule is effective. The system monitors the metrics and generates alerts only during the effective period.
Alert Contact Group	The contact group to which alert notifications are sent.
Advanced Settings
HTTP Callback	The webhook URL that can be accessed over the Internet. CloudMonitor sends a POST request to push an alert notification to the webhook URL that you specify. Only HTTP requests are supported.
Method to handle alerts when no monitoring data is found	Specify the method that is used to handle alerts if no monitoring data exists. Valid values: Do not do anything (default), Send alert notifications, and Treated as normal.
Tag	Specify tags for the alert rule. A tag consists of a tag key and a tag value.
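
The Rule Description and Mute For parameters can be illustrated with a small simulation: a rule fires once a threshold has been met for a number of consecutive cycles, and a notification is resent only after the mute interval has elapsed. This is a simplified sketch of that logic, not CloudMonitor's implementation. The function names, the sample values, and the exact muting behavior are assumptions.

```python
def fired_cycles(averages: list[float], threshold: float, consecutive: int) -> list[int]:
    """Return the indices of cycles at which the condition has held for
    at least `consecutive` cycles in a row (value >= threshold)."""
    streak = 0
    fired = []
    for index, value in enumerate(averages):
        streak = streak + 1 if value >= threshold else 0
        if streak >= consecutive:
            fired.append(index)
    return fired


def notified_cycles(fired: list[int], cycle_minutes: int, mute_minutes: int) -> list[int]:
    """Return the fired cycles at which a notification is sent, skipping
    cycles that fall inside the mute interval after the previous one."""
    sent = []
    last_sent_at = None
    for index in fired:
        minute = index * cycle_minutes
        if last_sent_at is None or minute - last_sent_at >= mute_minutes:
            sent.append(index)
            last_sent_at = minute
    return sent


if __name__ == "__main__":
    # Hypothetical 5-minute averages of a metric, in percent.
    averages = [85, 92, 95, 91, 93, 96, 70, 95]
    fired = fired_cycles(averages, threshold=90, consecutive=3)
    print(fired)
    print(notified_cycles(fired, cycle_minutes=5, mute_minutes=10))
```

In this sample the condition first holds for three cycles in a row at index 3, so no alert is raised for the two earlier cycles that exceed the threshold.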

References

PutResourceMetricRule: sets a threshold-triggered alert rule for the metrics of a single resource.
CreateMetricRuleResources: creates a resource associated with an alert rule.