This topic provides answers to some frequently asked questions about network connectivity.
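How does Realtime Compute for Apache Flink access the Internet?
How does Realtime Compute for Apache Flink access a service across VPCs?
How do I configure a whitelist?
How do I troubleshoot network issues?
How do I view the public bandwidth?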
How does Realtime Compute for Apache Flink access the Internet?
Background information
By default, Realtime Compute for Apache Flink cannot access the Internet. To enable communication between a virtual private cloud (VPC) and the Internet, Alibaba Cloud provides NAT gateways. After a NAT gateway is configured, user-defined extensions (UDXs) and DataStream code that run in Realtime Compute for Apache Flink can access the Internet.
Solution
Create a NAT gateway in the VPC. Then, create an SNAT entry to bind the vSwitch that is associated with Realtime Compute for Apache Flink to an elastic IP address (EIP). Realtime Compute for Apache Flink can then access the Internet by using the EIP. Perform the following steps:
Create a NAT gateway. For more information, see the "Create a NAT gateway" section of the Purchase an Internet NAT gateway topic.
Create an SNAT entry and bind the vSwitch that is associated with Realtime Compute for Apache Flink to an EIP. For more information, see the "Create an SNAT entry" section of the Create and manage SNAT entries topic.
How does Realtime Compute for Apache Flink access a service across VPCs?
You can use one of the following methods to allow Realtime Compute for Apache Flink to access a service across VPCs:
Submit a ticket to establish connections between VPCs by using services such as Express Connect. When you create the ticket, select Virtual Private Cloud (VPC) as the product name. You are charged when you use this method.
Connect network instances to a Cloud Enterprise Network (CEN) instance to enable network communication among the network instances. For more information, see Overview.
Use VPN Gateway to establish a VPN connection between VPCs. For more information, see Establish IPsec-VPN connections between two VPCs.
Release the resource that resides in a different VPC from Realtime Compute for Apache Flink. Then, purchase the same type of resource in the VPC in which Realtime Compute for Apache Flink resides.
Release the Realtime Compute for Apache Flink workspace. Then, purchase another Realtime Compute for Apache Flink workspace that is in the same VPC as the service that you want Realtime Compute for Apache Flink to access.
Enable Internet access for Realtime Compute for Apache Flink. This way, Realtime Compute for Apache Flink can access other services over the Internet. By default, Realtime Compute for Apache Flink cannot access the Internet. For more information about how to allow Realtime Compute for Apache Flink to access the Internet, see the How does Realtime Compute for Apache Flink access the Internet? section of this topic.
Note: The Internet has a longer latency than internal networks. If you have high performance requirements, we recommend that you do not use this method.
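To gauge how much latency the Internet path adds, you can compare TCP connection times to an internal endpoint and to a public endpoint. The following Python sketch is an illustration only and is not part of the product. The function name connect_latency_ms and its parameters are hypothetical, and the sketch measures only the time to establish a TCP connection, not end-to-end read or write latency.

```python
import socket
import statistics
import time


def connect_latency_ms(host: str, port: int, attempts: int = 5, timeout: float = 3.0) -> float:
    """Return the median time, in milliseconds, to open a TCP connection to host:port."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # create_connection raises OSError if the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Hypothetical usage: replace the placeholder endpoints with your own.
# internal_ms = connect_latency_ms("<internal-endpoint>", 3306)
# public_ms = connect_latency_ms("<public-endpoint>", 3306)
```

Run the function from a machine in the same VPC as the workspace, for example inside a UDX or DataStream job, so that the measurement reflects the network path that the deployment actually uses.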
How do I configure a whitelist?
In most cases, the upstream and downstream storage systems that are supported by Realtime Compute for Apache Flink do not allow access from external systems. Therefore, you must add the CIDR block of the vSwitch of Realtime Compute for Apache Flink to the whitelist of the storage system that Realtime Compute for Apache Flink needs to access. To do so, perform the following steps:
Log on to the Realtime Compute for Apache Flink console.
Find the workspace that you want to manage and choose More > Workspace Details in the Actions column. In the Workspace Details message, view the CIDR block of the vSwitch of the Realtime Compute for Apache Flink workspace.
Add the CIDR block of the vSwitch of Realtime Compute for Apache Flink to the whitelist of the storage system that Realtime Compute for Apache Flink needs to access.
For example, for information about how to configure a whitelist for an ApsaraDB RDS for MySQL database, see Configure an IP address whitelist.
Note: If you add a vSwitch later, you must also add the CIDR block of the new vSwitch to the whitelist of the storage system that Realtime Compute for Apache Flink needs to access.
Even if your vSwitch is not in the same zone as the upstream and downstream storage systems, the network is connected after you add the CIDR block of the vSwitch to the whitelist.
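When a workspace uses several vSwitches, it is easy to miss one of them in a whitelist. The following Python sketch checks which vSwitch CIDR blocks are not covered by any whitelist entry. It uses only the standard ipaddress module. The function name uncovered_cidrs and the sample CIDR blocks are hypothetical, and the sketch assumes that the whitelist is available as a list of IP addresses or CIDR blocks in string form.

```python
import ipaddress


def uncovered_cidrs(vswitch_cidrs, whitelist_entries):
    """Return the vSwitch CIDR blocks that no whitelist entry fully covers."""
    # A single IP address such as "10.0.0.5" is parsed as a /32 (or /128) network.
    allowed = [ipaddress.ip_network(entry, strict=False) for entry in whitelist_entries]
    missing = []
    for cidr in vswitch_cidrs:
        block = ipaddress.ip_network(cidr, strict=False)
        # subnet_of requires both networks to use the same IP version.
        covered = any(
            block.version == entry.version and block.subnet_of(entry)
            for entry in allowed
        )
        if not covered:
            missing.append(cidr)
    return missing


# Sample values for illustration only.
print(uncovered_cidrs(
    ["192.168.0.0/24", "192.168.1.0/24"],
    ["192.168.0.0/24", "10.0.0.0/8"],
))  # prints ['192.168.1.0/24']
```

Any block that the function returns still needs to be added to the whitelist of the storage system.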
How do I troubleshoot network issues?
A Realtime Compute for Apache Flink workspace is deployed in a VPC. After you purchase a Realtime Compute for Apache Flink workspace, you cannot change the VPC that you selected. If the source or the sink is not in the same VPC as the Realtime Compute for Apache Flink workspace, the source or the sink is disconnected from the Realtime Compute for Apache Flink workspace and data cannot be read from the source or written to the sink. If data cannot be read from the source or written to the sink, perform the following steps to check whether a network issue exists:
Check the network connectivity between the upstream storage service and the Realtime Compute for Apache Flink workspace.
By default, Realtime Compute for Apache Flink can access only storage services that are deployed in the same VPC and the same region as Realtime Compute for Apache Flink. If you want to access storage resources across VPCs or over the Internet, use the following methods:
To access storage resources across VPCs, you can use one of the methods that are described in the How does Realtime Compute for Apache Flink access a service across VPCs? section of this topic.
To access storage resources over the Internet, you can use NAT gateways that are provided by Alibaba Cloud to enable communication between the VPC and the Internet. For more information, see the How does Realtime Compute for Apache Flink access the Internet? section of this topic.
Check whether the CIDR block of the vSwitch to which the Realtime Compute for Apache Flink workspace belongs is added to the whitelists of the upstream storage services such as ApsaraMQ for Kafka and Elasticsearch.
If the CIDR block is not added to the whitelists of the upstream storage services, perform the following steps:
Log on to the Realtime Compute for Apache Flink console. Find the workspace that you want to manage and choose More > Workspace Details in the Actions column. In the Workspace Details message, view the CIDR block of the vSwitch of the Realtime Compute for Apache Flink workspace.
Add the CIDR block to the whitelists of the upstream storage services. For more information about how to add the CIDR block to the whitelists of the upstream storage services, see the topics that are linked in the prerequisites of the related DDL documentation, such as the topic that is linked in the prerequisites of Create a Message Queue for Apache Kafka source table.
If a network timeout error persists after you complete the preceding checks, the connection attempt may be timing out before the connection is established. In this case, increase the value of the connect.timeout parameter in the WITH clause. Default value: 30. Unit: seconds.
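Before you change the timeout, it can help to see how a plain TCP connection to the storage service fails. The following Python sketch is an illustration only. The function name probe and its return labels are hypothetical, and the default timeout of 30 seconds only mirrors the default value of connect.timeout that is described above. As a general rule of thumb rather than a guarantee, a timeout often indicates that packets are dropped on the way, for example because of a missing whitelist entry or a missing route between VPCs, whereas a refused connection indicates that the host is reachable but nothing listens on the port.

```python
import socket


def probe(host: str, port: int, timeout: float = 30.0) -> str:
    """Classify a TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.gaierror:
        # The host name could not be resolved.
        return "dns_error"
    except socket.timeout:
        # No response arrived within the timeout period.
        return "timeout"
    except ConnectionRefusedError:
        # The host responded, but the port is closed.
        return "refused"
    except OSError as exc:
        # Any other network error, such as an unreachable network.
        return f"error: {exc}"


# Hypothetical usage: replace the placeholder with the endpoint of your storage service.
# print(probe("<storage-endpoint>", 9092, timeout=5.0))
```

Run the probe from the same network environment as the deployment. A result from a machine outside the VPC of the workspace says little about the path that the deployment uses.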
How do I view the public bandwidth?
If the metric values of the deployment are normal and no backpressure exists in the deployment, but data reading or writing over the Internet is slow, you can view the public bandwidth to check whether it is the bottleneck. To view the public bandwidth, perform the following steps:
Log on to the Realtime Compute for Apache Flink console. Find the desired workspace and choose More > Workspace Details in the Actions column. In the Workspace Details message, view the VPC ID.
Log on to the VPC console. In the left-side navigation pane, click VPC. On the VPC page, find the desired VPC and click its ID.
On the Resource Management tab of the details page of the VPC, click the value of Internet NAT Gateway in the Access to Internet section.
Note: If the value of Internet NAT Gateway in the Access to Internet section is 0, you must create an Internet NAT gateway. For more information, see Create and manage an Internet NAT gateway.
On the Internet NAT Gateway page, click the ID of the Internet NAT gateway.
On the Associated EIP tab, click the ID of the EIP.
On the page that appears, click the Monitoring and O&M tab to view the public bandwidth.
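To judge whether the public bandwidth is the bottleneck, you can compare the throughput of the deployment with the bandwidth limit of the EIP. The following Python sketch shows the arithmetic. The function name bandwidth_utilization and the sample numbers are hypothetical, and the sketch assumes that the throughput is measured in bytes per second and that the bandwidth limit is expressed in Mbit/s, where 1 Mbit equals 1,000,000 bits.

```python
def bandwidth_utilization(bytes_per_second: float, bandwidth_mbps: float) -> float:
    """Return the fraction of the bandwidth limit that the observed throughput uses."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    # Convert bytes per second to Mbit/s: 8 bits per byte, 1,000,000 bits per Mbit.
    observed_mbps = bytes_per_second * 8 / 1_000_000
    return observed_mbps / bandwidth_mbps


# Sample values for illustration only: 11 MB/s of traffic on a 100 Mbit/s limit.
utilization = bandwidth_utilization(11_000_000, 100)
print(f"{utilization:.0%}")  # prints 88%
```

A value that stays close to 1.0 suggests that the public bandwidth limits the throughput of the deployment.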