Some data sources restrict access through a whitelist. For DataWorks functional modules (such as Data Integration, DataService Studio, metadata acquisition, and DataAnalysis) to access such data sources and run normally, you must add the outbound IP addresses or CIDR blocks of the corresponding modules to the whitelist of your data sources.
Background information
Different DataWorks functional modules access data sources over different network paths. When whitelist control is enabled on a data source, each type of module therefore requires its own authorization:
Resource group-related modules (such as Data Integration and DataService Studio): These modules access data sources over the network of the resource group. Add the CIDR blocks of the vSwitches or the public IP addresses of the resource groups to the whitelist of the data source so that the related tasks can run.
Platform service modules (such as metadata acquisition and DataAnalysis): These modules initiate access requests from service nodes that DataWorks maintains independently of the resource group network. Add the platform-preset dedicated CIDR blocks to the whitelist of the data source so that the whitelist covers every outbound node on the access path. Otherwise, these features may fail because of missing whitelist authorization.
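The distinction above can be summarized in a short Python sketch. The function name `whitelist_sources`, the module groupings, and the `network` argument values are illustrative assumptions and not part of any DataWorks API. The sketch only encodes the rule that resource group-related modules need the addresses of the resource group, while platform service modules need the platform-preset CIDR blocks.

```python
from typing import List

# Hypothetical grouping of modules, taken from the two categories described above.
RESOURCE_GROUP_MODULES = {"Data Integration", "DataService Studio"}
PLATFORM_SERVICE_MODULES = {"metadata acquisition", "DataAnalysis"}


def whitelist_sources(module: str, network: str = "internal") -> List[str]:
    """Return a description of the addresses the data source whitelist must cover.

    `network` is "internal" or "internet" and only matters for
    resource group-related modules (an assumption of this sketch).
    """
    if module in RESOURCE_GROUP_MODULES:
        if network == "internal":
            return ["vSwitch CIDR blocks of the resource group"]
        if network == "internet":
            return ["public IP addresses (EIPs) of the resource group"]
        raise ValueError("network must be 'internal' or 'internet'")
    if module in PLATFORM_SERVICE_MODULES:
        # These modules do not use the resource group network at all.
        return ["platform-preset CIDR blocks for " + module]
    raise ValueError("unknown module: " + module)


if __name__ == "__main__":
    for name in ("Data Integration", "DataAnalysis"):
        print(name, "->", whitelist_sources(name))
```

A data source that is used by both kinds of modules needs the entries from both branches.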
Prerequisites
The data source and the resource group are connected over the network. For more information, see Network connectivity solutions.
Obtain whitelist
Data Integration resource group whitelist
Serverless resource group
Obtain internal network CIDR blocks of resource groups
Applicable to scenarios where data sources and DataWorks are connected through internal networks. Add the CIDR blocks of the vSwitches bound to the resource groups to the whitelist of your data sources.
Go to the DataWorks resource group list page. At the top of the page, switch to the region where the target resource group is located, and then find the target resource group in the list.
Click Network Settings in the Actions column of the target resource group to go to the VPC Binding page of the resource group.
View the corresponding VSwitch CIDR Block under Data Scheduling & Data Integration.
Add the queried vSwitch CIDR blocks to the whitelist of your data source.
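Before you add a vSwitch CIDR block, it can help to check whether an existing whitelist entry already covers it. The following is a minimal sketch that uses Python's standard `ipaddress` module. The function name `is_covered` and all addresses are placeholders chosen for illustration, and the whitelist is assumed to be a plain list of IP addresses or CIDR strings that you copied from your data source.

```python
import ipaddress


def is_covered(vswitch_cidr, whitelist):
    """Return True if every address in vswitch_cidr is allowed by some whitelist entry."""
    # strict=True rejects a CIDR block with host bits set, such as 192.168.10.5/24.
    block = ipaddress.ip_network(vswitch_cidr, strict=True)
    for entry in whitelist:
        # A bare IP address such as "10.0.0.5" is treated as a single-address network.
        allowed = ipaddress.ip_network(entry, strict=False)
        if allowed.version == block.version and block.subnet_of(allowed):
            return True
    return False


if __name__ == "__main__":
    current_whitelist = ["192.168.0.0/16", "10.0.0.5"]  # placeholder entries
    print(is_covered("192.168.10.0/24", current_whitelist))  # True
    print(is_covered("172.16.1.0/24", current_whitelist))    # False
```

If the check returns False, the vSwitch CIDR block still needs to be added to the whitelist.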
Obtain public IP addresses of resource groups
Applicable to scenarios where data sources and DataWorks are connected through the internet. Add the EIP of the resource group to the whitelist of your data source.
Serverless resource groups do not have public network access capabilities by default. You need to configure a public NAT Gateway and EIP for the VPC bound to the resource group before it can access data sources through the internet.
Go to the DataWorks resource group list page. At the top of the page, switch to the region where the target resource group is located, and then find the target resource group in the list.
Click Network Settings in the Actions column of the target resource group to go to the VPC Binding page of the resource group.
Find the bound VPC under Data Scheduling & Data Integration, and click the icon after the VPC to go to the Basic Information page of the VPC.
Switch to the Resource Management tab, and in the Public Access Service section, click the number under Public NAT Gateway to go to the list page of public NAT gateways created for this VPC.
View the bound EIP on the public NAT gateway list page.
Add the queried EIP address to the whitelist of your data source.
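As a quick check for the last step, the sketch below tests whether an EIP is already allowed by the whitelist and, if not, returns the entry to add. It relies only on Python's standard `ipaddress` module. The function name `eip_whitelist_entry` is hypothetical, the addresses come from documentation-reserved ranges, and the entry format that your data source accepts (a plain IP address or a /32 block) is an assumption to verify against that product's documentation.

```python
import ipaddress


def eip_whitelist_entry(eip, whitelist):
    """Return None if the EIP is already allowed, otherwise the address to add."""
    address = ipaddress.ip_address(eip)  # raises ValueError for a malformed address
    for entry in whitelist:
        allowed = ipaddress.ip_network(entry, strict=False)
        if allowed.version == address.version and address in allowed:
            return None
    return str(address)


if __name__ == "__main__":
    print(eip_whitelist_entry("203.0.113.10", ["203.0.113.0/24"]))   # None
    print(eip_whitelist_entry("203.0.113.10", ["198.51.100.0/24"]))  # 203.0.113.10
```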
Legacy exclusive resource group for Data Integration
Obtain internal network CIDR blocks of resource groups
Applicable to scenarios where data sources and DataWorks are connected through internal networks. Add the CIDR blocks of the vSwitches bound to the resource groups to the whitelist of your data sources.
Go to the DataWorks resource group list page. At the top of the page, switch to the region where the target resource group is located, and then find the target resource group in the list.
Click Network Settings in the Actions column of the target resource group to go to the VPC Binding page of the resource group.
Find the VPC bound to the resource group, and then view the corresponding VSwitch CIDR Block.
Add the queried vSwitch CIDR blocks to the whitelist of your data source.
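When a resource group is bound to several vSwitches, you may end up with several CIDR blocks to add. If your data source limits the number of whitelist entries, adjacent or duplicate blocks can be merged first. The sketch below uses `ipaddress.collapse_addresses` from the Python standard library; the function name `merge_cidr_blocks` and the sample blocks are illustrative assumptions. Merging in this way never widens the allowed range, because only blocks that exactly combine into a larger block are joined.

```python
import ipaddress


def merge_cidr_blocks(cidrs):
    """Return the smallest list of CIDR strings that covers exactly the given blocks."""
    networks = [ipaddress.ip_network(c, strict=True) for c in cidrs]
    merged = []
    # collapse_addresses cannot mix IPv4 and IPv6, so handle each version separately.
    for version in (4, 6):
        same_version = [n for n in networks if n.version == version]
        merged.extend(ipaddress.collapse_addresses(same_version))
    return [str(n) for n in merged]


if __name__ == "__main__":
    blocks = ["10.0.0.0/25", "10.0.0.128/25", "10.0.1.0/24"]  # placeholder vSwitch blocks
    print(merge_cidr_blocks(blocks))  # ['10.0.0.0/23']
```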
Obtain public IP addresses of resource groups
Applicable to scenarios where data sources and DataWorks are connected through the internet. Add the EIP of the resource group to the whitelist of your data source.
Go to the DataWorks resource group list page. At the top of the page, switch to the region where the target resource group is located, and then find the target resource group in the list.
Click Details in the Actions column of the target resource group to go to the resource group details page.
Obtain the EIP address.
Add the queried EIP address to the whitelist of your data source.
Shared resource group for Data Integration
If you use the legacy shared resource group for Data Integration, you need to add the whitelist of the shared resource group for Data Integration to your data source.
DataService Studio resource group whitelist
Serverless resource group
Obtain internal network CIDR blocks of resource groups
Applicable to scenarios where data sources and DataWorks are connected through internal networks. Add the CIDR blocks of the vSwitches bound to the resource groups to the whitelist of your data sources.
Go to the DataWorks resource group list page. At the top of the page, switch to the region where the target resource group is located, and then find the target resource group in the list.
Click Network Settings in the Actions column of the target resource group to go to the VPC Binding page of the resource group.
View the corresponding VSwitch CIDR Block under DataService Studio.
Note: If no VPC and vSwitch are bound under DataService Studio, you can click Add Binding to bind them manually, and then obtain the vSwitch CIDR block.
Add the queried vSwitch CIDR blocks to the whitelist of your data source.
Obtain public IP addresses of resource groups
Applicable to scenarios where data sources and DataWorks are connected through the internet. Add the EIP of the resource group to the whitelist of your data source.
Serverless resource groups do not have public network access capabilities by default. You need to configure a public NAT Gateway and EIP for the VPC bound to the resource group before it can access data sources through the internet.
Go to the DataWorks resource group list page. At the top of the page, switch to the region where the target resource group is located, and then find the target resource group in the list.
Click Network Settings in the Actions column of the target resource group to go to the VPC Binding page of the resource group.
Find the bound VPC under DataService Studio, and click the icon after the VPC to go to the Basic Information page of the VPC.
Switch to the Resource Management tab, and in the Public Access Service section, click the number under Public NAT Gateway to go to the list page of public NAT gateways created for this VPC.
View the bound EIP on the public NAT gateway list page.
Add the queried EIP address to the whitelist of your data source.
Legacy exclusive resource group for DataService Studio
Obtain internal network CIDR blocks of resource groups
Applicable to scenarios where data sources and DataWorks are connected through internal networks. Add the CIDR blocks of the vSwitches bound to the resource groups to the whitelist of your data sources.
Go to the DataWorks resource group list page. At the top of the page, switch to the region where the target resource group is located, and then find the target resource group in the list.
Click Details in the Actions column of the target resource group to go to the resource group details page.
Obtain the VSwitch bound to the resource group, then go to the Virtual Private Cloud console, search for the vSwitch, and obtain the IPv4 CIDR Block of the vSwitch.
Add the queried vSwitch CIDR blocks to the whitelist of your data source.
Shared resource group for DataService Studio
If you use the shared resource group for DataService Studio, you need to add the whitelist of the shared resource group for DataService Studio to your data source.
Metadata acquisition whitelist
If a data source from which metadata is acquired has whitelist access control enabled, you need to add the metadata acquisition whitelist to that data source.
DataAnalysis whitelist
If the target MaxCompute for DataAnalysis has enabled whitelist control, you need to add the DataAnalysis whitelist to MaxCompute.
Add whitelist
If your data source is an Alibaba Cloud product, refer to the whitelist configuration documentation of that product to add the obtained whitelist to your data source:
Configure a public network whitelist for OpenSearch Vector Search Edition
The documentation listed above covers only some common Alibaba Cloud products. For data sources that are not listed, see the official documentation of the product.
If your data source is not an Alibaba Cloud product, see the official documentation of the relevant product for whitelist configuration methods.
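After the whitelist is updated, a plain TCP connection attempt is one generic way to confirm that the data source endpoint is reachable. The sketch below uses only the Python standard library. The function name `can_connect`, the host name, and the port are placeholders. The result is meaningful only when the script runs on a host whose outbound address is the one you added to the whitelist, and a failed attempt can also have causes other than the whitelist, such as routing or security group rules.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder endpoint: replace with the address and port of your data source.
    print(can_connect("datasource.example.com", 3306))
```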
References
For common issues related to network connectivity, see Resource group operations and network connectivity.
For common issues related to adding whitelists, see FAQ about adding whitelists.