DataWorks: Isolate a data source in the development and production environments

Last Updated: Feb 26, 2025

DataWorks provides the data source isolation feature for workspaces in standard mode. This way, data of the development environment can be isolated from data of the production environment.

Background information

In a workspace in standard mode, each data source has two sets of settings: one for the development environment and one for the production environment. You can point each set of settings at a different database or data warehouse. When a synchronization task runs, the environment in which it runs determines which database of the data source the task accesses. This way, data of the development environment is isolated from data of the production environment. For more information about workspaces in standard mode, see Differences between workspaces in basic mode and workspaces in standard mode.

  • In a workspace in standard mode, Operation Center in the development environment and DataStudio access the data source that is configured in the development environment by default.

  • When you run a task in Operation Center in the production environment, Operation Center in the production environment accesses the data source that is configured in the production environment by default.
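The default routing described above can be pictured with a minimal Python sketch. This is an illustration only, not DataWorks code: the `DATA_SOURCES` registry, the `orders_mysql` name, the hosts, and the `resolve` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionSettings:
    jdbc_url: str
    username: str


# Hypothetical registry: one data source name, two sets of settings.
DATA_SOURCES = {
    "orders_mysql": {
        "dev": ConnectionSettings("jdbc:mysql://dev-host:3306/orders_dev", "dev_user"),
        "prod": ConnectionSettings("jdbc:mysql://prod-host:3306/orders", "prod_user"),
    }
}

# Environment that each entry point uses by default, as listed above.
DEFAULT_ENVIRONMENT = {
    "DataStudio": "dev",
    "Operation Center (development)": "dev",
    "Operation Center (production)": "prod",
}


def resolve(data_source_name: str, entry_point: str) -> ConnectionSettings:
    """Return the settings a task would use when started from entry_point."""
    environment = DEFAULT_ENVIRONMENT[entry_point]
    return DATA_SOURCES[data_source_name][environment]
```

The task refers to the data source only by name. The entry point decides which set of settings is used, which is why the same task can read different databases in the two environments.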

Note
  • You can configure different databases, usernames, and passwords for the same data source in the development and production environments. As a result, a synchronization task that uses the data source may run successfully on the DataStudio page but fail in the production environment. Make sure that the database or data warehouse in each environment is configured based on your business requirements. If a task succeeds on the DataStudio page but fails in the production environment, or if the amount of data differs between the two environments, troubleshoot the issue by comparing the success logs of the task in the development environment with the error logs of the task in the production environment.

  • Tasks are deployed to the production environment for running. If the configurations of the data source in the development and production environments differ, make sure that the resource group you want to use can connect to the data source in each environment.
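When a task behaves differently in the two environments, comparing the two sets of settings side by side can complement the log comparison described in the note. The following is a minimal sketch that assumes you have copied the settings of each environment into plain dictionaries by hand. The `diff_settings` helper and the field names are hypothetical and do not correspond to a DataWorks API.

```python
def diff_settings(dev: dict, prod: dict, secret_keys=("password",)) -> dict:
    """Return {field: (dev_value, prod_value)} for every field that differs."""
    differences = {}
    for key in sorted(set(dev) | set(prod)):
        dev_value, prod_value = dev.get(key), prod.get(key)
        if dev_value != prod_value:
            if key in secret_keys:
                # Report that the secret differs without printing it.
                differences[key] = ("<redacted>", "<redacted>")
            else:
                differences[key] = (dev_value, prod_value)
    return differences
```

A field that is present in only one environment is reported with `None` on the other side.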

The data source isolation feature has the following impacts on workspaces:

  • Only workspaces in standard mode support the data source isolation feature. You can specify different databases or data warehouses for the same data source in a workspace in standard mode when you add the data source to the workspace.

    Note

    A workspace in basic mode provides only one environment. Therefore, data cannot be isolated by environment. For more information about workspace modes, see Scenario: Upgrade a workspace from the basic mode to the standard mode.

  • After you upgrade a workspace from the basic mode to the standard mode, the original data source is configured in both the development and production environments.

Procedure

  1. Go to the Data Sources page.

    1. Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose More > Management Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Management Center.

    2. In the left-side navigation pane of the SettingCenter page, click Data Sources.

  2. On the Data Sources page, perform the following operations.

    Operation

    Description

    Batch add data sources

    You can add multiple MySQL, SQL Server, PolarDB, or Oracle data sources at a time. Other data sources do not support batch addition.

    DataWorks provides templates that you can use to add multiple data sources at a time. Download a template, fill in its fields, and then upload it. The progress and results are displayed in the Batch Add Data Sources dialog box. The template contains the following fields: DataSourceType, DataSourceName, description, environment type (0 for the development environment, 1 for the production environment), JDBC URL, username, and password.

    Note

    The name of the data source in the development environment must be the same as the name of the data source in the production environment.
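Before you upload a filled-in template, you can check the naming rule in the preceding note with a short script. The sketch below assumes the template has been exported to CSV and that the column headers are `DataSourceName` and `Environment`. The actual template file format and header names are not specified here, so treat both as placeholders.

```python
import csv
import io
from collections import defaultdict

# Environment codes used in the template: 0 = development, 1 = production.
ENVIRONMENTS = {"0": "dev", "1": "prod"}


def unpaired_names(template_csv: str) -> dict:
    """Map each data source name found in only one environment to that environment."""
    seen = defaultdict(set)
    for row in csv.DictReader(io.StringIO(template_csv)):
        seen[row["DataSourceName"]].add(ENVIRONMENTS[row["Environment"]])
    return {name: next(iter(envs)) for name, envs in seen.items() if len(envs) == 1}


# Hypothetical template content; passwords are masked placeholders.
SAMPLE = """DataSourceType,DataSourceName,Description,Environment,JdbcUrl,Username,Password
mysql,orders,Orders DB,0,jdbc:mysql://dev-host:3306/orders,dev_user,***
mysql,orders,Orders DB,1,jdbc:mysql://prod-host:3306/orders,prod_user,***
mysql,billing_dev,Billing DB,0,jdbc:mysql://dev-host:3306/billing,dev_user,***
mysql,billing,Billing DB,1,jdbc:mysql://prod-host:3306/billing,prod_user,***
"""
```

With this sample, `orders` is paired correctly, whereas `billing_dev` and `billing` are each reported because their names do not match across environments. A name that appears in only one environment is not necessarily an error, but it is worth a second look.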

    Add a single data source

    • Data sources in the development environment: You can select such a data source when you create a synchronization task and run the task in the development environment. You cannot commit the task to the production environment for running.

    • Data sources in the production environment: You can use such a data source only in the production environment. You cannot select such a data source when you create a synchronization task.

    View data source information in different environments

    • If the current workspace is in basic mode, only data source information in the production environment is displayed.

    • If the current workspace is in standard mode, data source information in both the development and production environments is displayed. If no data source information is displayed in an environment, you can click Create to configure information in the environment.

    Perform management operations

    Modify and Delete: If a data source is configured in the related environment, Modify and Delete are displayed in the Operation column.

    • Before you delete a data source from the development and production environments, you must check whether the data source is used by a synchronization task in the production environment. The deletion cannot be rolled back. After the data source is deleted, you cannot select it when you configure a synchronization task in the development environment.

      If a synchronization task in the production environment uses the data source, the synchronization task cannot be run after the data source is deleted. Before you delete the data source, we recommend that you configure another data source for the synchronization task and deploy the synchronization task to the production environment or that you undeploy and delete the synchronization task.

    • Before you delete a data source from the development environment, you must check whether the data source is used by a synchronization task in the production environment. The deletion cannot be rolled back. After the data source is deleted, you cannot select it when you configure a synchronization task in the development environment.

      If a synchronization task in the production environment uses the data source, the metadata information cannot be obtained when you modify the synchronization task after the data source is deleted, but the synchronization task can be normally run.

    • Before you delete a data source from the production environment, you must check whether the data source is used by a synchronization task in the production environment. If you select the data source when you configure a synchronization task in the development environment, you cannot commit the synchronization task to the production environment after the data source is deleted.

      If a synchronization task in the production environment uses the data source, the synchronization task cannot be run after the data source is deleted.
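The three deletion cases above can be condensed into a small decision function. This is only a restatement of the rules in the list as a Python sketch. The `deletion_impact` name and the message wording are illustrative.

```python
from typing import List


def deletion_impact(delete_dev: bool, delete_prod: bool) -> List[str]:
    """List the consequences of deleting a data source from the given environments."""
    impacts = []
    if delete_dev:
        impacts.append(
            "the data source can no longer be selected for synchronization tasks in development"
        )
    if delete_dev and not delete_prod:
        impacts.append(
            "production tasks still run, but metadata is unavailable when you modify them"
        )
    if delete_prod and not delete_dev:
        impacts.append(
            "development tasks that use the data source cannot be committed to production"
        )
    if delete_prod:
        impacts.append("production tasks that use the data source cannot run")
    return impacts
```

Calling `deletion_impact(True, True)` covers the first case, `deletion_impact(True, False)` the second, and `deletion_impact(False, True)` the third.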

    Perform batch operations

    Select the data sources that you want to delete and click Batch Delete in the lower part of the Data Sources page to delete them all at once.