To query and analyze data in an efficient manner, you can execute SQL statements. For example, you can execute SQL statements to query data of data sources such as MaxCompute, E-MapReduce (EMR) Hive, and Hologres. This topic describes how to execute SQL statements to query data of data sources.
Data source types that support queries based on SQL statements
The following types of data sources support queries based on SQL statements: MaxCompute, Hologres, EMR Hive, EMR Spark SQL, EMR Impala, EMR Presto, EMR Trino, CDH Hive, StarRocks, ClickHouse, AnalyticDB for MySQL 3.0, AnalyticDB for PostgreSQL, and MySQL.
Prerequisites
A data source is added to a DataWorks workspace. For more information, see Add and manage data sources.
You are granted query permissions on specific data sources in the DataAnalysis service.
Your account is added as a member of the workspace, and the Data Analyst, Model Designer, Develop, O&M, Workspace Administrator, or Workspace Owner role is assigned to the member. For more information, see Add workspace members and assign roles to them.
NoteYou can use the SQL query feature to query only the data of the data sources in a workspace on which you have query permissions. Before you can query data by using the feature, you must contact Workspace Administrator to add your account to the workspace as a member and assign one of the preceding roles to your account.
Precautions
Data source permissions
For a workspace in standard mode, you can be authorized to use only the data sources in the development environment. For a workspace in basic mode, you can be authorized by the administrator to use any data sources.
NoteFor information about the data sources on which the administrator can grant permissions to your account, see supported data source types.
For information about how to request permissions on data sources, see Security Center.
For information about workspace modes, see Differences between workspaces in basic mode and workspaces in standard mode.
Data source query
When you use the SQL query feature to query data of the MaxCompute data source that is automatically generated when you associate a MaxCompute compute engine with your workspace, your logon account is automatically used to access data of the MaxCompute data source. If you want to access data of tables in the production environment, you must specify the name of the project to which the tables belong in SQL commands. If you do not have permissions to query the data of the tables in the production environment, go to Security Center to request the permissions.
Whitelist-based access control for data sources
If whitelist-based access control is enabled for a MaxCompute project, errors may occur when you use features provided by DataAnalysis, such as fee calculation, data download, dimension table usage, and data upload, to perform operations on tables in the MaxCompute project. To ensure that DataAnalysis can access the MaxCompute project, you must add the IP address information about DataAnalysis in a specific region to the IP address whitelist of the MaxCompute project in advance. For more information, see Appendix: IP address whitelist for DataAnalysis.
The upgraded SQL query feature is available. If the new GUI of the SQL query feature displayed to you is inconsistent with the GUI described in this topic or specific functionalities are missing, you can refer to the FAQ about the upgraded SQL query feature section in this topic.
Features
Feature | Description |
You can add the recommended directories or add tables in the data sources on which you have query permissions as a directory. After you add tables as a directory, you can view the SQL files or data tables in the directory. You can also view the schema of a table and perform simple operations to generate SQL statements for the table. | |
You can create an SQL query file, write SQL query statements, and then commit the SQL query statements to a specific data source for execution. | |
After you write SQL query statements, you can execute the SQL query statements to query the data of the data source. | |
After the SQL query statements are executed, you can view the run logs, running results, and SQL statements that correspond to the query results in the Result section. You can also view the query results in line charts, column charts, bar charts, pie charts, and cross tables. | |
You can manage SQL query files in a centralized manner. |
Go to the SQL Query page
You can use one of the following methods to go to the SQL Query page:
On the homepage of DataAnalysis, click the SQL Query card in the Shortcuts section to go to the SQL Query page.
In the left-side navigation pane of the DataAnalysis page, click SQL Query to go to the SQL Query page.
Add a directory
By default, the SQL Query page displays only the My Files and Other People's Files directories. You can click the icon to the right of the search box to add directories based on your business requirements. After you add a directory, you can view SQL files or data tables in the directory. You can also view the schema of a table and perform simple operations to generate SQL statements for the table.
You can add different types of directories on the following tabs:
Recommended Directory: This tab displays the recommended directories that you can add, such as My Favorites, My MaxCompute tables, and Public Tables.
DataMap - Metadata: You can add a compute engine project that is associated with a workspace as a directory. For more information, see Create and manage workspaces and Add and manage data sources.
DataMap - Data Album: On this tab, you can add the data albums that you manage, create, and follow as directories. For more information, see Table management from the business perspective: Data albums.
Others People's Files: stores the SQL query files that are saved by other members. The files can be viewed by all members of the current workspace.
You can add up to eight directories.
Create an SQL query file
You can create SQL query files based on the workspace mode and the compute engine type.
Create an SQL query file.
You can create an SQL query file by using one of the following methods:
Manually create an SQL query file.
In the left-side navigation pane of the SQL Query page, move the pointer over the icon to the right of My Files and select Create File.
The first time you go to the SQL Query page or if no SQL query file is displayed on the page, click Create SQL Query in the right-side area to create an SQL query file.
If the welcome page or an existing SQL query file is opened, click the plus icon to the right of the name of the existing file to create a new SQL query file.
Select a data source for the SQL query file.
Select the desired workspace, the compute engine type, and the data source whose data you want to query.
NoteYou can select only the data sources on which you have query permissions. If no data source of a specific compute engine type is available, contact Workspace Administrator to add your account to the workspace as a member and assign the Data Analyst, Model Developer, Development, O&M, or Workspace Administrator role to your account.
You can modify the resource group that is used to run tasks on the System Management page.
Execute SQL query statements
In the SQL editor, write the code that you want to use to query the data of the data source. Then, click the Run icon in the toolbar. You can view the query results in the Result section.
The SQL query feature allows you to perform the following operations:
Execute SQL statements by clicking the Run icon marked with 1 or the Run icon marked with 2 in the preceding figure. After the SQL statements are executed, you can view the execution results. You can execute SQL statements by using one of the following methods:
Method 1: Select multiple SQL statements and click the icon to execute all the selected statements.
Method 2: Click the Run icon to the left of an SQL statement to execute the statement.
View the code version by clicking the icon marked with 3 in the preceding figure and choosing
. Then, you can view the differences between the automatically saved code and the manually saved code, and select one to save.Estimate costs. When you execute a single SQL statement, the Estimated Costs dialog box appears and displays the estimated costs. You can also click the icon marked with 4 in the preceding figure, and select Estimate Costs from the More drop-down list to view the estimated costs before you execute all SQL statements that you write.
NoteYou can estimate the costs of executing SQL statements only if the compute engine type that you select is MaxCompute.
Format code by clicking the icon marked with 5 in the preceding figure. You can click the Format icon to standardize the SQL statements that you write.
Perform operations on the query results
After the SQL statements are executed, you can perform the operations that are described in the following table on the query results.
Operation | Screenshot | Description |
View the query results | You can click Export to export the query results by using one of the following methods:
| |
Visualize the query results | You can view the query results in different types of charts, such as line charts, column charts, bar charts, and pie charts. You can also save the query results as data cards to build your own visual knowledge base. You can select a data card to quickly create data reports and create personalized products with ease based on your business requirements. | |
View the SQL statements that correspond to the query result | You can view the SQL statements that correspond to the query result. You can also copy and paste the SQL statements to the code editor. |
Manage SQL query files
On the SQL Query page, you can perform the following operations on SQL query files.
Item | Description |
Search for a table | In the area that is marked with 1, you can enter the table name in the search box and click the icon to search for a table. |
Search code | In the area that is marked with 1, you can click the icon and enter a keyword to search for code. |
View the running history of SQL statements | In the area that is marked with 1, you can click the icon to view the running history of SQL statements. |
Add a directory | In the area that is marked with 1, you can click the icon to add a directory. You can add the recommended directories or add tables in the data sources on which you have query permissions as a directory. After you add tables as a directory, you can view the SQL files or data tables in the directory. You can also view the schema of a table and perform simple operations to generate SQL statements for the table. |
My Files | In the area that is marked with 2, the My Files directory stores the SQL query files that you saved. You can right-click a file and select Share to share the file with a specific workspace. : indicates that the file is being shared. All members of the workspace to which the file is shared can view the file in the Other People's Files directory. |
Other People's Files | In the area that is marked with 2, the Other People's Files directory stores the SQL query files that are saved by other members. The files can be viewed by all members of the current workspace. |
My MaxCompute tables | In the area that is marked with 2, you can click the icon to add the My MaxCompute tables directory. The My MaxCompute tables directory stores the MaxCompute tables within your account. Tables in the development and production environments are included. |
My Favorites | In the area that is marked with 2, you can click the icon to add the My Favorites directory. The My Favorites directory stores the tables that are added to favorites. To add a table to your favorites, right-click the table and select Add to Favorites. |
Public Tables | In the area that is marked with 2, you can click the icon to add the Public Tables directory. The Public Tables directory stores the tables that are provided by DataAnalysis. You can use the tables to experience the SQL query feature without the need to apply for required permissions. |
Custom settings
You can click the icon in the lower-left corner of the SQL Query page to go to the Settings tab and customize the settings:
Theme: In this section, you can specify the theme for the SQL query.
Settings for Code Editor: In this section, you can specify the code style and code hints, and specify whether to allow the system to enter the prompted suggestions on code writing when you press Enter.
Settings for SQL-related Operations: In this section, you can specify the default mode in which each SQL statement is executed.