You incur costs when you use MaxCompute for development. If your daily spending is usually stable but suddenly doubles during a period, analyze your costs. Identify which projects and jobs caused the spike. Then optimize or adjust those jobs to reduce costs. This topic explains how to use billing details and the MaxCompute global meta service (Information Schema) to troubleshoot sudden spikes in MaxCompute pay-as-you-go costs.
Background information
This topic uses the following overall approach to troubleshoot sudden spikes in MaxCompute pay-as-you-go costs:
Use the Fee Hub to confirm the date when costs exceeded expectations. Then identify which billing item in which project incurred high costs that day.
Analyze the root cause of the cost increase:
If computing costs exceeded expectations, use the TASKS_HISTORY view in Information Schema to count job volume and list top-cost jobs.
If storage fees exceeded expectations, download usage details to analyze changes in storage fees.
If public network downstream traffic fees exceeded expectations, use the TUNNELS_HISTORY view in Information Schema to track changes in public network downstream traffic fees.
Step 1: Enable the Information Schema service
Starting March 1, 2024, MaxCompute no longer automatically installs the project-level Information Schema package for new projects. New projects do not have the project-level Information Schema package by default. If your business requires metadata queries, use the tenant-level Information Schema instead. It provides more complete information. For instructions on using tenant-level Information Schema, see Tenant-level Information Schema.
For existing MaxCompute projects, the Information Schema package must be installed before you can use the service. Only the project owner or a Resource Access Management (RAM) user with the Super_Administrator role can install it; this role grants access to project metadata. For more information about assigning management roles to users, see Assign roles to users.
You can install the package in two ways. For more information about Information Schema features and limits, see Information Schema overview.
Method 1: Log on to the MaxCompute client and run the following command:

install package Information_Schema.systables;

Method 2: Log on to the DataWorks console and go to the Ad Hoc Query interface. For more information about ad hoc queries, see Run SQL statements using ad hoc queries (optional). Run the following command:

install package Information_Schema.systables;
To analyze metadata from multiple MaxCompute projects, install the Information Schema package for each project. Then insert the backup metadata from each project into the same table for centralized analysis.
We recommend using tenant-level Information Schema. It lets you query metadata and usage history for all your projects.
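As a hedged sketch of a tenant-level query (assuming the tenant-level views are exposed under the SYSTEM_CATALOG schema, as described in Tenant-level Information Schema), a cross-project job query might look like this:

```sql
-- Query completed jobs across all projects in the tenant for one day.
-- SYSTEM_CATALOG.INFORMATION_SCHEMA is assumed here; confirm the exact
-- schema name in the Tenant-level Information Schema documentation.
SELECT  task_catalog      -- project name
       ,inst_id
       ,input_bytes
       ,complexity
FROM system_catalog.information_schema.tasks_history
WHERE ds = '<date partition to query>'
ORDER BY input_bytes DESC
LIMIT 100;
```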
(Optional) Step 2: Grant permissions to users other than the project owner
The Information Schema views contain all user data at the project level. By default, the project owner can view this data. If other users or roles in the project need to view it, you must grant them permissions. For more information, see Access resources across projects based on packages.
The syntax for granting permissions is as follows:

grant <actions> on package Information_Schema.systables to user <user_name>;
grant <actions> on package Information_Schema.systables to role <role_name>;

Parameter description:
actions: the permission to grant. The value must be Read.
user_name: the Alibaba Cloud account or RAM user that has been added to the project. Run the list users; command in the MaxCompute client to obtain user accounts.
role_name: the role that has been added to the project. Run the list roles; command in the MaxCompute client to obtain role names.
The following example shows how to grant permissions.
grant read on package Information_Schema.systables to user RAM$Bob@aliyun.com:user01;

(Optional) Step 3: Download and back up metadata
MaxCompute retains job history for completed jobs for the last 14 days. If you often query data older than 14 days, back up metadata to your project on a schedule. Skip this step if you only run ad hoc queries for job history within the last 14 days.
Log on to the MaxCompute client. Run a CREATE TABLE statement to create the tasks_history and tunnels_history backup tables in your project.
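The exact DDL is not shown in this topic. The following is a sketch of a tasks_history backup table inferred from the columns that the backup job selects; the column types are assumptions and should be verified against the TASKS_HISTORY view in your project. Create the tunnels_history backup table the same way, using the columns of the TUNNELS_HISTORY view.

```sql
-- Sketch of a backup table for TASKS_HISTORY. Column names match the
-- INSERT statement used by the history_backup node; the types are
-- assumptions, so check them against DESC information_schema.tasks_history.
-- ds is kept as a regular column because the backup INSERT selects it
-- directly without a PARTITION clause.
CREATE TABLE IF NOT EXISTS tasks_history (
    task_catalog     STRING,
    task_schema      STRING,
    task_name        STRING,
    task_type        STRING,
    inst_id          STRING,
    `status`         STRING,
    owner_id         STRING,
    owner_name       STRING,
    result           STRING,
    start_time       DATETIME,
    end_time         DATETIME,
    input_records    BIGINT,
    output_records   BIGINT,
    input_bytes      BIGINT,
    output_bytes     BIGINT,
    input_tables     STRING,
    output_tables    STRING,
    operation_text   STRING,
    signature        STRING,
    complexity       DOUBLE,
    cost_cpu         DOUBLE,
    cost_mem         DOUBLE,
    settings         STRING,
    ds               STRING
);
```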
In the DataWorks Data Studio interface, create an ODPS SQL node named history_backup. Configure scheduled runs that write data to the tasks_history and tunnels_history backup tables. Click the save icon in the upper-left corner to save the node.
Here is an example of the ODPS SQL node command:

-- project_name is the MaxCompute project name.
USE <project_name>;
-- Back up tasks_history.
INSERT INTO TABLE <project_name>.tasks_history
SELECT  task_catalog, task_schema, task_name, task_type, inst_id, `status`, owner_id, owner_name, result,
        start_time, end_time, input_records, output_records, input_bytes, output_bytes,
        input_tables, output_tables, operation_text, signature, complexity, cost_cpu, cost_mem, settings, ds
FROM information_schema.tasks_history
WHERE ds = '${datetime1}';
-- Back up tunnels_history.
INSERT INTO TABLE <project_name>.tunnels_history
SELECT  tunnel_catalog, tunnel_schema, session_id, operate_type, tunnel_type, request_id, object_name,
        partition_spec, data_size, block_id, offset, length, owner_id, owner_name, start_time, end_time,
        client_ip, user_agent, object_type, columns, ds
FROM information_schema.tunnels_history
WHERE ds = '${datetime1}';

The ${datetime1} parameter is a DataWorks scheduling parameter. In the ODPS SQL node, click Scheduling Configuration on the right. In the Basic Properties section, set Parameters to datetime1=${yyyymmdd}.
Note: To analyze metadata from multiple MaxCompute projects, create multiple ODPS SQL nodes and write the metadata from all projects to the same backup tables.
Step 4: Analyze high-cost projects and billing items in the Fee Hub
Log on to the Fee Hub. Use the following methods to analyze high-cost projects and billing items. For more information, see View billing details.
In the navigation pane on the left, open the bill details page. Set Statistic Item to Instance and Statistical Period to Day.
Method 1: Find the project (instance ID) with high costs. If one project clearly exceeds expectations, analyze that project.
Method 2: Find the high-cost billing item. Search for a specific project by Resource Instance Name/ID. Then analyze the high-cost billing items for that project.
In the navigation pane on the left, choose Cost Analysis. On the Cost Analysis page, you can also find billing items with rising costs.
Step 5: Identify the root cause of the cost increase
Analyze the high-cost projects and billing items to find the root cause.
SQL job computing costs exceed expectations
Overall SQL job costs are high (including foreign table jobs). This may be due to a single abnormal job, repeated execution, or improper scheduling configuration.
Query the instance ID (inst_id) of the high-cost job to view its execution details.
Log on to the MaxCompute client. Run the use command to switch to the high-cost project identified in Step 4. Query job costs from TASKS_HISTORY. Example:

-- Enable the ODPS 2.0 data type edition. For details, see ODPS 2.0 data types.
SET odps.sql.decimal.odps2=true;
SELECT  inst_id        -- instance ID
       ,input_bytes    -- input data volume
       ,complexity     -- SQL complexity
       ,CAST(input_bytes/1024/1024/1024 * complexity * 0.3 AS DECIMAL(18,5)) AS cost_sum  -- 0.3 is the unit price used in this example; confirm the rate on your bill
       ,GET_JSON_OBJECT(settings, "$.SKYNET_ID") AS SKYNET_ID  -- DataWorks scheduling job ID
FROM information_schema.tasks_history
-- To query metadata older than 14 days, query the backup table created in Step 3 instead: <project_name>.tasks_history.
WHERE (task_type = 'SQL' OR task_type = 'SQLRT')
AND ds = '<date partition to query>'
ORDER BY cost_sum DESC
LIMIT 10000;

Note: SQL computing cost = Input data volume × SQL complexity × Unit price (USD 0.0438 per GB).
In the example, task_type = 'SQL' indicates an SQL job, and task_type = 'SQLRT' indicates an SQL query acceleration job.
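To sanity-check the cost formula, here is a worked example using the USD unit price from the note (0.0438 per GB; use the rate on your own bill):

```sql
-- A job that scans 10 GB of input with SQL complexity 1.5:
-- 10 GB × 1.5 × 0.0438 USD/GB = 0.657 USD
SELECT CAST(10 * 1.5 * 0.0438 AS DECIMAL(18,5)) AS cost_sum;
```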
Check the SKYNET_ID (DataWorks scheduling job ID) of the high-cost SQL job:
If the job has a SKYNET_ID, check the node execution details in DataWorks.
If it does not, the job was not triggered by a DataWorks scheduling node. Use the inst_id to view execution details. Example:

SELECT operation_text
FROM information_schema.tasks_history
WHERE ds = '<date partition of job execution>'
AND inst_id = '<inst_id>';
Find duplicate jobs that ran many times.
Log on to the MaxCompute client. Run the use command to switch to the high-cost project identified in Step 4. Query duplicate jobs from TASKS_HISTORY. Example:

-- Analyze job growth trends.
SELECT  signature
       ,ds
       ,COUNT(*) AS tasknum
FROM information_schema.tasks_history
-- To query metadata older than 14 days, query the backup table created in Step 3 instead: <project_name>.tasks_history.
WHERE (task_type = 'SQL' OR task_type = 'SQLRT')
AND ds >= '<date partition to query>'
GROUP BY ds, signature
ORDER BY tasknum DESC
LIMIT 10000;

-- Confirm the abnormal signature and check recent executions of the corresponding SQL job.
SELECT *
FROM information_schema.tasks_history
-- To query metadata older than 14 days, query the backup table created in Step 3 instead: <project_name>.tasks_history.
WHERE signature = '<abnormal signature>'
AND ds >= '<date partition to query>';
Spark computing costs exceed expectations
Spark jobs have high overall consumption. You can query the inst_id of jobs with abnormal consumption to view their specific execution information.
Log on to the MaxCompute client. Run the use command to switch to the high-cost project identified in Step 4. Query job costs from TASKS_HISTORY. Example:

-- Enable the ODPS 2.0 data type edition. For details, see ODPS 2.0 data types.
SET odps.sql.decimal.odps2=true;
SELECT  inst_id   -- instance ID
       ,cost_cpu  -- job CPU consumption. A value of 100 equals 1 core × second; for example, 10 cores running for 5 seconds gives cost_cpu = 10 × 100 × 5 = 5000.
       ,CAST(cost_cpu/100/3600 * 0.36 AS DECIMAL(18,5)) AS cost_sum  -- 0.36 is the unit price used in this example; confirm the rate on your bill
FROM information_schema.tasks_history
-- To query metadata older than 14 days, query the backup table created in Step 3 instead: <project_name>.tasks_history.
WHERE task_type = 'CUPID'
AND status = 'Terminated'
AND ds = '<date partition to query>'
ORDER BY cost_sum DESC
LIMIT 10000;

Note: Spark job computing cost for the day = Total billable hours × Unit price (USD 0.1041 per hour per task).
In the example, task_type = 'CUPID' indicates a Spark job.
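As a sanity check on the cost_cpu conversion, here is a worked example; the USD 0.1041 rate is taken from the note above, so confirm it against your bill:

```sql
-- A Spark job using 4 cores for 1,800 seconds:
-- cost_cpu = 4 × 100 × 1800 = 720000,
-- which is 720000/100/3600 = 2 billable core-hours.
SELECT  CAST(720000/100/3600 AS DECIMAL(18,5)) AS billable_hours
       ,CAST(720000/100/3600 * 0.1041 AS DECIMAL(18,5)) AS cost_usd;
```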
Use the inst_id to view execution details. Example:

SELECT operation_text
FROM information_schema.tasks_history
WHERE ds = '<date partition of job execution>'
AND inst_id = '<inst_id>';
MapReduce job costs exceed expectations
MapReduce jobs have high overall consumption. You can query the inst_id of jobs with abnormal consumption to view detailed execution information.
Log on to the MaxCompute client. Run the use command to switch to the high-cost project identified in Step 4. Query job costs from TASKS_HISTORY. Example:

-- Enable the ODPS 2.0 data type edition. For details, see ODPS 2.0 data types.
SET odps.sql.decimal.odps2=true;
SELECT  inst_id   -- instance ID
       ,cost_cpu  -- job CPU consumption. A value of 100 equals 1 core × second; for example, 10 cores running for 5 seconds gives cost_cpu = 10 × 100 × 5 = 5000.
       ,CAST(cost_cpu/100/3600 * 0.36 AS DECIMAL(18,5)) AS cost_sum  -- 0.36 is the unit price used in this example; confirm the rate on your bill
FROM information_schema.tasks_history
-- To query metadata older than 14 days, query the backup table created in Step 3 instead: <project_name>.tasks_history.
WHERE task_type = 'LOT'
AND status = 'Terminated'
AND ds = '<date partition to query>'
ORDER BY cost_sum DESC
LIMIT 10000;

In the example, task_type = 'LOT' indicates a MapReduce job.
Use the inst_id to view execution details. Example:

SELECT operation_text
FROM information_schema.tasks_history
WHERE ds = '<date partition of job execution>'
AND inst_id = '<inst_id>';
Storage fees exceed expectations
Use usage details to analyze storage fees:
Download usage details. For details, see Download usage details.
Upload usage details. For details, see Upload usage details data to MaxCompute.
Analyze usage details using SQL. For details, see Analyze MaxCompute billing usage details.
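Once the usage details are uploaded, a per-project storage trend query might look like the following sketch. The table and column names here are hypothetical placeholders; map them to the actual schema described in Analyze MaxCompute billing usage details.

```sql
-- Hypothetical sketch: aggregate daily storage usage per project from
-- uploaded usage details. maxcompute_usage, project_id, usage_type, and
-- usage_amount are placeholder names; substitute the names you used
-- when uploading the data.
SELECT  ds
       ,project_id
       ,SUM(usage_amount) AS storage_usage
FROM maxcompute_usage
WHERE usage_type = 'Storage'
GROUP BY ds, project_id
ORDER BY ds, storage_usage DESC;
```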
Public network downstream traffic fees exceed expectations
Overall public network downstream traffic fees are high. Use the following steps to find which time period and category had high download fees.
Log on to the MaxCompute client. Run the use command to switch to the high-cost project identified in Step 4. Query download fees from TUNNELS_HISTORY. Example:

-- Enable the ODPS 2.0 data type edition. For details, see ODPS 2.0 data types.
SET odps.sql.decimal.odps2=true;
SELECT  ds
       ,operate_type
       ,SUM(CAST(data_size/1024/1024/1024 * 0.8 AS DECIMAL(18,5))) AS download_fee  -- 0.8 is the unit price used in this example; confirm the rate on your bill
FROM information_schema.tunnels_history
WHERE (operate_type = 'DOWNLOADLOG' OR operate_type = 'DOWNLOADINSTANCELOG')
AND ds >= '<date partition to query>'
GROUP BY ds, operate_type
ORDER BY download_fee DESC;

Note: Download fee = Downloaded data volume × Unit price (USD 0.1166 per GB).
You can analyze download fee trends over time from the query results. You can also run the tunnel show history command to view detailed history. For the full command reference, see Tunnel commands.
More information
For more articles about cost optimization, see Cost optimization overview.