AnalyticDB for MySQL allows you to asynchronously submit an import job.
Scenarios
When you execute the INSERT INTO or INSERT OVERWRITE SELECT statement to import data, the statement is executed synchronously by default. If hundreds of gigabytes of data are imported, the connection between the client and the AnalyticDB for MySQL server may be closed before the statement completes, and the import job fails. When a large amount of data is involved, we recommend that you asynchronously submit import jobs.
Asynchronously submit a job
Syntax
SUBMIT JOB INSERT OVERWRITE table_name
select_statement;
Examples
SUBMIT JOB INSERT OVERWRITE test SELECT * FROM test_external_table;
After you execute the preceding statement, a job ID is returned. You can use the job ID to query the status of the asynchronous job.
+---------------------------------------+
| job_id                                |
+---------------------------------------+
| 2017112122202917203100908203303000715 |
+---------------------------------------+
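The submit-and-capture step can be scripted from any MySQL client. The following sketch assumes a DB-API 2.0 cursor (for example, from pymysql) that is connected to an AnalyticDB for MySQL cluster; the helper name is illustrative and not part of any official SDK.

```python
def submit_import_job(cursor, target_table, select_statement):
    """Submit an asynchronous import job and return its job ID.

    `cursor` is assumed to be a DB-API 2.0 cursor (for example, a
    pymysql cursor) connected to AnalyticDB for MySQL.
    """
    cursor.execute(f"SUBMIT JOB INSERT OVERWRITE {target_table} {select_statement}")
    row = cursor.fetchone()  # single row whose only column is job_id
    return row[0]
```

The returned string is the job ID that you pass to the SHOW JOB STATUS and CANCEL JOB statements.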
Configure priority-based scheduling
AnalyticDB for MySQL V3.1.3.6 and later support priority-based scheduling. You can add the /*+async_job_priority=n*/ hint to the beginning of the job submission statement to specify the priority of an asynchronous job. The default priority value is 1. A greater value specifies a higher scheduling priority for the job.
/*+async_job_priority=10*/ SUBMIT JOB INSERT OVERWRITE test SELECT * FROM test_external_table;
Query the status of an asynchronous job
Syntax
SHOW JOB STATUS WHERE job='job_id';
Examples
SHOW JOB STATUS WHERE job='2017112122202917203100908203303000715';
The following information is returned:
+---------------------------------------+-------------+---------+----------+-----------------------+-----------------------+----------------------------------------------------------------+
| job_id                                | schema_name | status  | fail_msg | create_time           | update_time           | definition                                                     |
+---------------------------------------+-------------+---------+----------+-----------------------+-----------------------+----------------------------------------------------------------+
| 2017112122202917203100908203303000715 | test        | RUNNING | NULL     | 2023-06-21 22:20:31.0 | 2023-06-21 22:20:40.0 | INSERT OVERWRITE test SELECT * FROM test_external_table        |
+---------------------------------------+-------------+---------+----------+-----------------------+-----------------------+----------------------------------------------------------------+
The fields in the returned information vary based on the minor version of the cluster. We recommend that you determine the state of an asynchronous job based on the status field.
Job statuses and mappings
Status in Data Warehouse Edition | Status in Data Lakehouse Edition | Description |
INIT | SUBMITTING | The job is submitted. |
SUBMITTED | RUNNING | The job is running. |
RUNNING | RUNNING | The job is running. |
FINISH | SUCCEEDED | The job is successful. |
FAILED | FAILED | The job fails. |
None | CANCELLED | The job is canceled. |
The CANCELLED status is available only in Data Lakehouse Edition and does not exist in Data Warehouse Edition (V3.0).
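For quick reference, the status mapping can be expressed as a small lookup table, together with a helper that tells whether a job has reached a terminal state. This is an illustrative sketch; the dictionary and function names are not part of any official SDK.

```python
# Data Warehouse Edition status -> Data Lakehouse Edition equivalent,
# taken from the mapping table above.
DWH_TO_LAKEHOUSE = {
    "INIT": "SUBMITTING",
    "SUBMITTED": "RUNNING",
    "RUNNING": "RUNNING",
    "FINISH": "SUCCEEDED",
    "FAILED": "FAILED",
}

# Statuses after which a job will not change state again
# (across both editions).
TERMINAL_STATUSES = {"FINISH", "SUCCEEDED", "FAILED", "CANCELLED"}

def is_terminal(status):
    """Return True if the job has finished, successfully or not."""
    return status.upper() in TERMINAL_STATUSES
```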
Terminate a job
Description
If you terminate a job that has not yet been scheduled, has failed, or has succeeded, the job is removed from the queue.
If you terminate a running job, the data that the job has already imported may be rolled back. We recommend that you manually delete the residual data.
Syntax
CANCEL JOB 'job_id';
Examples
CANCEL JOB '2017112122202917203100908203303000715';
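Putting the statements together, a client can poll SHOW JOB STATUS and fall back to CANCEL JOB if the import takes too long. The sketch below is an illustration under stated assumptions: `run_query` stands for any callable that executes a SQL statement and returns the first result row, and the column position of `status` follows the example output shown earlier.

```python
import time

def wait_or_cancel(run_query, job_id, timeout_s=3600, poll_interval_s=10,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll the job until it finishes; cancel it if the deadline passes.

    `run_query(sql)` is an injected callable that returns the first row
    of the result set, so this loop is independent of any MySQL client.
    """
    deadline = clock() + timeout_s
    while True:
        row = run_query(f"SHOW JOB STATUS WHERE job='{job_id}'")
        status = row[2]  # third column is `status` in the example output
        if status in ("FINISH", "SUCCEEDED"):    # terminal success states
            return status
        if status in ("FAILED", "CANCELLED"):    # terminal failure states
            raise RuntimeError(f"job {job_id} ended in state {status}")
        if clock() >= deadline:
            run_query(f"CANCEL JOB '{job_id}'")  # terminate the running job
            raise TimeoutError(f"job {job_id} cancelled after {timeout_s} seconds")
        sleep(poll_interval_s)
```

If the job is cancelled while it is running, partially imported data may remain and should be deleted manually, as described in the section on terminating a job.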