A job is the basic computing unit of Elastic High Performance Computing (E-HPC). A job consists of a shell script and executable files. Jobs are run in a sequence that is determined by the specified queues and scheduler. In the E-HPC console, you can submit a job, stop a job, or view the status of a job. This topic describes how to use the E-HPC console to submit a job.
Prerequisites
The cluster and cluster nodes are in the Running state.
A user is created. For more information, see Create a user.
Job files are ready to be imported. E-HPC allows you to import job files by using one of the following methods:
Before you submit a job, log on to the cluster and import job files by using remote transmission solutions, such as rsync and the secure copy protocol (SCP).
When you submit a job, import the job files stored in an Object Storage Service (OSS) bucket.
When you submit a job, import the job files stored in your local directory or select newly created job files.
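For the first method, the remote transfer might look like the following sketch. The username, logon node address, and paths are placeholders, not values from your cluster:

```shell
# Hypothetical example: copy local job files to the cluster logon node.
# Replace test, 203.0.113.10, and the paths with your own values.

# Copy a single job script with SCP
scp ./job.pbs test@203.0.113.10:/home/test/

# Sync a whole input directory with rsync (transfers only changed files)
rsync -avz ./job_inputs/ test@203.0.113.10:/home/test/job_inputs/
```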
Procedure
Log on to the E-HPC console.
In the upper-left corner of the top navigation bar, select a region.
In the left-side navigation pane, choose Job and Performance Management > Job.
On the Job page, select a cluster from the Cluster drop-down list.
Click the Submit Job tab.
On the Submit Job tab, configure the required parameters. The following table describes key parameters.
Parameter
Description
Job Template
The configured template based on which a job is submitted. For more information, see Manage a job template.
Job Name
The name of the job. If you want the job files to be automatically downloaded and decompressed, the name of the job must be the same as the name of the job files.
Command Line
The job execution command that you want to submit to the scheduler. You can enter a command or the path of a script file. This parameter is set differently in the following scenarios:
If the script file is executable, enter its relative path, for example, ./job.pbs.
If the script file is not executable, enter the execution command, for example, /opt/mpi/bin/mpirun /home/test/job.pbs. If your scheduler is PBS, add two hyphens (--) before the command, for example, --/opt/mpi/bin/mpirun /home/test/job.pbs.
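For reference, an executable script such as job.pbs in the examples above is typically a shell script that starts with scheduler directives. The following is a minimal PBS sketch; the job name, resource values, paths, and application command are illustrative assumptions, not values from your cluster:

```shell
#!/bin/bash
#PBS -N test_job               # job name (illustrative)
#PBS -l nodes=1:ppn=4          # 1 compute node, 4 processes per node
#PBS -l walltime=01:00:00      # maximum run time
#PBS -o /home/test/job.out     # stdout redirect path
#PBS -e /home/test/job.err     # stderr redirect path

cd "$PBS_O_WORKDIR"            # directory from which the job was submitted
/opt/mpi/bin/mpirun ./my_app   # hypothetical application command
```

The directives in the script correspond to the parameters described in this table; values set in the console take effect when the job is submitted.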
Queue
The queue to which the job is submitted. If you added compute nodes to a queue when you created the cluster, you must submit the job to a queue in which compute nodes reside. Otherwise, the job fails to run. If you did not add compute nodes to a queue, the job is submitted to the default queue of the scheduler.
Task Quantity
The number of compute nodes that are used to run the job.
Number of Tasks
The number of tasks used by each compute node to run the job, that is, the number of processes.
Maximum Memory
The maximum memory that can be used when a compute node runs the job. If you do not specify this parameter, the memory is unlimited.
Maximum Run Time
The maximum running time of the job. If the actual running time exceeds the maximum running time, the job fails. If you do not specify this parameter, the running time is unlimited.
Thread Quantity
The number of threads that are used by a task. If you do not specify this parameter, the number of threads is 1.
GPU Quantity
The number of GPUs that are used when a compute node runs the job. If you specify this parameter, make sure that the compute node is a GPU-accelerated instance.
Priority
The priority of the job. Valid values: 0 to 9. A greater value indicates a higher priority. If you specify that jobs are scheduled by job priority when you set the cluster scheduling policy, jobs with a higher priority are scheduled and run first.
You can set a high priority for the jobs that you want to run first.
Enable Job Array
Specifies whether to enable the job array feature of the scheduler. A job array is a collection of similar independent jobs. You can set a job array to customize a job execution rule.
Format: X-Y:Z. X is the minimum index value. Y is the maximum index value. Z is the step size. For example, 2-7:2 indicates that three jobs need to be run and their index values are 2, 4, and 6. Default value of Z: 1.
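On a PBS cluster, the X-Y:Z setting corresponds to an array directive, and each sub-job can read its own index from an environment variable. The sketch below assumes PBS Professional, where the variable is PBS_ARRAY_INDEX; other schedulers use different directive syntax and variable names:

```shell
#!/bin/bash
#PBS -J 2-7:2   # array with indexes 2, 4, and 6

# Hypothetical sketch: each sub-job processes the input file
# that matches its own array index.
echo "Processing input_${PBS_ARRAY_INDEX}.dat"
```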
Post-Processing Command
The command that is used to perform subsequent operations on the running results of the job, for example, packaging or uploading of the generated job data to an OSS bucket.
Stdout Redirect Path
The path of the output file to which stdout (standard output) is redirected by using a Linux shell. The path contains the output file name.
Cluster users must have write permissions on the path. By default, output files are generated based on the scheduler settings.
Stderr Redirect Path
The path of the output file to which stderr (standard error) is redirected by using a Linux shell. The path contains the output file name.
Cluster users must have write permissions on the path. By default, output files are generated based on the scheduler settings.
Variables
The runtime variables passed to the job. They can be accessed by using environment variables in the executable file.
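For example, if you pass a variable named INPUT_FILE through the Variables parameter (INPUT_FILE and the default path below are hypothetical names, not fixed E-HPC variables), the job script can read it as an ordinary environment variable:

```shell
# Hypothetical sketch: INPUT_FILE is assumed to be passed via the
# Variables parameter; fall back to a default path if it is unset.
INPUT_FILE="${INPUT_FILE:-/home/test/default.dat}"
echo "Processing ${INPUT_FILE}"
```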
Upload the job files to the cluster.
Use the job files that are stored in an OSS bucket
E-HPC allows you to import job files from an OSS bucket before you submit a job. You can also specify job files that are stored in an OSS bucket when you submit a job in the E-HPC console. For more information, see Import job files from an OSS bucket to a cluster. To specify job files that are stored in an OSS bucket when you submit a job in the E-HPC console, perform the following steps:
On the Use OSS Job file tab, click Select File. In the Select File dialog box, select job files and click OK.
If you want to specify ZIP, TAR, or GZIP job files, you must turn on Decompression and specify a command to decompress them.
Note: After you select job files from an OSS bucket, a folder that has the same name as the job files (for example, JobName) is automatically created in the /home/user directory. The job files are then downloaded and, if necessary, decompressed to the /home/user/JobName directory.
Edit job files
Click the Edit Job Files tab.
On the Edit Job Files tab, click Cluster File Browser. In the dialog box that appears, enter the cluster username and password to log on to the cluster by using Workbench. You can create, edit, or delete job files based on your needs.
Click Submit Job in the upper-right corner of the Submit Job tab. In the dialog box that appears, enter the cluster username and password. The job is submitted to the cluster. Then, E-HPC runs the job.
Results
After you submit a job, you can view it on the Job page.
Find the job and click Details in the Actions column. In the Job Details panel, you can view the job details, including the job name, job ID, start time, the time when the job was last updated, and job running information.
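If you prefer the command line, you can also log on to the cluster and query the same job through the scheduler. On a PBS cluster this might look like the following; the job ID is a placeholder:

```shell
# List all jobs in the queue
qstat

# Show full details for one job (0.scheduler is a placeholder job ID)
qstat -f 0.scheduler
```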