If your application needs to handle large workloads, you must ensure the stability of job scheduling and the timeliness of key jobs. This topic describes how to manage resources and job priorities based on applications.
Background information
You can use a third-party resource management system such as Mesos or YARN to manage CPU and memory resources, and then use an agent to connect your workers to SchedulerX. SchedulerX itself is a job scheduling platform. It does not manage CPU or memory resources, either directly or through a third-party resource management system. SchedulerX manages only the number of job instances and the priorities of jobs.
Scenarios
Application-based resource and job priority management is suitable for scheduling jobs that involve large-scale workloads or a large amount of data.
For example, an application that runs on a data platform may need to process thousands of reports. The application may be overwhelmed and malfunction without resource management. In addition, you must ensure the timeliness of important reports and generate these reports before a specific time. Therefore, you must prioritize the jobs that are used to generate these reports. SchedulerX allows you to manage resources and job priorities based on applications.
Manage resources
SchedulerX allows you to manage the number of job instances for an application. For example, suppose that you turn on the traffic throttling switch when you create an application and set the number of concurrent job instances to 1. You then create Job A, Job B, and Job C for the application and trigger each job once. While the system is running Job A, Job B and Job C wait in the queue. The system does not drop Job B or Job C.
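The queuing behavior in this example can be illustrated with a small simulation. This is only a sketch of the idea, not SchedulerX code: the class name `FlowControlledApp` and its methods are hypothetical, and the first-in, first-out order of the waiting jobs is an assumption made to keep the example simple.

```python
from collections import deque


class FlowControlledApp:
    """Toy model of an application with a cap on concurrent job instances."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.running = []        # job instances that are currently executing
        self.waiting = deque()   # job instances held back by the cap

    def trigger(self, job):
        # Run the job if a slot is free; otherwise queue it instead of dropping it.
        if len(self.running) < self.max_concurrent:
            self.running.append(job)
        else:
            self.waiting.append(job)

    def finish(self, job):
        # Free the slot and start the next waiting job, if any.
        self.running.remove(job)
        if self.waiting:
            self.running.append(self.waiting.popleft())


app = FlowControlledApp(max_concurrent=1)
for job in ("Job A", "Job B", "Job C"):
    app.trigger(job)

print(app.running)        # ['Job A']
print(list(app.waiting))  # ['Job B', 'Job C']
```

With a limit of 1, only Job A runs. Job B and Job C stay in the waiting queue until a slot is released.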
Manage job priorities
Job priorities take effect only on jobs that belong to the same application. For jobs that belong to the same application and are triggered at the same time, the job with the highest priority is triggered first.
Note: If multiple job instances are created for an application and jobs with different priorities are scheduled to different job instances, the system may trigger a job with a lower priority. SchedulerX uses preemptible queues to avoid this issue and ensure that jobs in a queue with a higher priority are triggered first. For more information, see Preemptible queues.
Manage the job instances of an application
Log on to the SchedulerX console.
In the top navigation bar, select a region.
In the left-side navigation pane, click Applications.
On the Applications page, select the namespace that you want to manage and click Create application.
In the Create application panel, click Advanced Configuration, turn on Flow Control, and then specify Number of concurrent task instances.
Number of concurrent task instances specifies how many job instances of the application can run at the same time. If more job instances are triggered than this value allows, the system adds the excess job instances to the waiting queue instead of dropping them.
| Parameter | Description | Default value |
| --- | --- | --- |
| Application Name | The name of the application that you want to create. | N/A |
| Application ID | The group ID that is used to connect the application to SchedulerX. The application ID must be unique in the namespace. Otherwise, the system fails to create the application. You can also use the value of Application Name as the application ID. | N/A |
| Description | The description of the application. | N/A |
| app type | general app: Select this option if you do not want to deploy the application in a Kubernetes cluster or do not require Kubernetes jobs. k8s App: Select this option if you want to deploy the application in a Kubernetes environment and require Kubernetes jobs. | general app |
| Release | Select a version as required. | Professional edition |
| load5 | The value cannot be greater than the number of CPU cores available on the worker where the agent is deployed. | 0 |
| Memory usage | If the average memory usage within the previous 5 minutes exceeds the threshold that is specified by this parameter, the worker is considered busy. | 90% |
| Disk Usage | If the disk usage exceeds the threshold that is specified by this parameter, the worker is considered busy. | 95% |
| Whether to trigger a busy machine | Select whether to trigger jobs on busy workers. | Enabled |
| Advanced Configuration | | |
| Maximum number of tasks | The maximum number of jobs that are supported by the application. | 1000 |
| Automatic expansion | Specifies whether to enable auto scale-out. If you enable this feature, you must configure the Number of global tasks parameter. | Disabled |
| Flow Control | Specifies whether to enable traffic throttling. If you enable this feature, you must configure the Number of concurrent task instances parameter. | Disabled |
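The dependencies between these parameters can be summarized as a small validation routine. The following is a minimal sketch, not the SchedulerX API: the `AppConfig` class, its field names, and the `validate` function are hypothetical, and only the default values and the rules (load5 must not exceed the number of CPU cores, Flow Control requires Number of concurrent task instances, Automatic expansion requires Number of global tasks) come from the parameter descriptions above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AppConfig:
    # Hypothetical field names; defaults mirror the documented default values.
    name: str
    app_id: str
    load5: float = 0
    memory_usage_pct: int = 90
    disk_usage_pct: int = 95
    trigger_busy_worker: bool = True
    max_tasks: int = 1000
    auto_expansion: bool = False
    global_tasks: Optional[int] = None          # needed when auto_expansion is on
    flow_control: bool = False
    concurrent_instances: Optional[int] = None  # needed when flow_control is on


def validate(cfg: AppConfig, worker_cpu_cores: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if cfg.load5 > worker_cpu_cores:
        errors.append("load5 exceeds the number of CPU cores on the worker")
    if cfg.flow_control and cfg.concurrent_instances is None:
        errors.append("Flow Control requires Number of concurrent task instances")
    if cfg.auto_expansion and cfg.global_tasks is None:
        errors.append("Automatic expansion requires Number of global tasks")
    return errors


cfg = AppConfig(name="report-app", app_id="report-app", flow_control=True)
print(validate(cfg, worker_cpu_cores=4))
# ['Flow Control requires Number of concurrent task instances']
```

Setting `concurrent_instances=1` on the same configuration makes the validation pass.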
Manage the priorities of the jobs for an application
Log on to the SchedulerX console.
In the top navigation bar, select a region.
In the left-side navigation pane, click Jobs.
On the Jobs page, select a namespace and click Create task.
On the Basic configuration wizard page of the Create task panel, specify Priority.
For more information about other parameters and the subsequent steps, see Create a job.
| Parameter | Description |
| --- | --- |
| Task name | The name of the job that you want to create. |
| Description | The description of the job. We recommend that you use simple descriptions to make job management easier. |
| Application ID | The group to which the job belongs. Select a group from the drop-down list. |
| Task type | The programming language that you want to use to create the job. Valid values include java, shell, python, go, http, node.js, xxljob, and dataworks. When you select shell, python, or go, an editor is displayed. You can enter a script in the editor. |
| Execution mode | The mode in which the job is executed. Valid values: |
| | Stand-alone operation: The job is executed on a random worker. |
| | Broadcast run: The job is concurrently executed on all workers, and the system waits until all workers complete the job. |
| | Visual MapReduce: a map model. No more than 300 tasks are allowed in the task list. In Professional Edition, the maximum number of tasks is limited to 1,000, and tasks can be queried by keyword. |
| | MapReduce (Memory Grid/grid computing): a map model. Task execution data is stored in memory for fast processing. The maximum number of tasks is limited to 50,000, and no task list is provided. |
| | MapReduce: a map model. Task execution data is stored in disk files for high throughput. The maximum number of tasks is limited to 1,000,000, and no task list is provided. |
| | Shard run: This mode is similar to the elastic-job model. Shards are evenly distributed to run on multiple agents based on the specified sharding settings. This execution mode supports jobs that use different programming languages. |
| | Note: The advanced settings of a job vary based on the execution mode of the job. |
| Priority | The priority of the job. Jobs with a higher priority are triggered first. |
| Task parameters | An arbitrary string that can be obtained from the context when SchedulerX runs the job. |
| Advanced Configuration | |
| Task failure retry count | Default value: 0. |
| Task failure retry interval | Default value: 30. Unit: seconds. |
| Task concurrency | The number of instances that are running the same job at the same time. |
| Cleaning strategy | The policy that is used to clean up the job execution history. |
| Retained Number | The number of job execution history records to retain. |
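The Shard run mode states that shards are evenly distributed across agents. One simple way to picture such a distribution is round-robin assignment, sketched below. This is an illustration only: the function `assign_shards` and the worker names are hypothetical, and round-robin is an assumed strategy, not necessarily the one that SchedulerX uses.

```python
def assign_shards(shard_ids, workers):
    """Spread shards across workers so that per-worker counts differ by at most one."""
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    for index, shard in enumerate(shard_ids):
        # Round-robin: shard i goes to worker i modulo the number of workers.
        worker = workers[index % len(workers)]
        assignment[worker].append(shard)
    return assignment


plan = assign_shards(list(range(7)), ["worker-1", "worker-2", "worker-3"])
for worker, shards in plan.items():
    print(worker, shards)
# worker-1 [0, 3, 6]
# worker-2 [1, 4]
# worker-3 [2, 5]
```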
Preemptible queues
If multiple job instances are created for an application and jobs with different priorities are scheduled to different job instances, the system may trigger a job with a lower priority. SchedulerX uses preemptible queues to avoid this issue and ensure that jobs in a queue with a higher priority are triggered first.
Create a sample application, turn on Flow Control, and then set Number of concurrent task instances to 1. For more information, see Manage the job instances of an application.
Create three jobs for the application and set the priorities of the jobs to high, medium, and low. For more information, see Manage the priorities of the jobs for an application.
On the Jobs page of the application, click Run once in the Operation column for the following jobs in sequence: Medium-priority job, Low-priority job, and High-priority job.
View the execution results.
When the system triggers the job with the medium priority, the queue is empty. Therefore, the system executes the job with the medium priority.
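After the system completes the job with the medium priority, the job with the high priority preempts the job with the low priority, even though the low-priority job was triggered earlier. The high-priority job therefore runs before the low-priority job.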
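The outcome of this walkthrough can be reproduced with a small priority-queue simulation. This is a sketch under stated assumptions, not SchedulerX internals: the class `PreemptibleQueueApp`, the `PRIORITY` mapping, and the job names are hypothetical, and the waiting queue is modeled with Python's `heapq` so that the highest-priority waiting job starts first.

```python
import heapq
import itertools

# Hypothetical mapping: a smaller number means the job leaves the queue earlier.
PRIORITY = {"high": 0, "medium": 1, "low": 2}


class PreemptibleQueueApp:
    """Toy model: concurrency cap plus a waiting queue ordered by priority."""

    def __init__(self, max_concurrent=1):
        self.max_concurrent = max_concurrent
        self.running = []
        self.waiting = []                 # heap of (priority, arrival, job)
        self.arrival = itertools.count()  # tie-breaker: earlier arrival first
        self.executed = []                # order in which jobs actually start

    def trigger(self, job, priority):
        if len(self.running) < self.max_concurrent:
            self._start(job)
        else:
            heapq.heappush(self.waiting, (PRIORITY[priority], next(self.arrival), job))

    def finish(self, job):
        self.running.remove(job)
        if self.waiting:
            _, _, next_job = heapq.heappop(self.waiting)
            self._start(next_job)

    def _start(self, job):
        self.running.append(job)
        self.executed.append(job)


app = PreemptibleQueueApp(max_concurrent=1)
app.trigger("medium-job", "medium")  # queue is empty, so this job starts at once
app.trigger("low-job", "low")        # waits
app.trigger("high-job", "high")      # waits, but ahead of low-job

while app.running:
    app.finish(app.running[0])

print(app.executed)  # ['medium-job', 'high-job', 'low-job']
```

The medium-priority job runs first only because the queue was empty when it was triggered. Among the waiting jobs, priority decides the order rather than trigger time.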