Maui Scheduler and Job submission/monitoring commands
- If you have any experience with PBS/Torque or SGE, you should recognize most of the commands, because Maui uses PBS-like commands.
- The compute nodes are deployed with ROCKS. Ganglia is used for the v4 cluster status display.
How the v4 Cluster is Scheduled
The v4 cluster is configured with three queues you can submit to: v4, v4-64g, and v4dev. The showqlease command shows how many nodes are in each. The scheduler implements backfill, without fairshare and without limits on the number of jobs that can be submitted. The result is roughly first-come, first-served, with the exception that backfill can allow smaller jobs to sneak past multi-node jobs when doing so won't affect the large jobs' start time.
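The backfill rule can be illustrated with a deliberately simplified check. This is only a sketch, not Maui's actual algorithm: the function name can_backfill and its four arguments are hypothetical, and it assumes a small job may start early only if it fits in the currently idle nodes and finishes before the blocked larger job is due to start.

```shell
# Simplified backfill test (illustrative only; not Maui's implementation).
# Arguments (all hypothetical):
#   $1  nodes idle right now
#   $2  seconds until the blocked larger job is due to start
#   $3  nodes the small job requests
#   $4  walltime the small job requests, in seconds
can_backfill() {
    [ "$3" -le "$1" ] && [ "$4" -le "$2" ]
}

# Example: 4 idle nodes, and the large job is due to start in 1 hour.
if can_backfill 4 3600 2 1800; then
    echo "2-node, 30-minute job may start now"
fi
if ! can_backfill 4 3600 2 7200; then
    echo "2-node, 2-hour job must wait"
fi
```

Under this model, a job that requests a short, realistic walltime is more likely to fit into a gap than one that requests far more time than it needs.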
The v4 cluster also schedules per-node, so that there are multiple cores available to each job. This is not true of several private clusters.
In general, no project id has priority ahead of any other on the queues. Within each queue, some groups have purchased rights to exclusive access to certain nodes. We call these leased nodes. If a job's project id is in a lease, then that job will preferentially run on the leased nodes but can spill over into the other, non-leased nodes (called "standard" by the showqlease command). Those jobs still do not have higher priority, but they have access to leased nodes that other jobs do not.
nsub - Job submission. This command submits a job to the scheduler. If the job is submitted successfully, a unique id identifying that job is returned. nsub is a CAC custom wrapper around the Torque qsub command, and you must use it to submit jobs. nsub understands all qsub parameters but adds restrictions to ensure that you have specified a valid account and proper node and walltime specifications. Job properties can only be specified using #PBS directives. Windows and Linux handle nsub directives differently.
$ nsub jobscript.sh (submit jobscript.sh, which contains PBS directives; see the examples)
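As an illustration of the account, node, and walltime directives that nsub checks for, a minimal Linux job script might look like the sketch below. The account "abc12_0001" and the job name are placeholders, the queue v4dev is taken from the queue list above, and the -A, -l, -q, -N, and -j options are standard Torque qsub directives. Whether nsub requires anything beyond these is not stated here.

```shell
#!/bin/sh
# Account (project id) to charge; "abc12_0001" is a placeholder.
#PBS -A abc12_0001
# One node for at most ten minutes of walltime.
#PBS -l nodes=1,walltime=00:10:00
# Queue to submit to (one of v4, v4-64g, v4dev).
#PBS -q v4dev
# Job name (placeholder) and merged stdout/stderr.
#PBS -N example_job
#PBS -j oe

# Torque sets PBS_O_WORKDIR to the submission directory; fall back to "."
# so the script can also be smoke-tested outside the scheduler.
cd "${PBS_O_WORKDIR:-.}" || exit 1

job_host=$(uname -n)
echo "Job running on ${job_host}"
```

Because the #PBS lines are ordinary shell comments, the script can be run directly on a login node as a quick syntax check before it is submitted.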
showq - Display the job queue.
$ showq (display everything)
$ showq -r (show running jobs)
$ showq -u foo42 (show foo42's jobs)
checkjob - Display job information. Users are only able to execute this command on their own jobs. 'checkjob' provides information about the job id and its state.
$ checkjob -A 42 (get dense key-value pair information on job 42)
$ checkjob -v 42 (get verbose information on job 42, including why a job may not be running)
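Submission and monitoring can be combined in a small helper. The sketch below is hypothetical: the function name submit_and_check is invented for illustration, and it assumes that nsub, like qsub, prints the new job id on standard output (for example "42" or "42.servername"), which the text above does not confirm.

```shell
# Hypothetical helper: submit a job script, then show verbose status for it.
# Assumes nsub prints the job id on stdout, possibly followed by ".server".
submit_and_check() {
    jobid=$(nsub "$1") || return 1
    # Keep only the part before the first dot, if there is one.
    jobid=${jobid%%.*}
    checkjob -v "$jobid"
}
```

On a login node it would be called as submit_and_check jobscript.sh.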
canceljob - Cancel Job. This command can be used (amongst other things) to cancel a running job. You can only cancel your own jobs.
$ canceljob 42 (cancel job 42, if you own it)
showbf - Show available resources. This command can be used to help understand what resources are available on the system for immediate use. This can help you to understand when your jobs will run. You simply specify the desired resources for your job and a list of available resources will be returned to you. You must use -A.
$ showbf -u foo42 -A (displays all available resources on all available resource managers and queues)
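To ask about a specific job size rather than everything that is free, the request can be wrapped in a helper. This is a sketch under assumptions: the function name bf_for is hypothetical, and the -n (node count) and -d (duration) options come from the general Maui showbf documentation, so their behavior on this cluster should be verified before relying on them.

```shell
# Hypothetical helper: show resources available right now for a job that
# needs a given number of nodes for a given duration.
#   $1  node count
#   $2  duration as HH:MM:SS
# The -n and -d options are assumed from the general Maui documentation.
bf_for() {
    showbf -A -n "$1" -d "$2"
}
```

For example, bf_for 2 01:00:00 would ask whether two nodes are free for one hour.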
showqlease - CAC-provided command to show summary information on the availability of queues and leases.
showbalance - CAC-provided command to show remaining compute time balance.
Full listing of Maui Commands