From Cornell CAC Documentation
- 10 compute nodes (compute-0-[0-9]) and one head node (classe.cac.cornell.edu), running Red Hat 5.1 Linux.
- Monitor it with Ganglia
- Runs the Maui scheduler and the Torque resource manager.
- Submit help requests: use help, or send email to email@example.com
The scheduler has a single queue, named default. It allows jobs to share nodes and allocates resources per core. Jobs are spread across the nodes first: the first 10 jobs each run on a separate node, and once 10 more jobs have started, each node is running 2 jobs.
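The spread-first behaviour described above can be illustrated with a small simulation. This is only a sketch, not the scheduler's actual algorithm: it assumes every job requests one core, no job finishes, and ties are broken by node order. The function name place_jobs is hypothetical.

```python
from collections import Counter

# Node names follow the compute-0-[0-9] pattern listed above.
NODES = [f"compute-0-{i}" for i in range(10)]


def place_jobs(num_jobs, nodes=NODES):
    """Assign each job to the least-loaded node (spread first).

    Assumption: single-core jobs that never finish, and nodes with
    enough free cores. Ties go to the earliest node in the list.
    """
    load = Counter({node: 0 for node in nodes})
    placement = []
    for _ in range(num_jobs):
        # min() returns the first node with the lowest load
        target = min(nodes, key=lambda n: load[n])
        load[target] += 1
        placement.append(target)
    return placement, dict(load)


placement, load = place_jobs(20)
print(placement[:10])  # nodes used by the first 10 jobs
print(load)            # jobs per node after 20 jobs
```

Under these assumptions the first 10 jobs land on 10 different nodes, and after 20 jobs every node holds 2.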
- qsub - submit jobs
- showq - see running jobs
- canceljob - cancel jobs
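As a rough illustration of what gets submitted with qsub, the sketch below generates a minimal Torque job script. The #PBS directives and $PBS_O_WORKDIR are standard Torque features, but the helper name make_job_script, the default job name, the walltime, and the example command are assumptions for illustration, not site requirements.

```python
def make_job_script(command, job_name="example", queue="default",
                    nodes=1, ppn=1, walltime="00:10:00"):
    """Return the text of a minimal Torque job script.

    Assumption: one node and one core per job, submitted to the
    'default' queue described above. Adjust the values as needed.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name
        f"#PBS -q {queue}",                  # target queue
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and cores per node
        f"#PBS -l walltime={walltime}",      # requested run time
        "cd $PBS_O_WORKDIR",                 # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


print(make_job_script("./my_program"))
```

Saving the output to a file such as job.sh (a hypothetical name) and running qsub job.sh on the head node would submit it. showq then lists the job, and canceljob <jobid> removes it.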
How is My Job?
Given your job ID, try qstat -n <jobid> or checkjob <jobid> to see which node your job is running on. Then open Ganglia, find that node, and click on it for more detailed information.
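The lookup above could be scripted along these lines. This is a hedged sketch: the helper names are hypothetical, the output is returned unparsed because its exact layout is not described here, and commands that are not installed are skipped.

```python
import shutil
import subprocess


def job_location_commands(job_id):
    """Build the two commands mentioned above for a given job ID."""
    job_id = str(job_id)
    return [["qstat", "-n", job_id], ["checkjob", job_id]]


def run_available(commands):
    """Run each command that exists on PATH and collect its raw output."""
    outputs = {}
    for argv in commands:
        if shutil.which(argv[0]) is None:
            continue  # command not installed on this machine
        result = subprocess.run(argv, capture_output=True, text=True)
        outputs[" ".join(argv)] = result.stdout
    return outputs


# Example with a made-up job ID; prints nothing off the cluster.
for cmd, text in run_available(job_location_commands(1234)).items():
    print(cmd)
    print(text)
```

The node name has to be read from the printed output by eye and then looked up in Ganglia.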