Compile your code and create submit scripts on the master machine, mphase, then
submit the scripts to the queue system, which dispatches them to appropriate nodes for execution.
Maximum computation time is five days.
Basic commands to monitor and submit jobs under SGE on the cluster:
1. Show job/queue status - qstat
   (no arguments)  Show currently running/pending jobs
   -f              Show a full listing of all queues
   -j              Show detailed information on a pending/running job
   -u              Show current jobs by user
Each computational node is assigned a unique queue name.
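As a rough illustration of the queue naming, the sketch below assumes that each queue is named after its host with a ".q" suffix, as in the node05.q / node05 pairing used in the bsub example on this page. The helper queue_for_host is hypothetical and not part of SGE; the qstat call is skipped when qstat is not installed.

```shell
#!/bin/sh
# Hypothetical helper (not an SGE command): map a host name to its queue name.
# Assumption: queues follow the "<host>.q" pattern, e.g. node05 -> node05.q.
queue_for_host() {
    printf '%s.q\n' "$1"
}

# List your own jobs, but only where qstat is actually on the PATH.
if command -v qstat >/dev/null 2>&1; then
    qstat -u "${USER:-$(id -un)}"
fi

queue_for_host node05   # prints: node05.q
```

If the cluster names its queues differently, only the printf format in queue_for_host needs to change.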
2. Show job/host status - qhost
   (no arguments)  Show a table of all execution hosts and information about their configuration
   -l attr=val     Show only hosts matching the given attribute
   -j              Show detailed information on pending/running jobs
   -q              Show detailed information on the queues at each host
Submitting jobs on mphase.rutgers.edu using SGE
3. Submit scripts - qsub
   (no arguments)  Accept the script from STDIN (press ^D to end input and submit)
   -cwd            Run the job from the current working directory
   -v VAR          Pass the variable VAR (-V passes all variables)
   -o path         Redirect standard output (default: home directory)
   -e path         Redirect standard error
Example:
qsub -cwd -v SOME_VAR -o /dev/null -e /dev/null myjob.sh
The submit parameters can also be specified inside the script, myjob.sh.
In that case, just run:
qsub myjob.sh
Note that qsub only accepts shell scripts, not executable files.
See also: man qsub for details.
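A minimal sketch of what such a myjob.sh might look like, assuming the standard SGE convention that lines beginning with "#$" carry qsub options. The output file names and the workload are placeholders, not taken from the cluster's configuration.

```shell
#!/bin/sh
# myjob.sh: hypothetical submit script with embedded qsub options.
# qsub reads the "#$" lines; the shell itself treats them as comments.
#$ -S /bin/sh
#$ -cwd
#$ -o myjob.out
#$ -e myjob.err

# Placeholder workload: record and report the host the job landed on.
# Replace this with the program you actually want to run.
JOB_HOST=$(uname -n)
echo "Job running on host: $JOB_HOST"
```

With the options embedded this way, the script can be submitted as "qsub myjob.sh" with no further flags.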
4. Submit executables - bsub
For users accustomed to LSF on the RCG cluster, I wrote simple wrappers,
bsub, bjobs and bkill, that imitate the corresponding LSF commands. Using bsub, you can submit a
binary executable to the SGE queue system.
Example:
bsub -q node05.q a.out
submits your a.out executable to queue node05.q, which is assigned to host node05.
bsub a.out
submits a.out to any available queue.
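Because qsub itself only takes shell scripts, one way to submit a binary without the bsub wrapper is to generate a small script around it. The sketch below is an assumption about how this could be done, not the actual bsub implementation; make_wrapper is a hypothetical helper, and the generated script is piped to qsub on standard input only when qsub is available.

```shell
#!/bin/sh
# make_wrapper EXECUTABLE
# Hypothetical helper: print a tiny submit script that runs EXECUTABLE
# from the current working directory.
make_wrapper() {
    exe=$1
    cat <<EOF
#!/bin/sh
#\$ -S /bin/sh
#\$ -cwd
./$exe
EOF
}

if command -v qsub >/dev/null 2>&1; then
    # With no script argument, qsub reads the script from STDIN.
    make_wrapper a.out | qsub
else
    # No queue system here: just show the script that would be submitted.
    make_wrapper a.out
fi
```

Piping avoids leaving a temporary script file behind; redirecting the output of make_wrapper to a file and running qsub on that file would work equally well.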
5. Status of your running job - bjobs
6. Delete a submitted job from the queue - qdel
Example:
qdel jobID
For running jobs, use the force flag, "-f".
Example:
qdel -f jobID
7. Kill your running or pending job (same as qdel) - bkill
Example:
bkill 666
kills the job with JobID = 666.
8. Manual pages for some SGE commands
For any technical questions, contact the administrator at:
alexei@soemail.rutgers.edu
© 2003 All Rights Reserved.
Rutgers, The State University of New Jersey.