General PBS information
To run a job efficiently with PBS, a few basic steps are required:
- create a job script that requests the resources needed (number of CPUs, execution walltime) and runs the shell commands/scripts that prepare the execution (e.g. cd to a scratch directory)
- submit the script file to PBS
- monitor the job
The most commonly used PBS options are listed below. Each can be included in a job script when preceded by the #PBS keyword. Options marked in red are mandatory.
| option | description |
|---|---|
| #PBS -N MyJobName | Sets the job's name. The default is the script name |
| #PBS -l MyResources | Requests resources such as CPUs, nodes and walltime |
| #PBS -q MyQueue | Selects the queue |
| #PBS -o MyPath/My.out | Selects the path of the standard output. The default is the directory from which the job is submitted |
| #PBS -e MyPath/My.err | Selects the path of the standard error. The default is the directory from which the job is submitted |
| #PBS -j oe | Merges standard error into standard output (use eo to merge standard output into standard error). The merged file is written to the path given with -o (or -e for eo) |
| #PBS -m b | Sends mail to the user when the job begins |
| #PBS -m e | Sends mail to the user when the job ends |
| #PBS -m a | Sends mail to the user when the job is aborted by the batch system |
| #PBS -V | Exports all environment variables to the job |
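As a sketch of how the less obvious options from the table look in a script header, the fragment below requests a walltime, merges the two output streams and asks for mail. The walltime value, the log file name `myjob.log` and the `job_banner` helper are illustrative assumptions; `PBS_JOBID` is the standard variable PBS sets inside a running job. The letters accepted by `-m` can be combined in a single directive.

```shell
#!/bin/bash
#PBS -N myjob
#PBS -q smp
#PBS -l ncpus=4
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -o myjob.log
#PBS -m abe

# Everything below runs on the execution host once the job starts.
# job_banner is a hypothetical helper: it records which job is running
# and where. Outside PBS, PBS_JOBID is unset and "interactive" is printed.
job_banner() {
    echo "job ${PBS_JOBID:-interactive} started on $(hostname)"
}

job_banner
```

Because the `#PBS` lines are ordinary shell comments, the same file can also be run by hand with bash to check the commands before submitting it.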
Combining the options you can create a suitable PBS job file, e.g.:
#PBS -q smp
#PBS -l ncpus=4
#PBS -V
#PBS -N myjob
cd $HOME/MyJobDir
./MyExe
and submit it with the qsub command
$ qsub MyScript.job
24204.hg1.hpc.sissa.it
where 24204 is the job ID, which is needed to monitor the job.
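Since qsub prints the full job identifier on standard output, it can be captured in a shell variable for later use. The following is a minimal sketch: the helper name `short_job_id` is hypothetical, and the submission is guarded so that nothing happens on a machine without qsub or without the job file.

```shell
#!/bin/bash
# Hypothetical helper: strip the server suffix from a full PBS job
# identifier, e.g. 24204.hg1.hpc.sissa.it -> 24204.
short_job_id() {
    printf '%s\n' "${1%%.*}"
}

# Submit only when qsub and the job file are available.
if command -v qsub >/dev/null 2>&1 && [ -f MyScript.job ]; then
    full_id=$(qsub MyScript.job)
    echo "submitted job $(short_job_id "$full_id")"
fi
```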
To see the detailed status of your job you can use the checkjob command, e.g.:
$ checkjob 24204
checking job 24204
State: Running
Creds: user:myuid group:other class:zebra qos:DEFAULT
WallTime: 4:58:13 of 12:00:00
SubmitTime: Thu Mar 26 12:02:05
(Time Queued Total: 00:00:54 Eligible: 00:00:01)
StartTime: Thu Mar 26 12:02:59
Total Tasks: 16
Req[0] TaskCount: 16 Partition: DEFAULT
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [zebra]
Allocated Nodes:
[p013:8][p004:8]
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 1
PartitionMask: [ALL]
Flags: RESTARTABLE
Reservation '24204' (-4:57:27 -> 7:02:33 Duration: 12:00:00)
PE: 16.00 StartPriority: 5088
To list all your jobs, use the qstat command with the -u option:
$ qstat -u myuid
hg1.hpc.sissa.it:
                                                                          Req'd Req'd    Elap
Job ID               Username Queue    Jobname          SessID   NDS TSK Memory  Time S  Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
24204.hg1.hpc.si     myuid    zebra    cp2k_h2o_512      28267     2  --     -- 12:00 R 05:02
24205.hg1.hpc.si     myuid    blade    cp2k_h2o_512      31944    16  --     -- 12:00 R 00:47
24206.hg1.hpc.si     myuid    zebra    cp2k_h2o_512      21420     4  --     -- 08:00 R 04:53
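When many jobs are listed, the qstat output can be summarised with standard text tools. The sketch below counts the jobs in a given state, assuming the column layout shown above (state in the tenth whitespace-separated field) and job names without spaces. The helper name `count_in_state` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "qstat -u" output on stdin and print how many
# jobs are in the state given as $1 (e.g. R for running, Q for queued).
# Job lines are recognised by a leading numeric job ID followed by a dot.
count_in_state() {
    awk -v state="$1" '$1 ~ /^[0-9]+\./ && $10 == state { n++ } END { print n + 0 }'
}

# Query the batch system only when qstat is available.
if command -v qstat >/dev/null 2>&1; then
    echo "running jobs: $(qstat -u "$(id -un)" | count_in_state R)"
fi
```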
Environment setting and Modules usage in PBS scripts
The PBS script must prepare the environment in which your code will run. First, note that PBS starts by default in the user's home directory, so if your code is located elsewhere you will need to switch to that location with the cd command. For example, include in your script a line like
cd /scratch/myuid/myprogdir
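A common alternative to a hard-coded path is to return to the directory from which the job was submitted. PBS stores that directory in the `PBS_O_WORKDIR` environment variable. The sketch below assumes this standard variable is available on the cluster and falls back to the current directory when the script is run outside a job; the variable name `workdir` is only illustrative.

```shell
#!/bin/bash
# PBS_O_WORKDIR is set by PBS to the directory where qsub was invoked.
# Outside a PBS job it is unset, so fall back to the current directory.
workdir="${PBS_O_WORKDIR:-$PWD}"

# Stop immediately if the directory is not reachable on the compute node.
cd "$workdir" || { echo "cannot cd to $workdir" >&2; exit 1; }
echo "running in $(pwd)"
```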
Next, be aware that the environment provided by PBS is exactly the one you would get if you logged in via ssh. To load any module that you need and that is not loaded automatically at login, add something like the following to your script:
source /etc/profile.d/modules.sh
module purge
module load openmpi/1.3/intel/10.1
module load mkl
Please note that only the first line is actually mandatory; the others depend on what you need to run your code.
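Putting the pieces of this section together, a job script might look like the sketch below. The queue, module names and paths are the ones used in the examples above. The `setup_modules` wrapper is a hypothetical helper, added so that the job stops with a clear message if the Modules initialisation file is missing.

```shell
#!/bin/bash
#PBS -q smp
#PBS -l ncpus=4
#PBS -N myjob

# Hypothetical helper: initialise the module command, then load what the
# code needs. Returns non-zero if the initialisation file is not readable.
setup_modules() {
    init=/etc/profile.d/modules.sh
    if [ ! -r "$init" ]; then
        echo "modules init file $init not found" >&2
        return 1
    fi
    source "$init"
    module purge
    module load openmpi/1.3/intel/10.1
    module load mkl
}

# Run the executable only if the environment was set up
# and the program directory exists.
if setup_modules; then
    cd /scratch/myuid/myprogdir && ./MyExe
fi
```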
