Submitting jobs

All the resources of the SISSA/Democritos infrastructure are managed through the Torque/Portable Batch System (Torque/PBS), a workload management system for Linux clusters. Besides managing access to the resources, it provides commands to submit, monitor and delete jobs. Its key components are:

  • the Job Server or PBS server, which provides the basic batch services such as receiving, creating and running a batch job, modifying it and protecting it against system crashes
  • the Job Executor or PBS mom, which is the daemon that actually takes care of the execution when it receives a copy of the job from the Job Server. The PBS mom creates a new session, as similar as possible to a user login session, and returns the job output to the user
  • the Job Scheduler, which is a daemon that implements the site's policy controlling which job is run, and where and when it is run. PBS allows each site to create its own scheduler. On the SISSA/Democritos infrastructure the Maui scheduler is used. The Maui scheduler communicates with the various moms to keep track of the system's resources, and with the server to find out which jobs are available to execute.

All the computing nodes available on the SISSA/Democritos infrastructure, namely the HG1 cluster, are divided into partitions, which the scheduler identifies as different queues. The table below lists all the queues together with the hardware specifications of their nodes.

Queue name  # of nodes  # of cores  Node names  CPU (per node)                        RAM (per node)  Network
zebra       22          176         p0xx        Intel Xeon E5420 2.5GHz (2x4 cores)   16GB            Infiniband 20G
cmb         zebra clone, reserved to Planck project people, w/ higher priority
blade       88          352         m0xx/cxxx   AMD Opteron 280 2.4GHz (2x2 cores)    8GB             Infiniband 10G
iblade      56          224         ixxx        AMD Opteron 275 2.2GHz (2x2 cores)    8GB             Infiniband 2x2.5G
smp         routing queue to submit jobs on the following two execution queues
smp4        12          48          a2xx        AMD Opteron 275 2.2GHz (2x2 cores)    8GB             n.a.
smp2*       23          46          a0xx        AMD Opteron 252 2.6GHz (2x1 cores)    4GB             n.a.
up*         23          46          a0xx        AMD Opteron 252 2.6GHz (2x1 cores)    4GB             n.a.

*Note: the smp2 and up queues share the same physical machines
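
As an illustration of how a queue from the table is selected, here is a minimal sketch of a job script for the zebra queue. It relies only on standard Torque directives (-N, -q, -l) and the standard qsub, qstat and qdel commands. The job name, the walltime value and the node count are hypothetical placeholders, not site policy; the request of 8 processors per node simply mirrors the 2x4 cores listed for zebra. Check the PBS How-To for the limits that actually apply.

```shell
#!/bin/bash
# Minimal Torque/PBS job script (sketch, assumed values marked below).
#PBS -N example_job          # job name: hypothetical placeholder
#PBS -q zebra                # queue name taken from the table above
#PBS -l nodes=2:ppn=8        # 2 nodes (assumed) x 8 cores per zebra node
#PBS -l walltime=01:00:00    # assumed value, not a documented site limit

# Keep these in sync with the "#PBS -l nodes=..." line by hand:
# PBS directives are plain comments and cannot expand shell variables.
NODES=2
PPN=8
TOTAL_CORES=$((NODES * PPN))

# PBS_O_WORKDIR is set by Torque to the directory qsub was run from;
# fall back to the current directory when the script is run by hand.
cd "${PBS_O_WORKDIR:-$PWD}"

echo "Job started on $(uname -n), requesting ${TOTAL_CORES} cores"

# Typical life cycle, typed at the login-node prompt:
#   qsub job.sh        submit the script; prints the job identifier
#   qstat -u "$USER"   monitor your own jobs
#   qdel <job_id>      delete a queued or running job
```

Because the #PBS lines are ordinary shell comments, the same file can be run directly with bash for a quick syntax check before it is submitted.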

In the following sections detailed instructions are provided to help you submit your jobs to the system.

Important! Read the PBS How-To before jumping to the other sections: the instructions given there apply to every type of submission.