Advanced Job Submission

1.  Advanced Sandbox Management


Input sandbox files need not reside on the UI: they can be stored on a GridFTP server. Similarly, output files can be transferred to a GridFTP server when the job finishes.

Here is an example:

1.1  Choose files

Decide which files are needed for the job execution, for example:

[user@ui-2 ~]$ file prime
prime: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5,
statically linked, not stripped 

prime is a statically compiled binary that calculates a sequence of prime numbers.

1.2  List SEs

List the available storage elements:

[user@ui-2 ~]$ lcg-infosites --vo gridbox se
Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
47628864        86140           n.a     se-1.grid.box
47628864        86140           n.a     se-1.grid.box

1.3  Create directory

Create a directory on the storage element and copy the file there with the globus-url-copy command:

[user@ui-2 ~]$ edg-gridftp-mkdir gsiftp://se-1.grid.box/tmp/user/
[user@ui-2 ~]$ globus-url-copy -vb  file:///home/user/prime gsiftp://se-1.grid.box/tmp/user/prime
Source: file:///home/user/
Dest:   gsiftp://se-1.grid.box/tmp/user/
  prime
       434885 bytes         1.80 MB/sec avg         1.80 MB/sec inst

The file is now in the /tmp/user directory of the GridFTP server.

1.4  Write script

Write the script that uses the binary file:

  job.sh
 ===========
 #!/bin/sh
 chmod 755 prime
 ./prime
 ===========

1.5  GridFTP files

Files stored on the GridFTP server are specified as GridFTP URIs in the InputSandbox attribute. In our case the file prime is in the /tmp/user directory of se-1.grid.box, so:

InputSandbox = {"gsiftp://se-1.grid.box/tmp/user/prime"};

It is also possible to specify a base GridFTP URI with the attribute InputSandboxBaseURI: in this case, files expressed as simple file names or as relative paths will be looked for under that base URI. Local files can still be defined using the file://<path> URI format. For example:

InputSandbox = {"prime", "file:///home/user/test"};
InputSandboxBaseURI = "gsiftp://se-1.grid.box/tmp/user";

is equivalent to

InputSandbox = {"gsiftp://se-1.grid.box/tmp/user/prime",
                "/home/user/test"};

To store the output sandbox on a GridFTP server, the OutputSandboxDestURI attribute must be used together with the usual OutputSandbox attribute. The latter lists the output files created by the job on the WN that are to be transferred, and the former states where each output file is to be transferred. For example:

OutputSandbox = {"userout.log","usererr.log"};
OutputSandboxDestURI = {"gsiftp://se-1.grid.box/tmp/user/userout.log","gsiftp://se-1.grid.box/tmp/user/usererr.log"};

In this case glite-wms-job-output will not retrieve any results when the job has finished, because they are on the GridFTP server.

Another possibility is to use the OutputSandboxBaseDestURI attribute to specify a base URI on a GridFTP server where the files listed in OutputSandbox will be copied. For example:

OutputSandbox = {"userout.log", "usererr.log"};
OutputSandboxBaseDestURI = "gsiftp://se-1.grid.box/tmp/user/";

will copy both files under the specified GridFTP URI.
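
The base-URI rules in this section can be made concrete with a small shell sketch. It only mimics the behaviour described above and is not WMS code; the helper name resolve_entry is hypothetical, and the handling of anything other than the three cases named in the text is an assumption.

```sh
#!/bin/sh
# Illustration only: mimics the base-URI rule described in this section.
# resolve_entry is a hypothetical helper, not part of gLite.
resolve_entry() {
    base=${1%/}     # drop one trailing slash from the base URI, if present
    entry=$2
    case "$entry" in
        file://*) printf '%s\n' "${entry#file://}" ;;  # local file on the UI
        *://*)    printf '%s\n' "$entry" ;;            # already a full URI
        *)        printf '%s\n' "$base/$entry" ;;      # simple name: under base
    esac
}

# InputSandbox entries with InputSandboxBaseURI = "gsiftp://se-1.grid.box/tmp/user"
resolve_entry "gsiftp://se-1.grid.box/tmp/user" "prime"
resolve_entry "gsiftp://se-1.grid.box/tmp/user" "file:///home/user/test"

# OutputSandbox entry with OutputSandboxBaseDestURI = "gsiftp://se-1.grid.box/tmp/user/"
resolve_entry "gsiftp://se-1.grid.box/tmp/user/" "userout.log"
```

The three calls print the GridFTP URI of prime, the local path /home/user/test, and the GridFTP destination of userout.log.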

1.6  Write JDL

Write down the .jdl file.

   GridFTPTest.jdl
=======================
[
Executable = "job.sh";
StdOutput = "userout.log";
StdError = "usererr.log";
InputSandbox = {"job.sh","gsiftp://se-1.grid.box/tmp/user/prime"};
OutputSandbox = {"userout.log", "usererr.log"};
OutputSandboxBaseDestURI = "gsiftp://se-1.grid.box/tmp/user/";
]
=======================

1.7  Submit the job

[user@ui-2 ~]$ glite-wms-job-submit -a -o jobid GridFTPTest.jdl

Connecting to the service https://wms-4.grid.box:7443/glite_wms_wmproxy_server

====================== .glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms-4.grid.box:9000/ozbQfIEvwy1b1Gmk0_FDCA

The job identifier has been saved in the following file:
/home/user/jobid

===========================================================================

1.8  Check job status

[user@ui-2 ~]$ glite-wms-job-status -i jobid


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.box:9000/ozbQfIEvwy1b1Gmk0_FDCA
Current Status:     Done (Success)
Logged Reason(s):
    -
    - Job terminated successfully
Exit code:          0
Status Reason:      Job terminated successfully
Destination:        ce-1.grid.box:2119/jobmanager-lcgpbs-gridbox
Submitted:          Tue Sep 30 09:51:21 2008 CEST
*************************************************************

1.9  Get Output

If you try to retrieve the output using glite-wms-job-output:

[user@ui-2 ~]$ glite-wms-job-output -i jobid -o . --dir  subdir

Connecting to the service https://10.10.0.9:7443/glite_wms_wmproxy_server

================================================================================  
                             JOB GET OUTPUT OUTCOME

No output files to be retrieved for the job:
https://wms-4.grid.box:9000/ozbQfIEvwy1b1Gmk0_FDCA

================================================================================

Normally this command creates a directory called subdir in the current directory and downloads the output into it. Here there is nothing to download, because the output files were sent to the GridFTP server.

1.10  Get Output with GridFTP

Use the globus-url-copy command to get the desired output.

[user@ui-2 ~]$ globus-url-copy -vb gsiftp://se-1.grid.box/tmp/user/userout.log file:/home/user/userout.log
Source: gsiftp://se-1.grid.box/tmp/user/
Dest:   file:/home/user/
  userout.log
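
When several output files are involved, the same copy can be repeated in a loop. The sketch below is only an illustration: fetch_outputs and the COPY_CMD variable are hypothetical names, and COPY_CMD is set to echo so that the block performs a dry run that prints the transfers instead of contacting the server.

```sh
#!/bin/sh
# fetch_outputs is a hypothetical helper: it copies each named file from a
# GridFTP base URI to a local directory using the command in COPY_CMD.
fetch_outputs() {
    base=${1%/}     # GridFTP base URI without trailing slash
    dest=${2%/}     # absolute local directory without trailing slash
    shift 2
    for f in "$@"; do
        $COPY_CMD "$base/$f" "file://$dest/$f" || return 1
    done
}

# Dry run: echo prints source and destination of each transfer.
# Set COPY_CMD=globus-url-copy to perform the real copies.
COPY_CMD=echo
fetch_outputs gsiftp://se-1.grid.box/tmp/user/ /home/user userout.log usererr.log
```

The function stops at the first failed transfer, so a missing file on the server is noticed immediately.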

1.11  Display results

Here are the results of the job.

[user@ui-2 ~]$ cat userout.log
       1       2       3       5       7      11      13      17
      19      23      29      31      37      41      43      47
      53      59      61      67      71      73      79      83
      89      97     101     103     107     109     113     127
     131     137     139     149     151     157     163     167
     ...     ...     ...     ...     ...     ...     ...     ...

2.  Real Time Output Retrieval


The job output can be inspected in real time while the job is running. To enable job perusal, set the attribute PerusalFileEnable to true in the job JDL. The WN then uploads, at regular time intervals (defined by the PerusalTimeInterval attribute and expressed in seconds), a copy of the output files selected with the glite-wms-job-perusal command.

For example

1. Create a simple bash script that writes out the hostname

#!/bin/sh
#
/bin/hostname

2. The JDL file should look like this:

     PerusalTest.jdl
===========================================
[
Executable = "job.sh";
StdOutput = "stdout.log";
StdError = "stderr.log";
InputSandbox = {"job.sh"};
OutputSandbox = {"stdout.log","stderr.log"};
PerusalFileEnable = true;
PerusalTimeInterval = 15;
RetryCount = 0;
]
===========================================

3. Submit the job with glite-wms-job-submit. To enable perusal for the job, use the glite-wms-job-perusal command with --set, selecting each output file to be inspected with -f <filename>.

[user@ui-2 ~]$ glite-wms-job-perusal --set -f stdout.log  -f stderr.log 
https://wms-4.grid.box:9000/IGYXVHG6LvmmyV3oBp3B9g

Connecting to the service https://10.10.0.9:7443/glite_wms_wmproxy_server

===================== .glite-wms-job-perusal Success =====================

Files perusal has been successfully enabled for the job:
https://wms-4.grid.box:9000/IGYXVHG6LvmmyV3oBp3B9g

==========================================================================

4. When the job starts, the user may inspect the selected files:

[user@ui-2 ~]$ glite-wms-job-perusal --get -f stdout.log -o . --dir subdir
https://wms-4.grid.box:9000/IGYXVHG6LvmmyV3oBp3B9g

Connecting to the service https://10.10.0.9:7443/glite_wms_wmproxy_server

===================== .glite-wms-job-perusal Success =====================

The retrieved files have been successfully stored in:
/tmp/user_IGYXVHG6LvmmyV3oBp3B9g

==========================================================================

--------------------------------------------------------------------------
file 1/1: stdout.log-20080930141119_1-20080930141119_1
--------------------------------------------------------------------------

This command will create a directory called subdir in the current directory and will download the output inside.

5. See the results

[user@ui-2 user_IGYXVHG6LvmmyV3oBp3B9g]$ cat stdout.log-20080930141119_1-20080930141119_1

ce-2wn1.grid.box
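
Because the hostname script ends almost immediately, perusal has little to show for it. A job that writes its output progressively makes the effect easier to see. The following is a minimal sketch of such a job script, not part of the original example: the function name progress_job and the two arguments (number of steps, seconds between steps) are assumptions, and they would be passed through the Arguments attribute of the JDL.

```sh
#!/bin/sh
# Hypothetical long-running job: prints one progress line per step so that
# successive perusal snapshots of stdout.log show the job advancing.
progress_job() {
    steps=$1    # number of progress lines to print
    delay=$2    # seconds to wait after each line
    i=1
    while [ "$i" -le "$steps" ]; do
        echo "step $i of $steps on $(hostname)"
        sleep "$delay"
        i=$((i + 1))
    done
}

# Run only when both arguments are given, e.g. Arguments = "20 30"; in the JDL.
if [ "$#" -ge 2 ]; then
    progress_job "$1" "$2"
fi
```

With 20 steps and a 30 second delay the job runs for about ten minutes, long enough for several uploads at the 15 second PerusalTimeInterval used above.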

3.  Advanced Job Types


3.1  Job Collection

One of the most useful functionalities of WMProxy is the ability to submit job collections, defined as a set of independent jobs.

Here is an example of what it means and how to do it.

The simplest way to submit a collection is to put the JDL files of all the jobs in the collection in a single directory, and use --collection <dirname>, where <dirname> is the name of the directory.

So:

[user@ui-2 ~]$ mkdir jdl

You must put your JDL files in the jdl/ directory. Suppose that you have the following two jobs:

      job1.jdl
====================
[
Executable="/bin/hostname";
StdOutput="std.out";
StdError="std.err";
OutputSandbox={"std.out","std.err"};
]
====================

      job2.jdl
====================
[
Executable = "/bin/echo";
StdOutput = "std.out";
StdError = "std.err";
Arguments = "Hello Trieste!";
OutputSandbox = {"std.out","std.err"};
]
===================

Submit both jobs at the same time with:

[user@ui-2 ~]$ glite-wms-job-submit -a -o jobid --collection jdl/

Connecting to the service
https://wms-4.grid.box:7443/glite_wms_wmproxy_server

====================== .glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms-4.grid.box:9000/hBkspJ9W34qZz5vQaSFGTQ

The job identifier has been saved in the following file:
/home/user/jobid

==========================================================================

The jobID returned refers to the collection itself. To know the status of the collection and of all the jobs belonging to it, it is enough to use glite-wms-job-status as for any other kind of job:

[user@ui-2 ~]$ glite-wms-job-status https://wms-4.grid.box:9000/hBkspJ9W34qZz5vQaSFGTQ

=============================================================
                BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.box:9000/hBkspJ9W34qZz5vQaSFGTQ
Current Status:     Waiting
Submitted:          Wed Oct  1 10:39:03 2008 CEST
=============================================================

- Nodes information for:
    Status info for the Job : https://wms-4.grid.box:9000/07bXsPOCN7BryYRW94SK5g
    Current Status:     Scheduled
    Status Reason:      Job successfully submitted to Globus
    Destination:        ce-1.grid.box:2119/jobmanager-lcgpbs-gridbox
    Submitted:          Wed Oct  1 10:39:03 2008 CEST
=============================================================

    Status info for the Job : https://wms-4.grid.box:9000/6ryYhbM82bWPKIbOlAkTTw
    Current Status:     Scheduled
    Status Reason:      Job successfully submitted to Globus
    Destination:        ce-1.grid.box:2119/jobmanager-lcgpbs-gridbox
    Submitted:          Wed Oct  1 10:39:03 2008 CEST
=============================================================

Note: executing glite-wms-job-status for the collection is the only way to know the jobIDs of the jobs in the collection.
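
Since the sub-job identifiers appear only in the status output, it can be convenient to extract them with a filter. The sketch below assumes the output format shown above, where every job block starts with "Status info for the Job :" and the first block is the collection itself; node_ids is a hypothetical helper name.

```sh
#!/bin/sh
# node_ids is a hypothetical helper: it reads glite-wms-job-status output on
# standard input and prints the identifiers of the sub-jobs, one per line.
node_ids() {
    # Keep only the jobID from each "Status info for the Job :" line, then
    # skip the first match, which is the collection itself.
    sed -n 's/^[[:space:]]*Status info for the Job : //p' | tail -n +2
}

# Small sample in the format shown above.
node_ids <<'EOF'
Status info for the Job : https://wms-4.grid.box:9000/hBkspJ9W34qZz5vQaSFGTQ
Current Status:     Waiting
    Status info for the Job : https://wms-4.grid.box:9000/07bXsPOCN7BryYRW94SK5g
    Current Status:     Scheduled
    Status info for the Job : https://wms-4.grid.box:9000/6ryYhbM82bWPKIbOlAkTTw
    Current Status:     Scheduled
EOF
```

On the UI the filter would be fed from the real command, for example glite-wms-job-status -i jobid | node_ids.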

3.2  Advanced Collection

A more flexible way to define a job collection is shown in the following JDL file. Its structure includes a global set of attributes, which are inherited by all the sub-jobs, and a set of attributes for each sub-job, which supersede the global ones.

[
Type = "Collection";
VirtualOrganisation = "gridbox";
MyProxyServer = "mpyroxy.grid.box";
InputSandbox = {"numbers"};
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out", "std.err"};
DefaultNodeShallowRetryCount = 5;

Nodes = {
        [
        Executable = "node1.sh";
        InputSandbox = {root.InputSandbox, "node1.sh"};
        StdOutput = "myoutput1.txt";
        StdError = "std.err";
        OutputSandbox = {"myoutput1.txt","std.err"};
        Requirements = other.GlueCEPolicyMaxWallClockTime > 10;
        ],
        [
        NodeName = "mysubjob";
        Executable = "node2.sh";
        InputSandbox = {root.InputSandbox,"node2.sh"};
        StdOutput = "myoutput2.txt";
        StdError = "std.err";
        OutputSandbox = {"myoutput2.txt", "std.err"};
        ]
        }
]

The file numbers is an executable that prints out the first n numbers of the Fibonacci sequence. The scripts node1.sh and node2.sh each call it with a different parameter. For example, the file node1.sh looks like this:

     
      node1.sh
====================
#!/bin/sh
# node1.sh
#
chmod 755 numbers
./numbers 24
====================

And so:

a. Type = "Collection"; describes a collection.

b. the jobs belong to the gridbox VO.

c. the MyProxy server to use for proxy renewal is mpyroxy.grid.box.

d. all the jobs in the collection have by default the binary "numbers" in their sandbox (shared input sandbox).

e. the default maximum number of shallow resubmissions is 5.

f. the input sandbox of the first job (or node) has all the default files (root.InputSandbox) plus an additional file, node1.sh; likewise, the second job adds node2.sh.

g. the first job must run on a CE allowing more than ten minutes of wall clock time.

h. the two jobs are named Node_0 (no NodeName is set for the first node) and mysubjob.

Submit the job and retrieve the output when the jobs have finished.

[user@ui-2 ~]$ glite-wms-job-output https://wms-4.grid.box:9000/twBf0M2QmQPej0OCGGXdFw

================================================================================
                        JOB GET OUTPUT OUTCOME

Output sandbox files for the DAG/Collection :
https://wms-4.grid.box:9000/twBf0M2QmQPej0OCGGXdFw
have been successfully retrieved and stored in the directory:
/tmp/user_twBf0M2QmQPej0OCGGXdFw

================================================================================


[user@ui-2 user_twBf0M2QmQPej0OCGGXdFw]$ ls -l
total 12
-rw-rw-r--  1 user user  354 Oct  1 14:06 ids_nodes.map
drwxr-xr-x  2 user user 4096 Oct  1 14:06 mysubjob
drwxr-xr-x  2 user user 4096 Oct  1 14:06 Node_0

4.  Parametric Jobs


A parametric job is a job collection where the jobs are identical except for the value of a running parameter. It is described by a single JDL, where attribute values may contain the current value of the running parameter. An example of a JDL for a parametric job follows:

     [
        Type = "job";
        JobType = "parametric";
        Executable = "job.sh";
        StdInput = "input_PARAM_.txt";
        StdOutput = "output_PARAM_.txt";
        Parameters = 10;
        ParameterStart = 1;
        ParameterStep = 1;
        InputSandbox = {"job.sh", "input_PARAM_.txt"};
        OutputSandbox = "output_PARAM_.txt";
     ]

The submission of this job will produce 10 jobs as follows:

     [
        Type = "job";
        JobType = "normal";
        Executable = "job.sh";
        StdInput = "inputi.txt";
        StdOutput = "outputi.txt";
        InputSandbox = {"job.sh", "inputi.txt"};
        OutputSandbox = "outputi.txt";
     ]

i = 1, 2, ..., 10 

The JobType attribute is set to parametric. The special key _PARAM_ marks where the running parameter appears; this string is replaced with the proper value during job submission. The attribute Parameters can be either a number or a list of items (typically strings, but not enclosed within double quotes): in the first case, the value represents the maximum value of the running parameter _PARAM_; in the second case, it is the list of values the parameter must take. ParameterStart is the initial value of the running parameter and ParameterStep is the increment of the running parameter between consecutive jobs. Here is another example, where the Parameters attribute is a list of values:

        Parametric.jdl
================================
[
JobType = "Parametric";
Executable = "/bin/cat";
Arguments = "input_PARAM_.txt";
InputSandbox = "input_PARAM_.txt";
StdOutput = "myoutput_PARAM_.txt";
StdError = "myerror_PARAM_.txt";
Parameters = {EARTH,MARS,MOON};
OutputSandbox = {"myoutput_PARAM_.txt"};
]
===============================

Submitting the previous JDL produces 3 jobs with the following JDL:

     [
        Type = "job";
        JobType = "normal";
        Executable = "/bin/cat";
        Arguments = "inputvalue.txt";
        StdOutput = "myoutputvalue.txt";
        StdError = "myerrorvalue.txt";
        InputSandbox = "inputvalue.txt";
        OutputSandbox = {"myoutputvalue.txt"};
     ]

value =  "EARTH", "MARS", "MOON"
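
The substitution of _PARAM_ can be pictured with a few lines of shell. This is only an illustration of the expansion as described in this section: the real substitution is done by the WMS at submission time, and expand_param is a hypothetical helper name.

```sh
#!/bin/sh
# expand_param is a hypothetical helper, not a gLite tool: it prints the
# template once per value, replacing every _PARAM_ with that value.
# Values containing "/" or "&" would need escaping for sed.
expand_param() {
    template=$1
    shift
    for value in "$@"; do
        printf '%s\n' "$template" | sed "s/_PARAM_/$value/g"
    done
}

# List case: Parameters = {EARTH,MARS,MOON};
expand_param 'StdOutput = "myoutput_PARAM_.txt";' EARTH MARS MOON

# Numeric case as described above: values 1, 2, ..., 10 (start 1, step 1)
expand_param 'StdInput = "input_PARAM_.txt";' $(seq 1 1 10)
```

The first call prints three StdOutput lines, one per planet name; the second prints ten StdInput lines, from input1.txt to input10.txt.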

So you must have the following files before submitting your job:

[user@ui-2 param]$ ls
inputEARTH.txt  inputMARS.txt  inputMOON.txt  Parametric.jdl

[user@ui-2 param]$ cat inputEARTH.txt
Testing of a parametric job.
Hello from Earth!
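
A quick check before submission can confirm that every input file is in place. The sketch below is an illustration under the naming used in this example (input<value>.txt); missing_inputs is a hypothetical helper name.

```sh
#!/bin/sh
# missing_inputs is a hypothetical helper: given a directory and a list of
# parameter values, it prints the name of every input file that is absent.
missing_inputs() {
    dir=$1
    shift
    for value in "$@"; do
        [ -f "$dir/input$value.txt" ] || printf '%s\n' "input$value.txt"
    done
}

# Check the current directory for the three values used in Parametric.jdl.
missing_inputs . EARTH MARS MOON
```

An empty result means that all input files exist and the job can be submitted.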

[user@ui-2 param]$ glite-wms-job-submit -a -o jobid Parametric.jdl

Connecting to the service https://wms-4.grid.box:7443/glite_wms_wmproxy_server


====================== .glite-wms-job-submit Success ======================

The job has been successfully submitted to the WMProxy
Your job identifier is:

https://wms-4.grid.box:9000/B5Ro6Bgl7AKm_VKmMYW9ug

The job identifier has been saved in the following file:
/home/user/param/jobid

==========================================================================


[user@ui-2 param]$ glite-wms-job-status -i jobid

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://wms-4.grid.box:9000/B5Ro6Bgl7AKm_VKmMYW9ug
Current Status:     Done (Success)
Exit code:          0
Submitted:          Tue Oct  7 14:50:08 2008 CEST
*************************************************************

- Nodes information for:
    Status info for the Job : https://wms-4.grid.box:9000/3vsUAk4ND3HEImjRsTizFg
    Current Status:     Done (Success)
    Logged Reason(s):
        -
        - Job terminated successfully
    Exit code:          0
    Status Reason:      Job terminated successfully
    Destination:        ce-1.grid.box:2119/jobmanager-lcgpbs-gridbox
    Submitted:          Tue Oct  7 14:50:08 2008 CEST
*************************************************************

    Status info for the Job : https://wms-4.grid.box:9000/CxsZV2RZtb87eShH7-eaBA
    Current Status:     Done (Success)
    Logged Reason(s):
        -
        - Job terminated successfully
    Exit code:          0
    Status Reason:      Job terminated successfully
    Destination:        ce-1.grid.box:2119/jobmanager-lcgpbs-gridbox
    Submitted:          Tue Oct  7 14:50:08 2008 CEST
*************************************************************

    Status info for the Job : https://wms-4.grid.box:9000/ZI_vXveak1Zo-URjY_zjtQ
    Current Status:     Done (Success)
    Logged Reason(s):
        -
        - Job terminated successfully
    Exit code:          0
    Status Reason:      Job terminated successfully
    Destination:        ce-1.grid.box:2119/jobmanager-lcgpbs-gridbox
    Submitted:          Tue Oct  7 14:50:08 2008 CEST
*************************************************************

Retrieve the output using glite-wms-job-output.

So we have:

[user@ui-2 user_B5Ro6Bgl7AKm_VKmMYW9ug]$ ls
ids_nodes.map  Node_EARTH  Node_MARS  Node_MOON

[user@ui-2 Node_EARTH]$ ls
myoutputEARTH.txt

[user@ui-2 Node_EARTH]$ cat myoutputEARTH.txt
Testing of a parametric job.
Hello from Earth!

5.  Job submission with MYPROXY


  • The JDL
[
Executable    = "testmyproxy.sh";
StdOutput     = "testmyproxy-out.log";
StdError      = "testmyproxy-err.log";
MyProxyServer = "<MYPROXY_SERVER>";
InputSandbox  = {"testmyproxy.sh"};
OutputSandbox = {"testmyproxy-out.log","testmyproxy-err.log"};
]

  • The script
#!/bin/bash
#
# script
#
echo "Starting at: "$(date +'%Y-%m-%d %H:%M:%S %z %Z')

HOSTNAME=$(hostname -f)
USER=$(whoami)
ARG1=$1
LOCALDIR=$(pwd)

echo "****************************************"
echo "HOST: "$HOSTNAME
echo "USER: "$USER
echo "ARGS: "$ARG1
echo "LOCALDIR is: "$LOCALDIR
echo "HOMEDIR is:"$HOME
echo "Content of home:"
ls -l $HOME
echo "Content of current dir:"
ls -l .
echo "****************************************"

# view proxy info
voms-proxy-info --all

#
# Wait 30 minutes (30 iterations of 60 seconds); raise the count to wait longer
#
for i in $(seq 1 30)
do
  printf "Waiting 60 sec ... "
  sleep 60
  echo "done"
done

# view proxy info
voms-proxy-info --all

echo "Ending at: "$(date +'%Y-%m-%d %H:%M:%S %z %Z')
  • Proxy delegation to the MyProxy server
myproxy-init --voms euindia -d -n -s <MYPROXY_SERVER>
  • Create a proxy for submission
voms-proxy-init --voms euindia
  • Submit the job:
glite-wms-job-submit -a <jdl>