r8 - 22 Jul 2010 - 13:27:24 - CarlosEscobar

Job management

Job management with gLite

Basic commands

First of all, assume that the JDL file is called script.jdl, the proxy delegation identifier is dID, and the file that stores the job identifiers is jobId. You can then submit jobs, check their status, retrieve their output, and so on with the following gLite commands:

Credentials delegation

This command delegates your credentials (proxy) to the Workload Management System (WMS) and registers the delegation under an identification name (automatically generated or chosen by you). Subsequent invocations of glite-wms-job-submit and glite-wms-job-list-match can then be given that delegation name, which avoids delegating a new proxy each time.

Syntax: glite-wms-job-delegate-proxy -d dID

Options:
-a Creates an automatic identification name
-d dID Creates the identification name dID

Note that if you use the -a option, you have to pass -a to every invocation of glite-wms-job-submit and glite-wms-job-list-match. Heavy use of this option is not recommended: it delegates a new proxy for each command issued, and delegation is a time-consuming operation. It is better to delegate once with glite-wms-job-delegate-proxy and reuse that delegation.
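
The "delegate once, reuse many times" advice can be sketched as follows. This is only an illustration in Python: the function name delegation_commands is hypothetical, the commands are built as argument lists and are not executed, and only flags described on this page (-d and -o) are used.

```python
import shlex


def delegation_commands(delegation_id, jdl_files, jobid_file):
    """Return one delegation command followed by one submission per JDL file.

    Every submission reuses the same delegation identifier, so the proxy
    is delegated only once.
    """
    commands = [["glite-wms-job-delegate-proxy", "-d", delegation_id]]
    for jdl in jdl_files:
        commands.append(
            ["glite-wms-job-submit", "-d", delegation_id, "-o", jobid_file, jdl]
        )
    return commands


if __name__ == "__main__":
    # Print the command lines instead of running them.
    for argv in delegation_commands("dID", ["script.jdl"], "jobId"):
        print(shlex.join(argv))
```

To actually run the commands on a User Interface machine, each argument list could be passed to subprocess.run.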

Job matching

This command lists the Computing Elements (CEs) that match the requirements in the JDL file, so you can check before submitting that there are resources where the job can run.

Syntax: glite-wms-job-list-match -d dID script.jdl

Options:
-a Automatic delegation
-d dID Uses a previous explicit delegation
Note that one of the two options above (-a or -d) must be used.
-o file Stores the list of CEs in the file file

Job submission

This command submits a job to the Grid. On submission, the job is assigned a unique job identifier, which you need later to check the job status or to retrieve the output. The identifier has the format https://Lbserver_address[:port]/unique_string (be aware that, despite its form, it is not the address of a web site).

Syntax: glite-wms-job-submit -d dID script.jdl

Options:
-a Automatic delegation
-d dID Uses a previous explicit delegation
Note that one of the two options above (-a or -d) must be used.
-o jobId Appends the job identifier to the file jobId (the file is created if it does not exist)
-r CE Submits the job directly to a particular CE. With this option, the availability of the CE is not checked. The BrokerInfo is not created either.
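
The job identifier format described above can be taken apart with the Python standard library. The sketch below is only illustrative: parse_job_id is a hypothetical helper, not part of gLite, and it assumes the identifier always follows the https://Lbserver_address[:port]/unique_string pattern.

```python
from urllib.parse import urlparse


def parse_job_id(job_id):
    """Split a job identifier into (LB server host, port or None, unique string)."""
    parts = urlparse(job_id.strip())
    unique = parts.path.lstrip("/")
    if parts.scheme != "https" or not parts.hostname or not unique:
        raise ValueError(f"not a job identifier: {job_id!r}")
    return parts.hostname, parts.port, unique


if __name__ == "__main__":
    # Identifier taken from the logging example on this page.
    print(parse_job_id("https://lxshare0310.cern.ch:9000/C_CBUJKqc6Zqd4clQaCUTQ"))
```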

Job status

This command gives information about the status of the job (or jobs).

Syntax: glite-wms-job-status jobId1 ... jobIdN

Options:
-i jobId Reads the file (or files) where the job identifiers are stored.
-o file Stores the output in the file file.
-v n Sets the output level (0, 1 or 2)

The possible job statuses are:

Status     Meaning
SUBMITTED  Job submitted by the user but not yet processed by the Network Server (NS).
WAITING    Job accepted by the NS but not yet processed by the WMS. Job matching is performed in this state.
READY      Job assigned to a particular CE and being sent to it.
SCHEDULED  Job queued in the CE queue manager.
RUNNING    Job running on a Worker Node (WN) of the selected CE queue.
DONE       Job has finished without Grid errors.
ABORTED    Job has been aborted by the WMS (for example, because it ran too long for the queue it was in or because the proxy expired).
CANCELLED  Job has been cancelled by the user.
CLEARED    The "OutputSandbox" has been transferred to a User Interface (UI).
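
As a rough guide to what to do in each state, the statuses listed above can be mapped to the command you would typically reach for next. This is a sketch under stated assumptions: the mapping itself (for example, looking at the logging info after ABORTED) is a suggested workflow, not something gLite prescribes, and suggested_command is a hypothetical helper.

```python
from enum import Enum


class JobStatus(Enum):
    SUBMITTED = "SUBMITTED"
    WAITING = "WAITING"
    READY = "READY"
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    DONE = "DONE"
    ABORTED = "ABORTED"
    CANCELLED = "CANCELLED"
    CLEARED = "CLEARED"


def suggested_command(status_name):
    """Return the gLite command to use next, or None if nothing is left to do."""
    status = JobStatus(status_name.strip().upper())
    if status is JobStatus.DONE:
        return "glite-wms-job-output"        # output is ready to be retrieved
    if status is JobStatus.ABORTED:
        return "glite-wms-job-logging-info"  # assumption: inspect what went wrong
    if status in (JobStatus.CANCELLED, JobStatus.CLEARED):
        return None                          # job is over, nothing to fetch
    return "glite-wms-job-status"            # still in progress: check again later


if __name__ == "__main__":
    for name in ("Running", "Done", "Aborted", "Cleared"):
        print(name, "->", suggested_command(name))
```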

Job logging info

This command provides logging information about one or more submitted jobs.

Syntax: glite-wms-job-logging-info jobId1 ... jobIdN

Options:
-i jobId Reads the file (or files) where the job identifiers are stored.
-o file Stores the output in the file file.
-v n Sets the output level (0, 1 or 2).

Example output:

********************************************************************** 
LOGGING INFORMATION: 
 
Printing info for the Job: https://lxshare0310.cern.ch:9000/C_CBUJKqc6Zqd4clQaCUTQ 
 
        - - - 
 Event: RegJob 
- source               =    UserInterface 
- timestamp            =    Fri Feb 20 10:30:16 2004 
        - - - 
 Event: Transfer 
- destination          =    NetworkServer 
- result               =    START 
- source               =    UserInterface 
- timestamp            =    Fri Feb 20 10:30:16 2004 
        - - - 
 Event: Transfer 
- destination          =    NetworkServer 
- result               =    OK 
- source               =    UserInterface 
- timestamp            =    Fri Feb 20 10:30:19 2004 
        - - - 
 Event: Accepted 
- source               =    NetworkServer 
- timestamp            =    Fri Feb 20 10:29:17 2004 
        - - - 
 Event: EnQueued 
- result               =    OK 
- source               =    NetworkServer 
- timestamp            =    Fri Feb 20 10:29:18 2004 
[...]
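
Output in the layout shown above is easy to post-process. The following sketch turns it into a list of events; it assumes the output always uses "Event: name" lines followed by "- key = value" lines, as in the sample, and parse_logging_info is a hypothetical helper rather than a gLite tool.

```python
def parse_logging_info(text):
    """Return a list of events, each a dict with an 'Event' key plus its fields."""
    events = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Event:"):
            current = {"Event": line.split(":", 1)[1].strip()}
            events.append(current)
        elif line.startswith("- ") and "=" in line and current is not None:
            # Field line such as "- source = UserInterface".
            # Separator lines ("- - -") contain no "=" and are skipped.
            key, value = line[2:].split("=", 1)
            current[key.strip()] = value.strip()
    return events


SAMPLE = """\
        - - -
 Event: RegJob
- source               =    UserInterface
- timestamp            =    Fri Feb 20 10:30:16 2004
        - - -
 Event: Transfer
- destination          =    NetworkServer
- result               =    START
- source               =    UserInterface
- timestamp            =    Fri Feb 20 10:30:16 2004
"""

if __name__ == "__main__":
    for event in parse_logging_info(SAMPLE):
        print(event)
```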

Job monitoring

Job monitoring lets you inspect the job "output" while the job itself is still running. To enable this feature, add the attributes PerusalFileEnable and PerusalTimeInterval to the JDL file before submitting the job. For example:

  PerusalFileEnable = true;
  PerusalTimeInterval = 120;

With these attributes, the WN regularly uploads a copy of the job "output" to the WMS (the time interval is set with PerusalTimeInterval, in seconds) once monitoring has been requested with the command glite-wms-job-perusal. To request the monitoring of one or more files, use:

Syntax: glite-wms-job-perusal --set -f file_1 ...  -f file_n -i jobId1 ... jobIdn

Options:
-f Filename to be perused.

and to retrieve the requested file:

Syntax: glite-wms-job-perusal --get -f file ...  -i jobId

Options:
--all Retrieves the file in full. By default, only the portion written since the last --get is returned.
--dir directory Stores the "output" in the directory directory. By default, it is stored in /tmp.

Note: This feature has an impact on performance. It is meant for debugging your jobs and is not recommended in production, because a large number of transferred files can flood the WMS.

Once a job has been submitted, the files you want to monitor have to be requested with the option --set.

Job monitoring can be disabled with the option --unset.

Syntax: glite-wms-job-perusal --unset jobId

It is also possible to use the command glite-wms-job-perusal to check the final state of the files once the job has finished. To use this "post-mortem" feature, enable the monitoring with glite-wms-job-perusal --set as usual, but leave the file retrieval (--get) until the job has finished.
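
The perusal workflow described in this section (enable it in the JDL, request files with --set, fetch them with --get) can be sketched as follows. The helper names are hypothetical, the commands are only built and not executed, and the argument order simply follows the Syntax lines on this page.

```python
def perusal_jdl_attributes(interval_seconds):
    """Return the two JDL attribute lines that enable job monitoring."""
    if interval_seconds <= 0:
        raise ValueError("the interval must be a positive number of seconds")
    return (
        "PerusalFileEnable = true;\n"
        f"PerusalTimeInterval = {interval_seconds};\n"
    )


def perusal_set_command(files, jobid_file):
    """Build the --set command that requests monitoring of the given files."""
    argv = ["glite-wms-job-perusal", "--set"]
    for name in files:
        argv += ["-f", name]
    return argv + ["-i", jobid_file]


def perusal_get_command(file, jobid_file, directory=None):
    """Build the --get command that retrieves one monitored file."""
    argv = ["glite-wms-job-perusal", "--get", "-f", file]
    if directory is not None:
        argv += ["--dir", directory]
    return argv + ["-i", jobid_file]


if __name__ == "__main__":
    print(perusal_jdl_attributes(120), end="")
    print(" ".join(perusal_set_command(["out.txt"], "jobId")))
    print(" ".join(perusal_get_command("out.txt", "jobId")))
```

For the "post-mortem" use, the --set command would be issued right after submission and the --get command only once the job has finished.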

Job cancellation

This command cancels the job (or jobs) with the given job identifiers.

Syntax: glite-wms-job-cancel jobId1 ... jobIdN

Options:
-i jobId Reads the file (or files) where the job identifiers are stored.

Note: If the job has not yet been transferred to the CE (i.e. its status is WAITING or READY), the cancel request can be ignored, so the job may still run even though you see a "successful cancellation" message. In these cases, simply repeat your cancel request when the job status is SCHEDULED or RUNNING.

Job output retrieval

This command retrieves the job "output" to the UI. It only works for finished jobs, i.e. jobs with status DONE. If the "output" is not retrieved, it is removed from the WMS about one week after the job ends.

Syntax: glite-wms-job-output jobId1 ... jobIdN

Options:
-i jobId Reads the file (or files) where the job identifiers are stored.
--dir directory Stores the "output" in the directory directory. By default, it is stored in /tmp.

Advanced job management

Job Collections

A job collection is a set of mutually independent jobs that, for some reason known to the user, need to be submitted, monitored and controlled as a single request. A good reason could be that the sub-jobs share input files: WMProxy allows sandboxes to be shared and inherited, and it optimizes network traffic by transferring a single copy of each file even if several sub-jobs use it.

DAG jobs

A DAG (directed acyclic graph) job represents a set of jobs where the input, the output or the execution of one or more jobs depends on one or more other jobs. The jobs are the nodes (vertices) of the graph and the edges (arcs) represent the dependencies.
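
To illustrate what the dependencies imply, the sketch below computes an execution order that respects them. The node names and the dependency map are made up for the example, this is not how the WMS schedules a DAG internally, and it needs Python 3.9 or later for the graphlib module.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical DAG: each job maps to the set of jobs it depends on.
DEPENDENCIES = {
    "nodeA": set(),
    "nodeB": {"nodeA"},
    "nodeC": {"nodeA"},
    "nodeD": {"nodeB", "nodeC"},
}


def execution_order(dependencies):
    """Return the job names in an order where every job follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def is_acyclic(dependencies):
    """Return True if the dependencies form a valid DAG (no cycles)."""
    try:
        execution_order(dependencies)
    except CycleError:
        return False
    return True


if __name__ == "__main__":
    print(execution_order(DEPENDENCIES))
    print(is_acyclic({"x": {"y"}, "y": {"x"}}))
```

Here nodeA runs first, nodeB and nodeC can run in either order (or in parallel) once nodeA is done, and nodeD runs last.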

Parametric jobs

A parametric job causes a set of jobs to be generated from one JDL file. This is invaluable in cases where many similar (but not identical) jobs must be run. The parametric job has one or more parametric attributes in its JDL, identified by the keyword _PARAM_ in their value; during the JDL expansion, that keyword is replaced by the actual value taken from Parameters. The JobType in the JDL is Parametric.
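
The expansion can be pictured as a plain text substitution. The sketch below assumes the placeholder is written _PARAM_ (as in the gLite documentation) and takes the parameter values as an explicit list; expand_parametric and the template are hypothetical, and the real expansion is done by the WMS, not by user code.

```python
# Hypothetical JDL fragment with two parametric attributes.
TEMPLATE = (
    'Executable = "/bin/echo";\n'
    'Arguments = "_PARAM_";\n'
    'StdOutput = "out_PARAM_.txt";\n'
)


def expand_parametric(template, values):
    """Return one JDL text per value, with _PARAM_ replaced by that value."""
    return [template.replace("_PARAM_", str(value)) for value in values]


if __name__ == "__main__":
    for jdl in expand_parametric(TEMPLATE, [0, 2, 4]):
        print(jdl)
```

Each generated job therefore gets its own arguments and its own output file name.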

-- CarlosEscobar - 18 Jun 2010
