Job management
Job management with gLite
Basic commands
First of all, let's assume that our JDL file is called script.jdl, our proxy delegation identifier is dID, and the file that stores the job identifiers is jobId. You can then submit jobs, check their status, retrieve their output, and so on, using the following gLite commands:
Credentials delegation
This command delegates your credentials (proxy) to the Workload Management System (WMS) and registers them under an identification name (automatically generated or chosen by you), so that subsequent invocations of glite-wms-job-submit and glite-wms-job-list-match can be given that delegation name instead of delegating a new proxy.
Syntax:
glite-wms-job-delegate-proxy -d dID
Options:
-a
Creates an automatic identification name
-d dID
Creates the identification name dID.
Note that if you use the -a option, you have to specify -a on every invocation of glite-wms-job-submit and glite-wms-job-list-match. Heavy use of this option is not recommended, however: it delegates a new proxy for each command issued, and delegation is a time-consuming operation, so it is better to delegate once with glite-wms-job-delegate-proxy and reuse that delegation.
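The advice above (delegate once, then reuse the name) can be sketched as a small shell helper. This is only an illustrative sketch: the function name delegate_and_reuse and the RUN variable are hypothetical conveniences and not part of gLite; only the three commands and the -d option come from this page. Setting RUN=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: delegate the proxy once under the name $1,
# then reuse that name to match and submit the JDL file $2.
# RUN is an assumption of this sketch: RUN=echo previews the commands.
delegate_and_reuse() {
  delegation_id=$1
  jdl_file=$2
  # One (slow) delegation ...
  ${RUN:-} glite-wms-job-delegate-proxy -d "$delegation_id" || return 1
  # ... reused by both later commands, so no new proxy is delegated.
  ${RUN:-} glite-wms-job-list-match -d "$delegation_id" "$jdl_file" || return 1
  ${RUN:-} glite-wms-job-submit -d "$delegation_id" "$jdl_file"
}
```

For example, after defining the function, running RUN=echo and then delegate_and_reuse dID script.jdl shows the three commands that would be issued, all sharing the same delegation name.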
Job matching
This command lists the Computing Elements (CEs) that match the requirements stated in our JDL file, i.e. the CEs to which the job could be sent.
Syntax:
glite-wms-job-list-match script.jdl
Options:
-a
Automatic delegation
-d dID
Uses a previous explicit delegation
Note that one of these two options must be used.
-o file
Stores the list of CEs in the specified file.
Job submission
This command submits jobs to the Grid. During submission, a unique job identifier is created for the job; it has to be used to check the job status or to retrieve the output. The format is https://Lbserver_address[:port]/unique_string (be aware that, although it looks like a URL, it is not a web site).
Syntax:
glite-wms-job-submit script.jdl
Options:
-a
Automatic delegation
-d dID
Uses an explicit delegation created previously.
One of these two options must be used.
-o jobId
Appends the job identifier to the file jobId (the file is created if it does not exist).
-r CE
Submits the job directly to a particular CE. With this option, the availability of the CE is not checked. The BrokerInfo is not created either.
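Since -o appends every new identifier to the same file, it can be handy to pull identifiers back out of that file. The helpers below are a minimal sketch with hypothetical names (list_job_ids, last_job_id). They assume that each identifier sits on its own line and starts with https://, as in the format given above, and that any other line in the file (for example a header) can be skipped.

```shell
#!/bin/sh
# Hypothetical helper: print the job identifiers stored in a file written by
# "glite-wms-job-submit -o <file>".
# Assumption: one identifier per line, each starting with "https://";
# every other line is ignored.
list_job_ids() {
  grep '^https://' "$1"
}

# Hypothetical helper: print the most recently appended identifier.
last_job_id() {
  list_job_ids "$1" | tail -n 1
}
```

With these in place, the status of the latest submission could be queried with glite-wms-job-status "$(last_job_id jobId)".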
Job status
This command gives information about the status of the job (or jobs).
Syntax:
glite-wms-job-status jobId1 ... jobIdN
Options:
-i jobId
Reads the file (or files) where the job identifiers are stored.
-o file
Stores the output in the specified file.
-v n
Sets the output level (0, 1 or 2)
The possible job statuses are:
| Status | Meaning |
| SUBMITTED | Job submitted by the user but not yet processed by the Network Server (NS). |
| WAITING | Job has been accepted by the NS but not yet processed by the WMS. The job is being matched to a CE. |
| READY | Job assigned to a particular CE and being sent to it. |
| SCHEDULED | Job is scheduled in the CE queue manager. |
| RUNNING | Job is running on a WN of the selected CE queue. |
| DONE | Job has finished without Grid errors. |
| ABORTED | Job has been aborted by the WMS (for example, because it ran too long for the queue it was in, or because the proxy expired). |
| CANCELLED | Job has been cancelled by the user. |
| CLEARED | The OutputSandbox has been transferred to the User Interface (UI). |
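When scripting around glite-wms-job-status, it helps to separate the statuses in which the job is still on its way (SUBMITTED, WAITING, READY, SCHEDULED, RUNNING) from those in which it will not run any further (DONE, ABORTED, CANCELLED, CLEARED). The following is a minimal sketch of that split. The function names are hypothetical, and current_status assumes that the command output contains a line of the form "Current Status: <status>"; check this against the output of your installation.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit code 0) if the given status means the job
# will not run any further, according to the status table above.
is_finished_status() {
  case "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')" in
    DONE*|ABORTED*|CANCELLED*|CLEARED*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: read glite-wms-job-status output on stdin and print
# the status of the first job found.
# Assumption: the output has a line such as "Current Status:   Running".
current_status() {
  sed -n 's/^Current Status:[[:space:]]*//p' | head -n 1
}
```

A polling loop could then call glite-wms-job-status, pipe the result through current_status, and stop as soon as is_finished_status succeeds.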
Job logging info
This command provides logging information about one or more submitted jobs.
Syntax:
glite-wms-job-logging-info jobId1 ... jobIdN
Options:
-i jobId
Reads the file (or files) where the job identifiers are stored.
-o file
Stores the output in the specified file.
-v n
Sets the output level (0, 1 or 2).
Example output:
**********************************************************************
LOGGING INFORMATION:
Printing info for the Job: https://lxshare0310.cern.ch:9000/C_CBUJKqc6Zqd4clQaCUTQ
- - -
Event: RegJob
- source = UserInterface
- timestamp = Fri Feb 20 10:30:16 2004
- - -
Event: Transfer
- destination = NetworkServer
- result = START
- source = UserInterface
- timestamp = Fri Feb 20 10:30:16 2004
- - -
Event: Transfer
- destination = NetworkServer
- result = OK
- source = UserInterface
- timestamp = Fri Feb 20 10:30:19 2004
- - -
Event: Accepted
- source = NetworkServer
- timestamp = Fri Feb 20 10:29:17 2004
- - -
Event: EnQueued
- result = OK
- source = NetworkServer
- timestamp = Fri Feb 20 10:29:18 2004
[...]
Job monitoring
Job monitoring lets you inspect the job output while the job itself is still running. To enable this feature, add the attributes PerusalFileEnable and PerusalTimeInterval to the JDL file before submitting the job. For example:
PerusalFileEnable = true;
PerusalTimeInterval = 120;
This makes the WN regularly upload a copy of the job output to the WMS (the time interval is set by PerusalTimeInterval, in seconds) once monitoring has been requested with the glite-wms-job-perusal command. To request the monitoring of a file, use:
Syntax:
glite-wms-job-perusal --set -f file_1 ... -f file_n -i jobId1 ... jobIdn
Options:
-f
Filename to be perused.
To retrieve the requested file, use:
Syntax:
glite-wms-job-perusal --get -f file ... -i jobId
Options:
--all
Retrieves the file in full. By default, only the portion written since the last --get is returned.
--dir directory
Stores the output in the specified directory. By default, /tmp is used.
Note: This feature has an impact on performance. It is intended for debugging your jobs and is not recommended in production, because the many transferred files can flood the WMS.
Once a job has been submitted, the files you want to monitor have to be requested with the option --set.
Job monitoring can be disabled with the option --unset.
Syntax:
glite-wms-job-perusal --unset jobId
The glite-wms-job-perusal command can also be used to check the final state of the files once the job has finished. If you want to use this "post-mortem" feature, enable the monitoring with glite-wms-job-perusal --set as usual, but postpone the file retrieval until the job has finished.
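The request and retrieval steps described in this section can be wrapped as follows. This is a sketch only: the function names and the RUN variable are hypothetical (RUN=echo prints the commands instead of running them), and the argument order simply follows the syntax lines given on this page, with -i pointing at the file that stores the job identifiers.

```shell
#!/bin/sh
# Hypothetical wrappers around the perusal commands described above.
# RUN is an assumption of this sketch: RUN=echo previews the commands.

# Ask the WMS to start monitoring file $1 for the job listed in file $2.
perusal_start() {
  ${RUN:-} glite-wms-job-perusal --set -f "$1" -i "$2"
}

# Fetch the part of file $1 written since the last fetch into directory $3
# (job identifier read from file $2).
perusal_fetch() {
  ${RUN:-} glite-wms-job-perusal --get -f "$1" --dir "$3" -i "$2"
}
```

For the "post-mortem" use, perusal_start would be called right after submission and perusal_fetch only once the job has finished.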
Job cancellation
This command cancels the job (or jobs) with the given job identifiers.
Syntax:
glite-wms-job-cancel jobId1 ... jobIdN
Options:
-i jobId
Reads the file (or files) where the job identifiers are stored.
Note: If the job has not yet been transferred to the CE (i.e. its status is WAITING or READY), the cancel request may be ignored, so the job can still run even though you see a "successful cancellation" message. In that case, simply repeat the cancel request once the job status is SCHEDULED or RUNNING.
Job output retrieval
This command retrieves the job output to the UI for finished jobs, i.e. jobs with status DONE. If the output is not retrieved, it is removed from the WMS about one week after the job ends.
Syntax:
glite-wms-job-output jobId1 ... jobIdN
Options:
-i jobId
Reads the file (or files) where the job identifiers are stored.
--dir directory
Stores the output in the specified directory. By default, /tmp is used.
Advanced job management
Job Collections
A job collection is a set of mutually independent jobs which, for some reason known to the user, need to be submitted, monitored and controlled as a single request. A good reason could be that the sub-jobs share input files: the WMProxy allows sandboxes to be shared and inherited, and optimizes network traffic by transferring a single copy of each file even when it is used by several sub-jobs.
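As an illustration of sandbox sharing, the shell function below writes a small collection JDL to a file. Treat it as a sketch under assumptions rather than a reference: the function name, file names and executables are placeholders, and the attributes used (Type = "Collection", the nodes list, and root.InputSandbox to inherit the collection-level sandbox) should be checked against the JDL specification of your WMS version.

```shell
#!/bin/sh
# Hypothetical helper: write an example collection JDL to the path given as $1.
# The input file is declared once at collection level, and each node is assumed
# to inherit it through "root.InputSandbox".
# All file names and executables are placeholders.
write_collection_jdl() {
  cat > "$1" <<'EOF'
[
  Type = "Collection";
  InputSandbox = {"common_input.dat"};
  nodes = {
    [
      Executable = "/bin/cat";
      Arguments = "common_input.dat";
      StdOutput = "node1.out";
      OutputSandbox = {"node1.out"};
      InputSandbox = root.InputSandbox;
    ],
    [
      Executable = "/usr/bin/wc";
      Arguments = "-l common_input.dat";
      StdOutput = "node2.out";
      OutputSandbox = {"node2.out"};
      InputSandbox = root.InputSandbox;
    ]
  };
]
EOF
}
```

The resulting file would be submitted like any other JDL, for example with glite-wms-job-submit -d dID -o jobId collection.jdl after calling write_collection_jdl collection.jdl.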
DAG jobs
A DAG (directed acyclic graph) job represents a set of jobs where the input, the output or the execution of one or more jobs depends on one or more other jobs. The jobs are the nodes (vertices) of the graph and the edges (arcs) represent the dependencies.
Parametric jobs
A parametric job causes a set of jobs to be generated from one JDL file. This is invaluable when many similar (but not identical) jobs must be run. It is achieved by giving the job one or more parametric attributes in the JDL. These attributes are identified by the keyword _PARAM_ in their value; the keyword is replaced by the actual value of Parameters during the JDL expansion. The JobType in the JDL is Parametric.
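The substitution described above can be mimicked locally to see what each generated job would look like. The snippet below is purely illustrative: the real expansion is performed by the WMS, the function name expand_param is hypothetical, and the keyword is assumed to be written _PARAM_ in the JDL.

```shell
#!/bin/sh
# Illustration only: replace every _PARAM_ in a JDL line ($1) with one
# parameter value ($2), the way the JDL expansion is described above.
# The value must not contain "/" or "&", which are special to sed here.
expand_param() {
  printf '%s\n' "$1" | sed "s/_PARAM_/$2/g"
}

# A parametric attribute as it might appear in the JDL (placeholder names).
template='Arguments = "input_PARAM_.dat";'

# Print one expanded line per parameter value.
for value in 1 2 3; do
  expand_param "$template" "$value"
done
# Prints:
#   Arguments = "input1.dat";
#   Arguments = "input2.dat";
#   Arguments = "input3.dat";
```

Each printed line corresponds to the attribute as one of the generated jobs would see it.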
--
CarlosEscobar - 18 Jun 2010