How to use the GRID-CSIC resources
Once you have your digital certificate and you belong to a Virtual Organization (VO) supported by the GRID-CSIC, you can use all the resources that this infrastructure provides. In this twiki, you will find some examples that can be used as tutorials to start using the GRID-CSIC successfully. The GRID terminology is also introduced along the way.
In order to submit jobs to the GRID-CSIC Worker Nodes (WN) at IFIC, you have to be logged in to one of the User Interfaces (UI) using ssh. The UI list is available here. You can find more information about the GRID-CSIC infrastructure in the following twiki.
Preliminary knowledge
Job Description Language (JDL)
Grid jobs are defined through text files written in a specific language called Job Description Language (JDL). The JDL is based on the classAd language and its syntax basically consists of statements ended by a semicolon, like:
attribute = value;
The language is sensitive to blank characters and tabs. It is important to take into account that neither blank characters nor tabs should follow the semicolon at the end of a line (i.e. the line must end exactly with ;). Literal strings are enclosed in double quotes. To include double quotes inside a string, a backslash must be used (e.g. Arguments = "\"run\" 10"). The same applies to special characters: \& (for &) or \\\& (for \&). Single quotes (') are not allowed. Comments must be preceded by # or //, or enclosed between /* and */.
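The whitespace and quoting rules above are easy to break by accident, so a quick check before submission can help. The following is only a minimal sketch: check_jdl and sample.jdl are hypothetical names introduced here, and the check is deliberately naive (it also flags single quotes that appear inside comments).

```shell
#!/bin/sh
# Hypothetical helper: report JDL lines that break the whitespace or quoting rules.
check_jdl() {
    jdl_file="$1"
    rc=0
    # Blank or tab characters between the final semicolon and the end of the line.
    if grep -n ';[[:blank:]][[:blank:]]*$' "$jdl_file"; then
        echo "error: blanks after ';' in $jdl_file" >&2
        rc=1
    fi
    # Single quotes are not allowed (naive check: comments are not skipped).
    if grep -n "'" "$jdl_file"; then
        echo "error: single quote found in $jdl_file" >&2
        rc=1
    fi
    return $rc
}

# Example: the second line ends with a blank after the semicolon.
printf 'Executable = "/bin/echo";\nArguments = "Hello World"; \n' > sample.jdl
check_jdl sample.jdl || echo "sample.jdl needs fixing"
```

The function prints each offending line with its line number and returns a non-zero status, so it can be chained in front of a submission command.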
Attributes
The most used attributes that one can define in a JDL file are:
Attribute | Mandatory? | Meaning | Example |
InputSandbox | No | Transfer input files from UI to WN | InputSandbox = {"test.sh","std.in"}; |
OutputSandbox | No | Transfer output files from WN to UI | OutputSandbox = {"std.out","std.err"}; |
PerusalFileEnable | No | Enable job perusal | PerusalFileEnable = true; |
PerusalTimeInterval | No | Specify how often (in seconds) the specified files are copied to the WMS machine | PerusalTimeInterval = 30; |
ShallowRetryCount | No | In case of Grid error, retry the job X times | ShallowRetryCount = 3; |
StdError | Yes | Specify standard error | StdError = "std.err"; |
StdInput | No | Specify standard input | StdInput = "std.in"; |
StdOutput | Yes | Specify standard output | StdOutput = "std.out"; |
VirtualOrganisation | No | Define the Virtual Organization (VO) | VirtualOrganisation = "ific"; |
Arguments | No | Supply arguments to the executable | Arguments = "run 10"; |
Environment | No | Extend the environment | Environment = {"CMS_PATH=$HOME/cms","CMS_DB=$CMS_PATH/cmdb"}; |
Executable | Yes | Specify executable | Executable = "test.sh"; |
Rank | No | Apply a weight to select CE | Rank = other.GlueCEStateFreeCPUs; |
Requirements | No | Impose constraints on the CE | Requirements = other.GlueCEInfoLRMSType == "PBS"; |
For more information, take a look at the Job Description Language (JDL) Attribute document.
Comments
- StdError
- Can be the same as StdOutput.
- InputSandbox
- Wildcards allowed (*).
- Files are relative to current directory.
- The executable flag is not preserved for the files included in the InputSandbox when they are transferred to the WN. The execution permissions should be restored (chmod +x) by the initial script specified as the Executable in the JDL file (the chmod +x operation is done automatically for this script).
- The InputSandbox cannot contain two files with the same name (even if in different paths) as when transferred they would overwrite each other.
- OutputSandbox
- No absolute file names.
- The OutputSandbox cannot contain two files with the same name (even if in different paths) as when transferred they would overwrite each other.
- At IFIC, there is a limit of 50 MB on the size of files declared in the OutputSandbox. If a file is larger than this limit, a tail is performed (the file is truncated). For text files no problems are expected, but zip files will be corrupted. If you have files larger than this limit, please store them in the SE.
- Requirements. The Requirements attribute can be used to express any kind of constraint on the resources where the job can run. Its value is a Boolean expression that must evaluate to true for a job to run on that specific CE. The attributes used in the expression are based on the GLUE Schema.
- Forming expressions. For example, to force a job to only run on a particular CE:
Requirements = other.GlueCEUniqueID == "ce02.ific.uv.es:2119/jobmanager-pbs-infinibandShort";
The other. prefix is used to indicate that the GlueCEUniqueID attribute refers to the CE characteristics (its ID in particular) and not to those of the job. If other. is not specified, then the default self. is assumed, indicating that the attribute refers to the job characteristics description.
Requirements can be ANDed together:
Requirements = other.GlueCEInfoHostName == "ce02.ific.uv.es" && other.GlueCEStateFreeCPUs > 10;
which ANDs in the requirement that there are more than 10 free CPUs on this specific CE. By default, the system always ANDs in this additional requirement:
Requirements = other.GlueCEStateStatus == "Production" ;
A requirement can be negated:
Requirements = !(other.GlueCEInfoTotalCPUs < 10);
- Functions
- Member. One essential requirement for production work is that the machine has the appropriate operating system (OS) and/or software installed. The attribute we need to test in this case is GlueHostOperatingSystemName but there is a complication: it is a list and all we require is that our OS is on the list. This is done with:
Requirements = Member("ScientificSL",other.GlueHostOperatingSystemName);
The Member function is satisfied if the first argument is a member of its second argument (a list). Functions can also be ANDed:
Requirements = Member("ScientificSL",other.GlueHostOperatingSystemName)
&& Member("SL",other.GlueHostOperatingSystemVersion)
&& Member("5.3",other.GlueHostOperatingSystemRelease);
- RegExp. Another function, RegExp, can be used to check whether a supplied string matches a regular expression, for example:
Requirements = RegExp("ce02.ific.uv.es", other.GlueCEInfoHostName);
- Gangmatching.
The previous requirements always involved two entities: the job and the CE. In order to specify requirements involving three entities (i.e., the job, the CE and a SE), the RB uses a special match-making mechanism called gangmatching. This is supported by some JDL functions: anyMatch, whichMatch, allMatch. For example, to ensure that the job runs on a CE with at least 200 MB of free disk space on a close SE, the following JDL expression can be used:
Requirements = anyMatch(other.storage.CloseSEs,target.GlueSAStateAvailableSpace > 204800);
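The value 204800 in the expression above corresponds to 200 MB if GlueSAStateAvailableSpace is expressed in kilobytes, which is an assumption made here from the numbers in the text. Under that assumption, the threshold for a different size can be computed as in this small sketch (min_free_mb, min_free_kb and requirement are illustrative variable names):

```shell
#!/bin/sh
# Assumption: GlueSAStateAvailableSpace is published in KB, so 200 MB = 200 * 1024 KB.
min_free_mb=200
min_free_kb=$((min_free_mb * 1024))

# Build the Requirements line with the computed threshold.
requirement="Requirements = anyMatch(other.storage.CloseSEs,target.GlueSAStateAvailableSpace > ${min_free_kb});"
echo "$requirement"
```

Changing min_free_mb is enough to generate the expression for another amount of free space.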
Example
As a simple example, a JDL file to say "Hello World" would be:
Executable = "/bin/echo";
Arguments = "Hello World";
StdOutput = "stdout.log";
StdError = "stderr.log";
OutputSandbox = {"stdout.log","stderr.log"};
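A slightly larger example can illustrate the InputSandbox remarks from the Comments section, in particular that the executable flag is lost for auxiliary files. The sketch below is only illustrative: helper.sh and wrapper.jdl are hypothetical file names, and the last two lines merely imitate on the local machine what would happen on the WN.

```shell
#!/bin/sh
# helper.sh: a hypothetical auxiliary script shipped in the InputSandbox.
cat > helper.sh <<'EOF'
#!/bin/sh
echo "helper running with argument: $1"
EOF

# test.sh: the Executable; it restores the executable flag on the helper.
cat > test.sh <<'EOF'
#!/bin/sh
chmod +x ./helper.sh
./helper.sh "$1"
EOF

# wrapper.jdl: ships both scripts and retrieves stdout and stderr.
cat > wrapper.jdl <<'EOF'
Executable = "test.sh";
Arguments = "10";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"test.sh","helper.sh"};
OutputSandbox = {"std.out","std.err"};
EOF

# Local dry run imitating the WN: helper.sh arrives without the executable flag.
chmod -x helper.sh
sh test.sh 10
```

Keeping the chmod +x call inside the Executable script means the job does not depend on how the files were transferred.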
Authentication and authorization
Once you are logged in to a UI, you have to load your certificate and proceed with the VO registration (for example, with the ific VO, though you can use any supported VO) through the command:
voms-proxy-init -voms ific
Proxy validity
By default, the proxy is valid for 12 hours. To extend it up to 24 hours, you can use the command:
voms-proxy-init -voms ific -valid 24:00
To extend it even further (7 days by default), the following commands can be used:
voms-proxy-init -voms ific
myproxy-init -d -n -s lcg2proxy.ific.uv.es
Then, you have to add the attribute MyProxyServer = "lcg2proxy.ific.uv.es"; to the corresponding JDL file.
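Since jobs fail when the proxy expires, it can be useful to check the remaining lifetime before submitting. The sketch below assumes that voms-proxy-info --timeleft prints the remaining lifetime in seconds; has_enough_time is a hypothetical helper and the one-hour threshold is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 (seconds left) is a number >= $2 (minimum).
has_enough_time() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as no valid proxy
    esac
    [ "$1" -ge "$2" ]
}

if command -v voms-proxy-info >/dev/null 2>&1; then
    # Remaining proxy lifetime in seconds (empty if there is no proxy).
    seconds_left=$(voms-proxy-info --timeleft 2>/dev/null) || true
    if has_enough_time "$seconds_left" 3600; then
        echo "proxy valid for at least one hour"
    else
        echo "proxy missing or about to expire; run: voms-proxy-init -voms ific"
    fi
else
    echo "voms-proxy-info not found; are you logged in to a UI?"
fi
```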
Environment variables
To conclude, you may need to set some environment variables (such as the data catalog):
export LFC_HOST=lfc02.ific.uv.es
export LFC_HOME=/grid/ific/
Middleware gLite
gLite is the middleware used nowadays in Grid Computing. Developed by an international collaboration within the EGEE project, gLite provides a solid framework to develop applications which benefit from distributed computing and storage.
In that sense, the commands to submit, retrieve and check jobs (and many others) à la Grid start with glite-wms-job-*. Here, a summary of the most important commands is presented, though you can also use advanced commands. All of them are discussed in the Workload Management section of the gLite User Guide.
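As a rough sketch of how these commands fit together, the following script writes the "Hello World" JDL from the Example section and, only if the gLite commands are available, runs a typical cycle. The file names hello.jdl, jobid.txt and hello_output are arbitrary choices made here; the options shown (-a, -o, -i, --dir) should be checked against the gLite User Guide for your installation.

```shell
#!/bin/sh
# Write the "Hello World" job description from the Example section.
cat > hello.jdl <<'EOF'
Executable = "/bin/echo";
Arguments = "Hello World";
StdOutput = "stdout.log";
StdError = "stderr.log";
OutputSandbox = {"stdout.log","stderr.log"};
EOF

if command -v glite-wms-job-submit >/dev/null 2>&1; then
    # List the CEs matching the job, using automatic proxy delegation (-a).
    glite-wms-job-list-match -a hello.jdl
    # Submit the job and save its identifier to jobid.txt (-o).
    glite-wms-job-submit -a -o jobid.txt hello.jdl
    # Query the job status using the saved identifier (-i).
    glite-wms-job-status -i jobid.txt
    # Output retrieval is left as a manual step once the status is Done.
    echo "when the job is Done: glite-wms-job-output -i jobid.txt --dir ./hello_output"
else
    echo "gLite WMS commands not found; hello.jdl written but not submitted"
fi
```

Saving the job identifier to a file avoids copying the long job URL by hand for the status and output commands.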
Job Managing with gLite
More information:
Job Managing
Data Managing with gLite
More information:
Data Managing
Exercises
--
CarlosEscobar - 14 Jun 2010