r10 - 06 Jul 2010 - 18:41:21 - CarlosEscobar

How to use the GRID-CSIC resources

Once you have your digital certificate and belong to a Virtual Organization (VO) supported by the GRID-CSIC, you can use all the resources that this infrastructure provides. This twiki contains some examples that can be used as tutorials to start using the GRID-CSIC successfully. The GRID terminology is introduced along the way.

To submit jobs to the GRID-CSIC Worker Nodes (WNs) from IFIC (or elsewhere), you must be logged in to one of the User Interfaces (UIs) via ssh. The UI list is available here. You can find more information about the GRID-CSIC infrastructure in the following twiki.

Preliminary knowledge

Job Description Language (JDL)

Grid jobs are defined in text files written in a specific language called Job Description Language (JDL). JDL is based on the ClassAd language, and its syntax basically consists of statements terminated by a semicolon, like: attribute = value;

The language is sensitive to blank characters and tabs. In particular, neither blank characters nor tabs may follow the semicolon at the end of a line. Literal strings are enclosed in double quotes. To include a double quote inside a string, escape it with a backslash (e.g. Arguments = "\"run\" 10"). The same applies to special characters: \& (for &) or \\\& (for \&). Single quotes (') are not allowed. Comments are introduced by # or //, or enclosed between /* and */.
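The syntax rules above can be checked mechanically before submitting a job. The following is a minimal Python sketch, not part of the gLite tools: the helper names check_jdl_line and quote_jdl_string are hypothetical, and the check is deliberately naive (for instance, it also flags single quotes that appear inside comments).

```python
import re


def check_jdl_line(line):
    """Return a list of problems found in one JDL line (empty if none)."""
    problems = []
    # Neither blanks nor tabs may follow the terminating semicolon.
    if re.search(r";[ \t]+$", line):
        problems.append("whitespace after semicolon")
    # Single quotes are not allowed.
    if "'" in line:
        problems.append("single quote")
    return problems


def quote_jdl_string(text):
    """Wrap text in double quotes, escaping embedded double quotes."""
    return '"' + text.replace('"', '\\"') + '"'


print(check_jdl_line('Executable = "test.sh"; '))
print("Arguments = " + quote_jdl_string('"run" 10') + ";")
```

The second call reproduces the escaped Arguments example given in the text.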

Attributes

The most used attributes that one can define in a JDL file are:

| Attribute | Mandatory? | Meaning | Example |
| Executable | Yes | Specify executable | Executable = "test.sh"; |
| Arguments | No | Supply arguments to the executable | Arguments = "run 10"; |
| StdOutput | Yes | Specify standard output | StdOutput = "std.out"; |
| StdError | Yes | Specify standard error | StdError = "std.err"; |
| StdInput | No | Specify standard input | StdInput = "std.in"; |
| InputSandbox | No | Transfer input files from UI to WN (through WMS) | InputSandbox = {"test.sh","std.in"}; |
| OutputSandbox | No | Transfer output files from WN to UI (through WMS) | OutputSandbox = {"std.out","std.err"}; |
| Environment | No | Extend the environment | Environment = {"CMS_PATH=$HOME/cms","CMS_DB=$CMS_PATH/cmdb"}; |
| Requirements | No | Imposing Constraints on the CE | Requirements = other.GlueCEInfoLRMSType == "PBS"; |
| Rank | No | Apply a weight to select CE | Rank = other.GlueCEStateFreeCPUs; |
| PerusalFileEnable | No | Enable job perusal | PerusalFileEnable = true; |
| PerusalTimeInterval | No | Specify the frequency that specified files are copied to WMS machine (in seconds) | PerusalTimeInterval = 30; |
| VirtualOrganisation | No | Define the Virtual Organization (VO) | VirtualOrganisation = "ific"; |
| ShallowRetryCount | No | In case of Grid error, retry the job X times | ShallowRetryCount = 3; |

For more information, see the Job Description Language (JDL) Attribute document.

Comments

  • StdError
    • Can be the same as StdOutput.
  • InputSandbox
    • Files in the list are transferred from the UI to the WN through the WMS.
    • Wildcards allowed (*).
    • File paths are relative to the current directory.
    • The executable flag is not preserved for the files in the InputSandbox when they are transferred to the WN. Execution permissions must be restored (chmod +x) by the initial script specified as the Executable in the JDL file (the chmod +x operation is done automatically for this script).
    • The InputSandbox cannot contain two files with the same name (even if in different paths) as when transferred they would overwrite each other.
  • OutputSandbox
    • Files in the list are transferred from the WN to the UI through the WMS.
    • No absolute file names.
    • The OutputSandbox cannot contain two files with the same name (even if in different paths) as when transferred they would overwrite each other.
    • At IFIC, there is a limit of 50 MB on the size of files declared in the OutputSandbox. If a file is larger than this limit, it is truncated and only its tail is kept. For text files no problems are expected, but zip files will be corrupted. If you have files larger than the limit, please store them in the SE.
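Two of the sandbox rules above (no duplicate file names, no absolute paths in the OutputSandbox) are easy to verify before submission. The sketch below is only an illustration: check_sandbox is a hypothetical helper, it does not expand wildcards, and it does not check the 50 MB limit, since output file sizes are not known on the UI beforehand.

```python
import posixpath


def check_sandbox(files, is_output=False):
    """Return a list of problems in an InputSandbox/OutputSandbox file list."""
    problems = []
    seen = {}  # base name -> first path seen with that name
    for path in files:
        name = posixpath.basename(path)
        # Two files with the same name would overwrite each other.
        if name in seen:
            problems.append(f"duplicate file name: {seen[name]} and {path}")
        else:
            seen[name] = path
        # Absolute file names are not allowed in the OutputSandbox.
        if is_output and posixpath.isabs(path):
            problems.append(f"absolute path in OutputSandbox: {path}")
    return problems


print(check_sandbox(["a/run.sh", "b/run.sh"]))
print(check_sandbox(["std.out", "/tmp/std.err"], is_output=True))
```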

  • Requirements. The Requirements attribute can be used to express any kind of constraint on the resources where the job can run. Its value is a Boolean expression that must evaluate to true for a job to run on a specific CE. The attributes used in the expression are based on the GLUE Schema.
    • Forming expressions. For example, to force a job to only run on a particular CE:
         Requirements = other.GlueCEUniqueID == "ce02.ific.uv.es:2119/jobmanager-pbs-infinibandShort";
         
      The other. prefix is used to indicate that the GlueCEUniqueID attribute refers to the CE characteristics (its ID in particular) and not to those of the job. If other. is not specified, then the default self. is assumed, indicating that the attribute refers to the job characteristics description.
      Requirements can be ANDed together:
         Requirements =  other.GlueCEInfoHostName == "ce02.ific.uv.es" && other.GlueCEStateFreeCPUs > 10;
         
      which ANDs in the requirement that there are more than 10 free CPUs on this specific CE. By default the system always ANDs in the following requirement:
         Requirements = other.GlueCEStateStatus == "Production" ;
         
      A requirement can be negated:
         Requirements = !(other.GlueCEInfoTotalCPUs < 10);
         
    • Functions
      • Member. One essential requirement for production work is that the machine has the appropriate operating system (OS) and/or software installed. The attribute we need to test in this case is GlueHostOperatingSystemName but there is a complication: it is a list and all we require is that our OS is on the list. This is done with:
           Requirements = Member("ScientificSL",other.GlueHostOperatingSystemName);
           
        The Member function is satisfied if its first argument is a member of its second argument (a list). Functions can also be ANDed:
            Requirements = Member("ScientificSL",other.GlueHostOperatingSystemName)
            && Member("SL",other.GlueHostOperatingSystemVersion)
            && Member("5.3",other.GlueHostOperatingSystemRelease);
           
      • RegExp. Another function, RegExp, can be used to test whether an attribute value matches a supplied regular expression, for example:
           Requirements = RegExp("ce02.ific.uv.es", other.GlueCEInfoHostName);
           
    • Gangmatching.
      The previous requirements always involved two entities: the job and the CE. To specify requirements involving three entities (i.e. the job, the CE and an SE), the RB uses a special match-making mechanism called gangmatching. This is supported by some JDL functions: anyMatch, whichMatch, allMatch. For example, to ensure that the job runs on a CE with more than 200 MB of free disk space on a close SE, the following JDL expression can be used:
         Requirements = anyMatch(other.storage.CloseSEs,target.GlueSAStateAvailableSpace > 204800);
         

  • Rank. Among all the CEs satisfying the requirements, the one where the job is executed is chosen according to its rank, a quantity expressed as a floating-point number. The CE with the highest rank is selected. The user can define the rank with the Rank attribute as a function of the CE attributes. The default definition takes into account the number of free CPUs:
       Rank = other.GlueCEStateFreeCPUs;
       
    But other definitions are possible. The next one is a more complex expression:
       Rank = ( other.GlueCEStateWaitingJobs == 0 ? other.GlueCEStateFreeCPUs : -other.GlueCEStateWaitingJobs);
       
    In this case, the number of waiting jobs in a CE is used if this number is not zero. The minus sign is used so that the rank decreases as the number of waiting jobs increases. If there are no waiting jobs, the number of free CPUs is used.
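How Requirements and Rank work together can be illustrated with a small Python sketch. It is not the actual WMS match-making code: the CE records and host names are invented for illustration, select_ce is a hypothetical helper, and the two functions only mirror the JDL expressions discussed above (the default Production check combined with Member, and the waiting-jobs Rank expression).

```python
def requirements(ce):
    # Mirrors: other.GlueCEStateStatus == "Production"
    #          && Member("ScientificSL", other.GlueHostOperatingSystemName)
    return (ce["GlueCEStateStatus"] == "Production"
            and "ScientificSL" in ce["GlueHostOperatingSystemName"])


def rank(ce):
    # Mirrors: WaitingJobs == 0 ? FreeCPUs : -WaitingJobs
    waiting = ce["GlueCEStateWaitingJobs"]
    return ce["GlueCEStateFreeCPUs"] if waiting == 0 else -waiting


def select_ce(ces, requirements, rank):
    """Return the highest-ranked CE among those meeting the requirements."""
    candidates = [ce for ce in ces if requirements(ce)]
    return max(candidates, key=rank) if candidates else None


# Hypothetical CE descriptions (values invented for illustration).
ces = [
    {"GlueCEUniqueID": "ce-a.example.org", "GlueCEStateStatus": "Production",
     "GlueHostOperatingSystemName": ["ScientificSL"],
     "GlueCEStateFreeCPUs": 4, "GlueCEStateWaitingJobs": 0},
    {"GlueCEUniqueID": "ce-b.example.org", "GlueCEStateStatus": "Production",
     "GlueHostOperatingSystemName": ["ScientificSL"],
     "GlueCEStateFreeCPUs": 50, "GlueCEStateWaitingJobs": 12},
    {"GlueCEUniqueID": "ce-c.example.org", "GlueCEStateStatus": "Draining",
     "GlueHostOperatingSystemName": ["ScientificSL"],
     "GlueCEStateFreeCPUs": 100, "GlueCEStateWaitingJobs": 0},
]

best = select_ce(ces, requirements, rank)
print(best["GlueCEUniqueID"])
```

Here the third CE is discarded by the requirements, the second gets a negative rank because of its waiting jobs, and the first is selected even though it has fewer free CPUs.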

Example

As a simple example, a JDL file to say "Hello World" would be:

Executable = "/bin/echo";
Arguments = "Hello World";
StdOutput = "stdout.log";
StdError = "stderr.log";
OutputSandbox = {"stdout.log","stderr.log"};
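When many similar jobs have to be described, a JDL file like the one above can be generated from a script. The following Python sketch assumes a simple mapping from attribute names to strings, numbers, booleans and lists; jdl_value and render_jdl are hypothetical helpers, and expression-valued attributes such as Requirements or Rank are not handled, since they would be quoted as plain strings.

```python
def jdl_value(value):
    """Format a Python value as a JDL literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(jdl_value(v) for v in value) + "}"
    # Strings: double-quoted, with embedded double quotes escaped.
    return '"' + str(value).replace('"', '\\"') + '"'


def render_jdl(attributes):
    """Render one 'Attribute = value;' statement per line, no trailing blanks."""
    lines = [f"{name} = {jdl_value(value)};" for name, value in attributes.items()]
    return "\n".join(lines) + "\n"


hello = {
    "Executable": "/bin/echo",
    "Arguments": "Hello World",
    "StdOutput": "stdout.log",
    "StdError": "stderr.log",
    "OutputSandbox": ["stdout.log", "stderr.log"],
}

print(render_jdl(hello), end="")
```

The printed text is the same "Hello World" description shown above and can be written to a file for submission.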

Authentication and authorization

Once you are logged in to a UI, you have to load your certificate and register with the VO (for example the ific VO, though you can use any supported VO) through the command:

 voms-proxy-init -voms ific

Proxy validity

By default, the proxy is valid for 12 hours. To extend it up to 24 hours, you can use the command:

 voms-proxy-init -voms ific -valid 24:00
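If proxy creation is driven from a script, the command line can be assembled programmatically. This is a minimal Python sketch using only the two options shown above (-voms and -valid); proxy_init_command is a hypothetical helper, and the sketch only builds the argument list without running it.

```python
def proxy_init_command(vo, hours=None):
    """Build the voms-proxy-init argument list for a VO and optional validity."""
    cmd = ["voms-proxy-init", "-voms", vo]
    if hours is not None:
        # Validity is given as hours:minutes, e.g. 24:00.
        cmd += ["-valid", f"{hours}:00"]
    return cmd


print(" ".join(proxy_init_command("ific", 24)))
# On a UI, the list could be passed to subprocess.run(cmd, check=True).
```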

To extend it even further (7 days by default), the following commands can be used:

 voms-proxy-init -voms ific
 myproxy-init -d -n -s lcg2proxy.ific.uv.es

Then you have to add the attribute MyProxyServer="lcg2proxy.ific.uv.es" to the corresponding JDL file.

Environment variables

Finally, you may need to set some environment variables (such as the data catalog):

 export LFC_HOST=lfc02.ific.uv.es
 export LFC_HOME=/grid/ific/

Middleware gLite

gLite is the middleware currently used in Grid computing. Developed by an international collaboration within the EGEE project, gLite provides a solid framework for developing applications that benefit from distributed computing and storage.

Accordingly, the commands to submit, check and retrieve jobs (and many other operations) on the Grid start with glite-wms-job-*. A summary of the most important commands is presented here, though more advanced commands are also available. All of them are discussed in the Workload Management section of the gLite User Guide.

Job Managing with gLite

More information: Job Managing

Data Managing with gLite

More information: Data Managing

Exercises

-- CarlosEscobar - 14 Jun 2010
