
ATLAS Data Processing @ IFIC




I.- Using the Computing and Grid Resources at IFIC

I.1.- Getting an account at GOG-IFIC

After getting your AFS account, you may want to register for a GOG (Grupo de Ordenadores para Grid) account to access the computing resources of the IFIC GOG-Farm and the Grid. To do this, first read the GOG usage rules, then fill in the application form, sign it, and hand it to the IFIC secretariat. Once your application is accepted, you will be given access to a Grid User Interface, where you can log in with your AFS account and from there access the Grid resources.


I.2.- Getting your Personal Certificate for the GOG-IFIC

The GOG-Farm at IFIC only allows Grid access, so you need a Digital Certificate (sometimes called a PKI or X.509 certificate) that acts as a passport and establishes who you are (known as Authentication). To obtain your Personal Certificate, complete the following steps :

1. Make sure that you have already filled in the application form to access the GOG-Farm and that it has been signed by the Project Manager and the Director of IFIC.

2. Go to this page

(pkIRISGrid CA web)

3. Look for the section "Solicitud de certificado / CSR de Usuario" (user certificate request / CSR) and select the browser you are using. Your certificate will be generated and saved in the browser, so do not change browsers during the whole process of requesting and retrieving the certificate.

4. On that page :
* As 'Identificador IRISGrid', use if possible the name parts of your IFIC e-mail address separated by a "." (dot) character (e.g. Javier.Sanchez).
* In 'PIN para certificado' (certificate PIN), write a key for future access to the system.
* In 'Clave de Usuario' (user key), write a word that you can tell the RA in person, as an additional check of your identity.

5. Once you have filled in the web form on the pkIRISGrid site, you have to prove your identity to the authority.
* Take the printed, filled-in and signed application form titled "SOLICITUD DE CERTIFICADO DE USUARIO PKIRISGRID. VERIFICACION DE IDENTIDAD" (pkIRISGrid user certificate request, identity verification) to the IFIC-RA. You should hand over a photocopy of your identity card (DNI) or passport and show the original during the identity check.
* IFIC-RA: Javier Sanchez, room 008, Experimental building of IFIC.
* IMPORTANT: This identity verification has to be done in person by the applicant.

6. A few days later you can download your certificate into your browser from this page. Remember that you have to use the same browser that you used for the application.


I.3.- Installing your Personal Certificate in your computer

Once you have obtained your Personal Certificate and want to use it with globus to access the Grid computing resources, you have to install it on your computer in the specific directory ~/.globus. Your certificate consists of two parts: a public key and a private key. It is very important to save the private key with adequate permissions so that other people cannot access it. Remember that, for extra security, it is encrypted with the AFS password that you had when you applied for it. To install your certificate on your computer, follow these instructions :

1- Back up your certificate from your browser to a temporary directory, say ~myusername/temp/, as follows (this example uses Mozilla Firefox) :

* Select in your browser Edit -> Preferences -> Advanced -> View Certificates.
* Select your new certificate and click on Backup.
* Save your certificate under a name of your choice (for example "MyCertificate") into a directory of your choice, for example ~myusername/temp/. You will be asked for the password of your certificate.

2- Once you have your certificate "MyCertificate.p12" in PKCS#12 (.p12) format in your ~myusername/temp/ directory, log into a User Interface machine and execute the following script from the ~myusername/temp/ directory (you can do it on your PC if you have AFS as well), then follow the instructions (note: type MyCertificate without the .p12 extension) :

myhost:~/temp> ~sanchezj/public/p12toglobus.sh MyCertificate

This will OVERWRITE any files existing in your ~/.globus directory.
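If for some reason you cannot run the script, an equivalent conversion can be done by hand with openssl. This is only a minimal sketch of the usual steps (the output file names follow the Globus conventions; the actual p12toglobus.sh script may differ in its details) :

$> mkdir -p ~/.globus
$> openssl pkcs12 -in MyCertificate.p12 -clcerts -nokeys -out ~/.globus/usercert.pem   # extract the public certificate
$> openssl pkcs12 -in MyCertificate.p12 -nocerts -out ~/.globus/userkey.pem            # extract the (still encrypted) private key
$> chmod 444 ~/.globus/usercert.pem   # the certificate may be world-readable
$> chmod 400 ~/.globus/userkey.pem    # the private key must be readable only by you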

3- Make sure everything completed correctly, then back up your MyCertificate.p12 file in a safe place and delete it from the ~myusername/temp/ directory.

If you need more help, please visit the pkIRISGrid help page.


I.4.- Renewing your Personal Certificate

IMPORTANT: Note that you can renew your grid certificate only during the last month before its expiration. If your certificate expires before you have renewed it, you have to request a new certificate, following the same procedure as the first time.

To renew your Personal Certificate, proceed as follows:
* Take the printed, filled-in and signed application form titled "SOLICITUD DE CERTIFICADO DE USUARIO PKIRISGRID. VERIFICACION DE IDENTIDAD" to the IFIC-RA. You should hand over a photocopy of your identity card (DNI) or passport and show the original during the identity check.
* IFIC-RA: Javier Sanchez, room 008, Experimental building of IFIC.
* IMPORTANT: This identity verification has to be done in person by the applicant.
* Go to the IRISGrid page. In the section "Usuario", look for the link "Renovar Certificado", click on it and follow the instructions. If you need help click on the link "Ayuda".

Note: Your certificate will be generated and saved in the browser, so do not change browsers during the whole process of requesting and retrieving the certificate.

A few days later you can download your certificate from the following IRISGrid page: enter your "Identificador IRISGrid" to download the certificate into your browser. Remember that you have to use the same browser that you used for the renewal.

To install your certificate in your ~/.globus directory, complete the instructions described above, in section I.3.


I.5.- Joining the Atlas Virtual Organization

To use the Grid certificate, you need to join a Virtual Organisation (VO). Being a member of a VO is like having visas in your passport that say what you are allowed to do (known as Authorisation). To join the ATLAS Virtual Organisation in the European zone, follow the ATLAS VO Registration procedure. The nickname and the e-mail in your VO registration MUST be the same as your CERN account and the e-mail associated with it. When asked, select only the following groups and NO ROLES (unless you really know what you are asking for):

* /atlas
* /atlas/lcg1

These groups are enough for full data access and job handling within the ATLAS VO as a normal user.

IMPORTANT NOTE: User datasets are required to begin with "user.nickname", where nickname is the nickname attribute in the atlas VO extension of your grid certificate. Please verify that you have a nickname set for your certificate. If you do not have a nickname attribute, you need to get one as follows:

* visit the ATLAS VO Registration page (using a web browser that has your grid certificate installed)
* open the left items list of that page and go to "Edit Personal Info"
* verify that the nickname check box is ticked, then click the Search button
* add or modify your nickname (must be the same as your CERN account) and click the Submit button

You will be notified by e-mail about your certificate attribute changes.
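From a User Interface, once your VO membership is active, you can also check the attributes carried by your proxy, including the nickname. A quick sketch (the exact output format depends on the installed voms-proxy version) :

$> voms-proxy-init -voms atlas
$> voms-proxy-info -all | grep -i nickname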


I.6.- Re-signing your Atlas-VO membership

When your ATLAS VO membership is about to expire, you will receive an automatic notification e-mail from the ATLAS VOMRS asking you to re-sign the Grid and VO AUPs in order to renew your membership. To do so, you will be directed to the ATLAS VO Registration page of the VOMRS ATLAS service. Read and sign.



II.- The Atlas Distributed Data Management

To run ATLAS data analysis we first have to identify the dataset that we want to use. To search for datasets on the Grid storage resources, we can use one of the web tools provided by the ATLAS collaboration for this purpose, or use the DQ2 Clients directly.

First of all, it is recommended to browse the web page provided by IFIC to check whether the datasets we are looking for are already available at the Spanish Tier-2 Federation storage. We can also access the Storage Element at IFIC directly, using the ls and/or cp Linux commands to list or copy data located in the directories under the following spacetokens :

/lustre/ific.uv.es/grid/atlas/atlasdatadisk
/lustre/ific.uv.es/grid/atlas/atlasproddisk
/lustre/ific.uv.es/grid/atlas/atlasgroupdisk
/lustre/ific.uv.es/grid/atlas/atlasuserdisk
/lustre/ific.uv.es/grid/atlas/atlasmcdisk

for example by executing :

$> ls /lustre/ific.uv.es/grid/atlas/atlasproddisk/mc08/AOD/

If the datasets we are looking for are neither at the IFIC Storage Element nor at the Spanish Tier-2 Federation storage, we can do one of the following :


II.1.- Searching Datasets on the Grid using the Web Tools

AMI is the ATLAS Metadata Interface portal page, which we can use for dataset searches. A tutorial is provided to help users navigate the site, as well as a rapid introduction to the AMI dataset search interface in the form of a FAQ.

The Panda Monitor web site also provides a way of searching for datasets. The dataset browser (very slow!) allows browsing of DQ2 datasets based on dataset metadata and site selections. Dataset searches can be done with the search form (with wildcards) or the quick search (with the full name, no wildcards). The Panda Database Query Form allows a quick search for datasets.


II.2.- Searching Datasets on the Grid using the DQ2 Clients

All official ATLAS datasets stored on the Grid storage resources are registered in Local Catalogs located at each Tier-1 site, as well as in the Central Catalog at CERN. Searching for datasets can therefore be done by querying a catalog. For the Spanish cloud, the Local Catalog is located at PIC-Barcelona. To interact with the catalogs we can use the DQ2 Clients tools provided by the ATLAS Distributed Data Management group.

The DQ2 Clients consist of DQ2 Enduser tools and DQ2 Commandline utilities, both of which any user can use to query the catalogs. While the Enduser tools cover general and common queries, the Commandline utilities are meant for advanced use, and some of them require a Production Role privilege to be granted to the user, for example to delete replicas of datasets. Note also that some queries (listing datasets, for example) can be executed either with the dq2-ls Enduser tool or with the dq2-list-replicas Commandline utility.

In order to use these DQ2 Clients at IFIC we first have to setup the DQ2 environment variables by doing the following :

Login into a User Interface machine :

$> ssh ui00.ific.uv.es   ( or use the short command  $> ssh ui00  if you are logging in from IFIC )
Execute the following setup script file:
$> source $VO_ATLAS_SW_DIR/ddm/latest/setup.sh
And create a valid voms proxy as follows (do not use the grid-proxy-init command) :
$> voms-proxy-init -voms atlas
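Before going on, it is worth verifying that the proxy really carries the atlas VOMS extension, for example with :

$> voms-proxy-info -all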

Now we are ready to use the DQ2 Clients :

The details on how to use these DQ2 Clients are described on the DQ2 Clients How To CERN twiki page and the ATLAS Distributed Data Management web page.


II.2.a.- The DQ2 Enduser tools :

The most used dq2 commands, known as DQ2 Enduser tools, are described below. Execute a selected command with the -h option to see how to use it, for example $> dq2-ls -h. See also the DQ2 Enduser tools at the ATLAS DDM web page.

$> dq2-ls DATASETNAME                    // find a dataset named DATASETNAME
$> dq2-ls DATASETNAME*                   // use a wildcard to find datasets whose name contains the string DATASETNAME
$> dq2-ls -f DATASETNAME                 // list the files in the dataset named DATASETNAME
$> dq2-ls -fp DATASETNAME                // list the physical filenames in the dataset named DATASETNAME
$> dq2-ls -r DATASETNAME                 // list the replica locations of the dataset named DATASETNAME

$> dq2-get DATASETNAME                              // download the full dataset named DATASETNAME
$> dq2-get -f FILENAME DATASETNAME                  // download a single file named FILENAME from the dataset named DATASETNAME
$> dq2-get -f FILENAME1,FILENAME2,... DATASETNAME   // download multiple files FILENAME1, FILENAME2, ... from the dataset named DATASETNAME
$> dq2-get -n NUMBEROFFILES DATASETNAME             // download a sample of n random files from the dataset named DATASETNAME
$> dq2-get -s SITE DATASETNAME                      // download the dataset named DATASETNAME from the site named SITE

$> dq2-put -s SOURCEDIRECTORY DATASETNAME           // create a dataset named DATASETNAME from files on the local disk
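As an illustration, a typical session first locates the replicas of a dataset and then downloads it from a chosen site. The dataset name below is the one used in the Ganga example later on this page, and the site name is purely illustrative :

$> dq2-ls -r mc09_7TeV.105200.T1_McAtNlo_Jimmy.merge.AOD.e510_s765_s767_r1302_r1306/
$> dq2-get -s IFIC-LCG2_DATADISK mc09_7TeV.105200.T1_McAtNlo_Jimmy.merge.AOD.e510_s765_s767_r1302_r1306/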


II.2.b.- The DQ2 Commandline utilities :

The DQ2 Commandline utilities are listed below; their names are self-explanatory. For more details about their usage, execute a selected command with the -h option, for example $> dq2-list-dataset -h :

$> dq2-check-replica-consistency        // Refresh completeness information of a dataset replica.
$> dq2-close-dataset                    // Close a dataset.
$> dq2-delete-datasets                  // Delete a dataset.
$> dq2-delete-files                     // Delete files from a dataset.
$> dq2-delete-replicas                  // Delete replicas of a dataset.
$> dq2-delete-subscription              // Delete a subscription to a site of a dataset.
$> dq2-delete-subscription-container    // Delete a subscription to a site of a container.
$> dq2-destinations                     // List the possible destination sites for subscriptions.
$> dq2-erase                            // Erase a dataset.
$> dq2-freeze-dataset                   // Freeze a dataset.
$> dq2-get-metadata                     // Retrieve metadata information for a dataset.
$> dq2-get-number-files                 // Get the number of files in a dataset.
$> dq2-get-replica-metadata             // Get metadata of a dataset replica.
$> dq2-list-dataset                     // List datasets.
$> dq2-list-dataset-by-creationdate     // List datasets according to their creation date.
$> dq2-list-dataset-replicas            // List replicas of a dataset.
$> dq2-list-dataset-replicas-container  // List replicas of a container.
$> dq2-list-datasets-container          // List datasets in a container.
$> dq2-list-dataset-site                // List datasets at a site.
$> dq2-list-erased-datasets             // List all erased datasets.
$> dq2-list-file-replicas               // List all file replicas.
$> dq2-list-files                       // List all files in a dataset.
$> dq2-list-subscription                // List all subscriptions.
$> dq2-list-subscription-info           // List subscription information for a dataset.
$> dq2-list-subscription-site           // List all subscriptions for a given site.
$> dq2-metadata                         // List all possible metadata values.
$> dq2-ping                             // Check availability of the DQ2 central services.
$> dq2-register-container               // Register a new container.
$> dq2-register-dataset                 // Register a new dataset.
$> dq2-register-datasets-container      // Register new datasets in a container.
$> dq2-register-files                   // Register files into a dataset.
$> dq2-register-location                // Register a location for a dataset.
$> dq2-register-subscription            // Register a subscription for a dataset to a site.
$> dq2-register-subscription-container  // Register a subscription for a container to a site.
$> dq2-register-version                 // Register a new version of a dataset.
$> dq2-reset-subscription               // Reset all subscriptions of a dataset.
$> dq2-reset-subscription-site          // Reset a subscription of a dataset at a site.
$> dq2-sample                           // Register a new dataset out of a partial copy of an existing dataset.
$> dq2-set-metadata                     // Set a metadata value for a dataset.
$> dq2-sources                          // List all possible site sources.
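For example, to see where the replicas of a dataset live using the Commandline utilities (same illustrative dataset name as above) :

$> dq2-list-dataset-replicas mc09_7TeV.105200.T1_McAtNlo_Jimmy.merge.AOD.e510_s765_s767_r1302_r1306/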


II.3.- Requesting Subscription of Datasets on the Grid

When a dataset is found on some storage element on the Grid, we can request a replica of it on an appropriate local storage element. This can be done using the Subscription Request Form of the Panda Monitor. However, it may not be entirely straightforward: the user must first be registered and may also need certain privileges.


II.4.- Use of the Storage Space ( and spacetokens ) at IFIC

At the moment there is no storage space dedicated to Tier3 users in the Lustre Storage Element at IFIC. The spacetokens listed below, defined by the ATLAS collaboration, are available in Lustre and follow the usage policy defined by ATLAS :

ATLASMCDISK
ATLASDATADISK
ATLASPRODDISK
ATLASGROUPDISK
ATLASSCRATCHDISK
ATLASLOCALGROUPDISK

Note: Only these spacetokens are usable; any storage space outside them must not be used. So strictly avoid using commands like edg-gridftp-mkdir and/or globus-url-copy to create new directories in Lustre or to copy from/into them. See the description below on how to use these spacetokens for managing your files/data.

Since these spacetokens are managed by the ATLAS Distributed Data Management (DDM) system, all datafile transfers involving them have to use the lcg-xx and lfc-xx command-line tools, and the datafiles have to be registered in the Catalog.

Datafiles can be copied to ATLASSCRATCHDISK by users or jobs. Be aware that this is not a permanent storage space, so the datafiles may be deleted centrally as specified by the ATLAS DDM policy. The datafiles copied there have to be registered in the Catalog, otherwise they are deleted.

Datafiles can also be copied to ATLASLOCALGROUPDISK, which is meant to be dedicated to Tier3 users. However, until the Tier3 infrastructure is up and running and a usage policy (quotas, ...) is defined, users have to use this storage space with care. Very large datafiles stored in this space will be deleted.

The following example describes how to transfer files/data involving the ATLASSCRATCHDISK spacetoken and the LFC Catalog (this is also valid when using the ATLASLOCALGROUPDISK spacetoken) :

Log into a User Interface and get a valid proxy :

$> ssh ui05 ( or ui06 )
$> voms-proxy-init -voms atlas

Configure the environment variables LFC_HOME and LFC_HOST as follows :

$> export LFC_HOME=/grid/atlas/users/myname
$> export LFC_HOST=lfcatlas.pic.es

The variable LFC_HOST indicates the location of the Catalog in use, and LFC_HOME is used to complete the LFN (Logical File Name) under which the file you copy to Lustre is referenced in the Catalog.

The following command does all of this: it copies the source file "myFile.txt" from the local directory [file:/`pwd`/myFile.txt] to the (destination) Lustre Storage Element through the SRM protocol [-d srmv2.ific.uv.es], into the spacetoken ATLASSCRATCHDISK [-s ATLASSCRATCHDISK], using the relative path/destination filename "users/myname/myCopiedFile.txt" [-P users/myname/myCopiedFile.txt], and registers it in the Catalog under the logical file name (its reference) "myFilenameInCatalog.txt" [-l lfn:myFilenameInCatalog.txt] :

$> lcg-cr --vo atlas -v -s ATLASSCRATCHDISK -P users/myname/myCopiedFile.txt -l lfn:myFilenameInCatalog.txt -d srmv2.ific.uv.es file:/`pwd`/myFile.txt

To check that the file is registered in the Catalog execute the following command :

$> lfc-ls -l

To check that the file has been copied into Lustre execute the following command :

$> ls -l /lustre/ific.uv.es/grid/atlas/atlasscratchdisk/users/myname/

To copy a registered file from Lustre into your local directory using its filename referenced in the Catalog execute the following command :

$> lcg-cp --vo atlas -v lfn:myFilenameInCatalog.txt file:/`pwd`/myFile.txt

To copy a file directly from Lustre into your local directory execute the following command :

$> cp /lustre/ific.uv.es/grid/atlas/atlasscratchdisk/users/myname/myCopiedFile.txt .

To delete a file from Lustre as well as its reference filename in the Catalog execute the following command :

$> lcg-del -a --vo atlas lfn:myFilenameInCatalog.txt

For more details on how to use the DDM commands to manipulate files and the Catalog, refer to the tutorial presented at the GRID y e-CIENCIA 2008 course held at IFIC.

Additionally, if you want to know the storage space used/available for a given spacetoken, execute the following command :

$> lcg-stmd -b -e httpg://srmv2.ific.uv.es:8443/srm/managerv2 -s ATLASSCRATCHDISK



III.- The Atlas Software releases installed at IFIC

If we want to know which ATLAS software releases are installed on the Computing Elements at IFIC, we query the Grid Information System. To do this, log into a User Interface machine and execute the following command :

$> lcg-infosites --vo atlas tag > releases.tmp

Then edit the file releases.tmp and search for the IFIC Computing Elements, namely lcg2ce.ific.uv.es and ce01.ific.uv.es (or use grep, as shown after the sample output below). You should see an output like the following :

Name of the CE: ce01.ific.uv.es
   VO-atlas-production-12.0.6
   VO-atlas-production-12.0.7
   VO-atlas-production-12.0.8
   VO-atlas-production-14.1.0.1-i686-slc4-gcc34-opt
   VO-atlas-production-14.1.0.2-i686-slc4-gcc34-opt
   VO-atlas-production-14.1.0.3-i686-slc4-gcc34-opt
   VO-atlas-production-14.1.0.4-i686-slc4-gcc34-opt

Name of the CE: lcg2ce.ific.uv.es
   VO-atlas-production-12.0.6
   VO-atlas-production-12.0.7
   VO-atlas-production-12.0.8
   VO-atlas-production-14.1.0.1-i686-slc4-gcc34-opt
   VO-atlas-production-14.1.0.2-i686-slc4-gcc34-opt
   VO-atlas-production-14.1.0.3-i686-slc4-gcc34-opt
   VO-atlas-production-14.1.0.4-i686-slc4-gcc34-opt
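Instead of searching the file by hand, a simple grep can extract the IFIC entries directly (a convenience sketch; adjust the number of context lines given to -A to the number of installed releases) :

$> grep -A 10 'ific.uv.es' releases.tmp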

The installed releases of the ATLAS software at IFIC can also be seen by visiting the Atlas Installation Pages and selecting IFIC-LCG2 as Site Name. You can also select the release number of the ATLAS software whose installation you want to check, and further search options are available. Note that you need a valid certificate loaded in your browser in order to access this web site.



IV.- The Data Analysis tools installed at IFIC

The official tool to perform ATLAS Distributed Data Analysis on the Grid is Ganga. Detailed information about Ganga can be found at its official site, as well as a user guide.

ATLAS user support is provided by the Hypernews forum Ganga Users and Developers.



V.- Performing Atlas Data Analysis on Local Computer

At IFIC, the ATLAS software is installed in AFS and can be accessed through a User Interface, so log into one of the User Interfaces :

$> ssh ui00

Note: The following example uses release 16.0.2 of the Atlas software. You can find older versions here

1. Preparing your account to use the ATLAS software :

To prepare your account, you first need to set up the ATLAS software environment and general tools.

$> source /lustre/ific.uv.es/sw/atlas/local/setup.sh

To set up Athena, first do:

$> cd $HOME
$> mkdir -p AthenaTestArea/16.0.2
$> export AtlasSetup=${VO_ATLAS_SW_DIR}/software/16.0.2/AtlasSetup
$> alias asetup='source $AtlasSetup/scripts/asetup.sh'

The Athena setup is performed by executing the asetup command with a list of options and arguments.

$> asetup <arg1> <arg2>,<arg3> --<option1> --<option2> <value2> --tags=<tag1>,<tag2>,<tag3>

where:

  • <arg1>, <arg2> and <arg3> are arguments, which may be separated by spaces or commas (",").
  • <option1> is an option without value (corresponding to a particular value of a boolean variable).
  • <option2> is an option with value <value2>, which cannot be defaulted (i.e. a value must be supplied).
  • --tags (which has the aliases -t or --tag) can be used to specify a space or comma-separated list of tags (corresponding to command line arguments).

Some commonly used options have single-character aliases, in which case they are denoted by a single rather than a double dash (e.g. -t instead of --tags).

Most configuration variables can be specified as arguments, options or tags, although some that have associated values can only be specified as option/value pairs. See AtlasSetup for a full explanation.

The list of available options and arguments can be viewed by specifying either of:

$> asetup -h
$> asetup --help

A simple example of asetup usage at IFIC is the following:

$> asetup 16.0.2 --testarea=$HOME/AthenaTestArea --svnroot=svn+ssh://myCERNuserName@svn.cern.ch/reps/atlasoff --multitest --dbrelease "<latest>"

Notes:

  1. --testarea sets the location of the test development area.
  2. --svnroot : by default the $SVNROOT environment variable is set automatically to svn+ssh://svn.cern.ch/reps/atlasoff. Checking software out from CERN's svn repository requires authentication, and your IFIC user name is used by default. If your CERN and IFIC user names differ, you need to supply your CERN user name manually, as in the example above.
  3. --multitest : if you are working with several releases, this option overrides the default structure for test releases and appends a directory named after the release to the path specified by the testarea.
  4. --dbrelease : this allows the default DBRelease to be overridden by setting the $DBRELEASE_OVERRIDE environment variable. In the example above, "<latest>" means the most recent DBRelease is taken if several are installed.

2. Get and compile the User Analysis package :

ATLAS software is divided into packages, which are managed with a configuration management tool, CMT. CMT is used to copy ("check out") code from the main ATLAS repository and to handle linking and compilation.

Note: Some tips for using CMT can be found at SoftwareDevelopmentWorkbookCmtTips.

Now let us check out the User Analysis package and compile it. To do so, execute the following commands :

$> cd $TestArea
$> pwd
/afs/ific.uv.es/user/.../AthenaTestArea/16.0.2
$> cmt co -r UserAnalysis-00-15-04 PhysicsAnalysis/AnalysisCommon/UserAnalysis
$> cd PhysicsAnalysis/AnalysisCommon/UserAnalysis/cmt/
$> cmt config
$> source setup.sh
$> cmt make

To run the compiled software, first get your AnalysisSkeleton_topOptions.py file as follows :


$> cd ../run
$> get_files AnalysisSkeleton_topOptions_AutoConfig.py

Note that, in order to set the AOD data you want to process, you have to edit this file and change line 15 as follows :

jp.AthenaCommonFlags.FilesInput = [ "put_here_the_name_of_the_AOD_you_want_to_process"] 

Note: For 7 TeV data look at /lustre/ific.uv.es/grid/atlas/atlasdatadisk/data10_7TeV

Finally, run the following command :

$> athena.py AnalysisSkeleton_topOptions_AutoConfig.py



VI.- Performing Atlas Distributed Data Analysis on the Grid

The official way to perform ATLAS data analysis is with the Ganga or Pathena tool, although the Grid resources can also be used directly by the user. In the following we discuss how to use Ganga for ATLAS Distributed Analysis on the Grid. For detailed information on how to use the Ganga Command Line Interpreter (CLI), see the Working with Ganga document, as well as the Introduction to GangaAtlas slides.

For ATLAS Distributed Data Analysis using Ganga, visit this site for more details; tutorials for different versions are listed at the bottom.


VI.1.- Setting up and configuring the Ganga Environment at IFIC

When Ganga is run for the first time, a .gangarc configuration file is created in our home directory. We then have to change some of its configuration parameters according to our needs. To do this, execute the following commands (note that we have to log into a User Interface machine and have a valid proxy certificate if we want to run jobs on the Grid with Ganga) :

$> ssh ui00.ific.uv.es
$> source /afs/ific.uv.es/project/atlas/software/ganga/install/etc/setup-atlas.sh
$> export GANGA_CONFIG_PATH=GangaAtlas/Atlas.ini

If you want another version, pass it as an argument at the end:

$> source  /afs/ific.uv.es/project/atlas/software/ganga/install/etc/setup-atlas.sh 5.5.21

Create your configuration file (.gangarc)

$> ganga -g

Answer "yes" to the question asked by Ganga to create the .gangarc configuration file. Then leave Ganga (with Ctrl-D), open the .gangarc file in your favourite editor, and make the following changes corresponding to the ATLAS-IFIC environment (collected in the sketch after this list) :

In the section labelled [Athena] add the line :
ATLASOutputDatasetLFC = lfcatlas.pic.es

In the section labelled [Configuration] add the line :
RUNTIME_PATH = GangaAtlas:GangaPanda:GangaJEM

In the section labelled [LCG] add the lines :
DefaultLFC = lfcatlas.pic.es
DefaultSE = srmv2.ific.uv.es
VirtualOrganisation = atlas

In the section labelled [defaults_GridProxy] add the line :
voms = atlas

In the section labelled [defaults_VomsCommand] add the line :
init = voms-proxy-init -voms atlas
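After these edits, the relevant fragments of the .gangarc file should look roughly as follows (this sketch only collects the settings listed above; any other lines already present in those sections stay untouched) :

[Athena]
ATLASOutputDatasetLFC = lfcatlas.pic.es

[Configuration]
RUNTIME_PATH = GangaAtlas:GangaPanda:GangaJEM

[LCG]
DefaultLFC = lfcatlas.pic.es
DefaultSE = srmv2.ific.uv.es
VirtualOrganisation = atlas

[defaults_GridProxy]
voms = atlas

[defaults_VomsCommand]
init = voms-proxy-init -voms atlas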

The variable ATLASOutputDatasetLFC catalogues your output in the PIC cloud when the ATLASOutputDataset option is used. The variable RUNTIME_PATH selects the ATLAS applications. The variables DefaultLFC and DefaultSE define a catalogue and a Storage Element where your input file can be saved if it is bigger than 10 MB (the maximum input size on the Grid), so that Ganga can make a copy at the job site. The variables voms and init allow Ganga to create the correct grid proxy.

Once these changes are made in the .gangarc configuration file, we are ready to use Ganga for ATLAS distributed data analysis.


VI.2.- Running analysis using Athena with Ganga

In the following example we will see how to run a job using release 16.0.2 of the ATLAS software and the User Analysis package. If you are using a different software release, make the appropriate changes when executing the setup commands.

First of all, log into a User Interface machine and get a valid proxy certificate :

$> ssh ui00.ific.uv.es
$> voms-proxy-init -voms atlas

As the Athena framework will be used, we have to configure the environment variables accordingly by doing the Athena setup before running Ganga, as follows (it is assumed that the "PhysicsAnalysis/AnalysisCommon/UserAnalysis" package is installed; see section V for how to install it) :

$> source /lustre/ific.uv.es/sw/atlas/local/setup.sh
$> mkdir -p AthenaTestArea/16.0.2
$> export AtlasSetup=${VO_ATLAS_SW_DIR}/software/16.0.2/AtlasSetup
$> alias asetup='source $AtlasSetup/scripts/asetup.sh'
$> asetup 16.0.2 --testarea=$HOME/AthenaTestArea --multitest --dbrelease "<latest>"

IMPORTANT: Ganga should be run from the run/ directory of the Physics Analysis package, so that it recognizes your Athena package. So, let us change to the run directory :

$> cd $TestArea/PhysicsAnalysis/AnalysisCommon/UserAnalysis/run

The next commands set up the latest version of Ganga installed in our environment :

$> source /afs/ific.uv.es/project/atlas/software/ganga/install/etc/setup-atlas.sh
$> export GANGA_CONFIG_PATH=GangaAtlas/Atlas.ini

Suppose now that we want to run an Athena job (with a Monte Carlo input dataset) corresponding to the following Ganga Python configuration file, named myGangaJob.py, located in the run/ directory of our Physics Analysis package :


# FileName myGangaJob.py #########################

j = Job()
number=str(j.id)
j.name='twiki-Panda-'+number
j.application=Athena()
j.application.option_file=['AnalysisSkeleton_topOptions_AutoConfig.py']
j.application.max_events=-1
j.application.atlas_dbrelease='LATEST'
j.application.prepare(athena_compile=False)
j.splitter=DQ2JobSplitter()
j.splitter.numfiles=1
j.inputdata=DQ2Dataset()
j.inputdata.dataset= ["mc09_7TeV.105200.T1_McAtNlo_Jimmy.merge.AOD.e510_s765_s767_r1302_r1306/"]
# next line is for a small test; comment it out for the full analysis:
j.inputdata.number_of_files=1
j.outputdata=DQ2OutputDataset()
j.outputdata.datasetname='user.mynickname.test.ganga.panda.'+number
j.outputdata.location='IFIC-LCG2_SCRATCHDISK'
j.backend=Panda()
j.do_auto_resubmit=True
j.submit()

# End File #########################

The variable 'number' just saves the id (the Ganga identification) of the job, used to define a unique output dataset name (a DQ2 requirement). With j.outputdata.location you can choose the Storage Element for your output datasets, according to the DDM policy. The output dataset is first stored on the scratchdisk of the site where the subjobs ran, and then a DaTRI request is made to move it to your chosen location. The setting j.do_auto_resubmit=True enables automatic resubmission of failed subjobs, but this only happens if some 'completed' subjobs exist.

You can see your job status and the stdout file in the Panda Monitor Users page (look for your name as it appears in your Grid certificate).

To have Ganga execute this file, do the following :

$> ganga

Now we are inside the Command Line Interpreter (CLI) of Ganga, where we can use Ganga's own commands. For example, to execute our "myGangaJob.py" file we use the execfile() command as follows :

In [1]: execfile('myGangaJob.py')
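Alternatively, the job file can be passed to Ganga directly from the shell, in which case the script is executed without entering the interactive session (standard Ganga behaviour, though details may vary between versions) :

$> ganga myGangaJob.py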


VI.3.- Some ganga commands for working with grid jobs

Other commands for working with our jobs and monitoring them are:

See all the jobs information:

In [1]: jobs

See information for a range of jobs only:

In [1]: jobs.select(1,10)

See the Ganga status of a job and of a subjob:

In [1]: jobs(10).status

In [1]: jobs(10).subjobs(2).status

See only the subjobs with a given status:

In [1]: jobs(10).subjobs.select(status='failed')

Count the subjobs with a given status:

In [1]: len(jobs(10).subjobs.select(status='failed'))

Open one subjob in the Panda Monitor:

In [1]: ! firefox $jobs(731).subjobs(4826).backend.url  &

Kill the job:

In [1]: jobs(10).kill()

Remove one job:

In [1]: jobs(10).remove()

See the possible site names for running with the Panda backend:

In [1]: r=PandaRequirements()
In [1]: r.list_sites()

-- ElenaOliverGarcia - 13 Apr 2011
