TWiki > NEXT Web > NextSoftware > SoftwareOffline > SoftwareOfflineTutorial (r18, 17 Jun 2011)

Tutorial

Part 1: the user perspective

Preliminaries

Before you start walking through the tutorial, you should do two things:
  • Set up nextoffline, see SoftwareOfflineInstallation
  • Create an area where to write the files created by this tutorial. It does not really matter where it is or what it is called (the tutorial should be path-independent), as long as you have write permission there. Let's assume you are going to write things into a directory called offlinetutorial. Create the following directory structure:
mkdir offlinetutorial
cd offlinetutorial
mkdir data; mkdir data/evt; mkdir data/histo; mkdir data/img
We will assume in the following that the tutorial is run (that is, the ana executable is invoked) from within this directory (offlinetutorial, in the example). Starting with the helium release, this tutorial is itself an SRT package that is part of the nextoffline release. Have a look at $SRT_PUBLIC_CONTEXT/Tutorial; there you will find the required XML and data files, organized in the following directory structure:
  • data/nexus: directory containing two sample files produced by the nexus detector simulation, which we will use as input to the reconstruction. More specifically, these are both bb0nu events simulated at the detector center.
  • data/raw: directory containing one file of NEXT-1-IFIC raw events taken with the GasIFIC DAQ.
  • xml: directory containing nine job configuration files, one for each job in this tutorial.

Example MC1: Understanding XML configuration data

Have a look at the ExampleMC1.xml file in $SRT_PUBLIC_CONTEXT/Tutorial/xml. You will find that:
  • Two shared libraries are loaded at run-time by the ana executable,  AnaBase and BHEPReader, as specified within the <link> </link> XML tags
  • One default module configuration file from your base release is loaded, the one corresponding to the BHEPReader module, as specified within the <xmlfile> </xmlfile> tags
  • No modules are executed at run-time, since no modules appear within the <job> </job> XML tags
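Putting these three pieces together, the skeleton of ExampleMC1.xml would look roughly like the following. This is a sketch assembled from the tags described above; the exact layout of the actual file (any enclosing root element, the precise <xmlfile> content) may differ:

```xml
<link>
  AnaBase
  BHEPReader
</link>
<!-- default module configuration; the exact file reference is an assumption -->
<xmlfile> BHEPReader.xml </xmlfile>
<job>
  <!-- empty: no modules are executed in this first example -->
</job>
```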
To run the first example, type:

ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleMC1.xml -n 1

After the ROOT initialization banner, you will see that the job does not actually do much, terminating with a line of the type:
+- Example 1 sequence (1/0/1 events 0u/0s seconds)
meaning that one event was processed and that the job took essentially no time. Still, the job successfully loaded two shared libraries. If you try to remove or comment out the AnaBase library loading, for example like this:
<!--
AnaBase
-->

(XML comments are just like HTML comments), then you would get an error at run-time, since the BHEPReader library depends upon the AnaBase one. Note that the order also matters: you would get the same error if you tried to load BHEPReader before AnaBase. It is good practice to load the libraries with raw, reconstruction and analysis-level objects (RawBase, RecoBase and AnaBase) before any other library, since many libraries depend upon them. Similarly, we will see later a library called EventFilters, which in fact groups several modules. Since so many modules (with many dependencies) are present there, it is good practice to place that library toward the end of your library link list.

Now, let's look at the other bit of information: module configuration files. Type:

ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleMC1.xml -n 1 -X DumpExampleMC1.xml

and have a look at the output produced, DumpExampleMC1.xml. This is a copy of all the configuration data gathered by the job before execution. You will find that a series of configurations for the BHEPReader module have been loaded, i.e., the ones within the <config> </config> XML tags. While <xmlfile> is useful to load default configurations for modules, you can also define your own configurations within the job control XML file. Try adding the following text to ExampleMC1.xml:

<config name="BHEPReader" version="user"> 
  <param name="BranchID">
    <int> 1 </int> 
    ID of branch where BHEP event is stored. Branch identifiers as follows: 0: MC 1: DetSim 2: DAQ 3: Raw 4: RawAux 5: Cal 6: Reco 7: User 8: Summary 
  </param> 
  <param name="InputFiles">
    <string> "userfile.root" </string> 
    Input filenames 
  </param> 
</config> 

  
If you rerun with the -X option, you will see that your local configuration for BHEPReader has been successfully loaded. In addition to default or entirely local module configurations, a mixture is also possible, for example:

<config name="BHEPReader" version="user2" base="default">
  <param name="InputFiles">
    <string> "userfile2.root" </string>
    Input filenames 
  </param> 
</config>
which tells FMWK that the user2 configuration matches its base configuration, default, except for the parameters explicitly listed.

Example MC2: Reading BHEP DSTs

Now let's start processing some events and execute some modules. Have a look at ExampleMC2.xml: not much difference from ExampleMC1.xml, except that now we are executing a single "node" within the job, as specified by the line:
<node module="io::BHEPReader" config="detsim" reco="1" ana="0"/>
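A job executing several modules would simply list several such nodes, one per module, using the same syntax. The sketch below is illustrative only: the second node's configuration name and phase-flag values are assumptions, not taken from an actual tutorial file:

```xml
<job>
  <!-- nodes are executed in the order listed -->
  <node module="io::BHEPReader" config="detsim"  reco="1" ana="0"/>
  <node module="io::EventDump"  config="default" reco="0" ana="1"/>
</job>
```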

In this case, the node is simply a FMWK module (a sequence of modules can also be defined as a node). Now run a job by typing:

ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleMC2.xml -n 100 -i /dev/null:0

The -n option specifies how many events to process, while the -i option specifies the list of FMWK input files to read from. /dev/null:0 indicates an empty input datastream. This is what we want here: we want to read BHEP DSTs and create FMWK events via the BHEPReader module (which is an event generator), so we want to start with empty events. The job above will complain because the $INPUTBHEPFILES environment variable is not defined. Define it to point to a nexus file:
export INPUTBHEPFILES=$SRT_PUBLIC_CONTEXT/Tutorial/data/nexus/nexus_example2.file1.dst.root
or
setenv INPUTBHEPFILES $SRT_PUBLIC_CONTEXT/Tutorial/data/nexus/nexus_example2.file1.dst.root
or, alternatively, you can get rid of the default BHEPReader configuration and specify your own configuration for BHEPReader locally within the job XML file. Note that in addition to environment variables, the InputFiles string parameter also recognizes wildcard patterns and space-separated multiple files, such that the following three statements are equivalent:

  • <string> $SRT_PUBLIC_CONTEXT/Tutorial/data/nexus/nexus_example2.file1.dst.root $SRT_PUBLIC_CONTEXT/Tutorial/data/nexus/nexus_example2.file2.dst.root </string>
  • <string> $SRT_PUBLIC_CONTEXT/Tutorial/data/nexus/nexus_example2.file*.dst.root </string>
  • <string> $SRT_PUBLIC_CONTEXT/Tutorial/data/nexus/nexus_example2.file[1,2].dst.root </string>
Now the job should succeed, and the Example 2 sequence printed at the end should indicate that 100 events were processed through the "Reco" interface of BHEPReader, and that all of them passed. You will also find a non-zero job time.

Now let's discuss some output other than the one on screen. Since you did not specify the -g 0 option, you may have inadvertently created "histogram files" in the directory you executed ana from, named histos.root, histos-01.root, etc., one for each job executed. You can safely remove all of them, as well as the DumpExampleMC1.xml file. Let's now run:
ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleMC2.xml -n 100 -i /dev/null:0 -o data/evt/fmwk_example2.evt.root -g data/histo/fmwk_example2.histo.root
The -o option specifies the output ROOT file with FMWK event information, and the -g option specifies the output ROOT file with FMWK "histogram" information. The latter is meant for all information that is not event-based (e.g., run-based) and can contain any ROOT object: histograms, ntuples, trees, etc.

Example MC3: Reading and dumping FMWK event data

Now let's run ExampleMC3.xml with:

ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleMC3.xml -n 100 -i data/evt/fmwk_example2.evt.root -g 0

The first thing to note is that we are now reading as input not an empty stream, but the FMWK event information produced by the previous example. If you look at the ExampleMC3.xml file, you will note that we are not executing the BHEPReader module: we are reading FMWK events as opposed to BHEP events. You will also note that, again, a single module is being executed, called EventDump. EventDump is a very simple module that dumps on screen the event information for each processed event. On screen, you should see text such as:

Dump of run = 0.0 event = 99
evt.Header()
|=======================================================================
evt.MC()
|=======================================================================
evt.DetSim()
|-* anabase::Events[1]
|=======================================================================
evt.DAQ()
|=======================================================================
evt.Raw()
|=======================================================================
evt.RawAux()
|=======================================================================
evt.Cal()
|=======================================================================
evt.Reco()
|=======================================================================
evt.User()
|=======================================================================
evt.Summary()
|=======================================================================
which is telling us that the only object present in the event is a single instance (note the [1]) of the anabase::Event object within the DetSim branch. This object is what the BHEPReader module loaded into memory in Example2, and FMWK wrote it to disk. This is a full copy of the event information stored in the nexus BHEP DST.

Example MC4: Event filters

Now let's start to do some analysis on the nexus DST file. Type:
ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleMC4.xml -n 100 -i /dev/null:0 -o data/evt/fmwk_example4.evt.root -g data/histo/fmwk_example4.histo.root
If you look at ExampleMC4.xml, you will find that in addition to BHEPReader, two event filters are now executed: TrueVetoFilter and TrueEnergyWindowPreFilter. These modules only implement the Ana event-level phase and not the Reco one, meaning they do not modify event information. All they do is select or reject events based on the MC truth information contained in anabase::Event. Events rejected by a module do not go through the subsequent modules. From the FMWK job summary output on screen, you can see that TrueVetoFilter rejects 1 out of 100 events, while TrueEnergyWindowPreFilter rejects 8 out of 99 events. You can check (by adding the EventDump module) that no event information has been written by the two filters: only anabase::Event objects are present, one per event; there are simply fewer events overall in fmwk_example4.evt.root compared to fmwk_example2.evt.root. If you now look at the histogram file, fmwk_example4.histo.root, you can see that:
  • the module that used the most resources was BHEPReader, as shown in the totTime histogram
  • while the two filters did not modify event information, they did create histogram information, as can be seen in their respective subfolders in the file
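The module chain described above would appear in ExampleMC4.xml as nodes following the reader, along these lines. This is a sketch using the <node> syntax from Example MC2; the actual configuration names, namespaces and flag values in the file may differ (the reco="0" ana="1" choice reflects the fact that the filters implement only the Ana phase):

```xml
<job>
  <node module="io::BHEPReader"                 config="detsim"  reco="1" ana="0"/>
  <!-- filters run only in the Ana phase and do not modify the event -->
  <node module="flt::TrueVetoFilter"            config="default" reco="0" ana="1"/>
  <node module="flt::TrueEnergyWindowPreFilter" config="default" reco="0" ana="1"/>
</job>
```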

Example MC5: Adding reco objects into the FMWK event

Let's now run:
ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleMC5.xml -n 100 -i data/evt/fmwk_example4.evt.root -o data/evt/fmwk_example5.evt.root -g data/histo/fmwk_example5.histo.root
This job uses as input the output of the Example4 job, and runs the following modules:
  • MakeVoxels: create "voxels" of energy deposition within chamber by pixelizing the MC true 3D ionization hit information
  • cal::SmearVoxels: perform simple energy smearing for each voxel
  • TrackID: create tracks out of voxels
  • TrackWireCounterFilter: filter to select event topology based on the number of total tracks (of "wire" or "isolated blob" type) and on the number of "wires"
  • EnergyROIFilter: select events based on total deposited energy per event
  • EventDump: dumps on screen event information
By looking at the job summary output, you can see that the two filters reduce the number of events making it to the next module, while MakeVoxels, SmearVoxels and TrackID do not. These three modules populate the event with two new types of objects:
  • recobase::Voxels
  • recobase::Tracks
recobase::Voxel objects are created both by MakeVoxels (in the DetSim branch) and by SmearVoxels (in the Reco branch). recobase::Track objects are created by TrackID (in the Reco branch). We only see events with one recobase::Track instance because we require this in the TrackWireCounterFilter module, and the event dump is placed after this cut. On the other hand, there are typically tens of recobase::Voxel objects per event. Example for one event:
Dump of run = 0.0 event = 98
evt.Header()
|=======================================================================
evt.MC()
|=======================================================================
evt.DetSim()
|-* anabase::Events[1]
|-* ionization
  |-* recobase::Voxels[32]
|=======================================================================
evt.DAQ()
|=======================================================================
evt.Raw()
|=======================================================================
evt.RawAux()
|=======================================================================
evt.Cal()
|=======================================================================
evt.Reco()
|-* ionization
  |-* recobase::Voxels[32]
  |-* recobase::Tracks[1]
|=======================================================================
evt.User()
|=======================================================================
evt.Summary()
|=======================================================================
The other thing to note is that these new voxel and track objects live underneath an ionization subfolder of the DetSim() and Reco() event branches. This is because the nexus simulation output used here comes from the "fast" simulation, which simulates the energy deposition by ionization in the chamber by the various tracks, without drift, light production or detector response. If you look at the produced "histogram" file, you will see that several histograms are produced by the various modules; have a look at the source code to see what they are. You will also see that MakeVoxels and TrackID are the most CPU-intensive modules in this job.

Example DATA1: reading NEXT-1-IFIC raw data

Let us now move to real data. In the data/raw directory you can find one file containing 32 raw events obtained with the NEXT-1-IFIC chamber and the "GasIFIC" DAQ system. Each event contains 32 channels: 19 connected to anode PMTs and 13 connected to cathode PMTs. Let us run:

ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleDATA1.xml -n -1 -i $SRT_PUBLIC_CONTEXT/Tutorial/data/raw/alphas_new_trigger_all_channels_with_blanket_002.gasific -g 0

The -n -1 option means all events in the file, 32 in this case. Note that the ExampleDATA1.xml file specifies only one regular FMWK module in the job definition, io::EventDump. How are the events correctly read, then? Raw data are read in this case via a special I/O module called RawDataReader. Unlike general FMWK modules, FMWK I/O modules do not have an associated XML file. Rather, they "know" how to read or write a specific format based on the filename prefix/suffix specifications in the ana command-line options -i and -o. In this case, the fact that the input file ends in .gasific tells nextoffline to use the GasIFICreadModule/GasIFICEventHandle classes in RawDataReader. The event dump for a single event looks like:

Dump of run = 0.0 event = 32
evt.Header()
|=======================================================================
evt.MC()
|=======================================================================
evt.DetSim()
|=======================================================================
evt.DAQ()
|-* recobase::DAQDigits[32]
|=======================================================================
evt.Raw()
|=======================================================================
evt.RawAux()
|=======================================================================
evt.Cal()
|=======================================================================
evt.Reco()
|=======================================================================
evt.User()
|=======================================================================
evt.Summary()
|=======================================================================
We are loading in memory 32 instances of the recobase::DAQDigit class, one for each PMT waveform stored in the raw data.

Example DATA2: NEXT-1-IFIC Alpha Analysis, Pass 1

Now execute the following:

ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleDATA2.xml -n -1 -i $SRT_PUBLIC_CONTEXT/Tutorial/data/raw/alphas_new_trigger_all_channels_with_blanket_002.gasific -o data/evt/alphas_new_trigger_all_channels_with_blanket_002.pass1.evt.root -g data/histo/alphas_new_trigger_all_channels_with_blanket_002.pass1.histo.root

This is the first pass of the preliminary alpha analysis in NEXT-1-IFIC, as detailed in the CDR. Examples DATA3 and DATA4 below also refer to this analysis. In addition to reading the raw data as in the previous example, we are now executing the following modules (module names / module configurations):

  • PedestalSubtraction/local: create 32 pedestal-subtracted waveforms (recobase::Digit instances), one for each channel. In addition to subtracting the pedestals, the waveforms are also resampled, from a 10 ns to a 100 ns clock tick width. In order to save space, the original recobase::DAQDigit objects are deleted from the event.
  • SumDigits/local: sum all waveforms from cathode PMTs (there are 13 of them), and store in Raw()/CathodeSum/recobase::Digit
  • flt::EnergyROIFilter/local: select events with a minimum charge of 20.E6 ADC counts for the cathode sum
  • PeakFinder/cathodesum: process the cathode sum waveform to reconstruct recobase::Peak objects present in it
  • flt::PeakTypeCounterFilter/local: select events with one and only one S1-like recobase::Peak, and one and only one alpha-like recobase::Peak
  • flt::DriftTimeFilter/local: select events with a drift time between 40 and 160 us, in order to select events well within the chamber fiducial volume. The drift time is computed as the time difference between the alpha-like peak timing and the S1-like peak timing
  • EventDump/default: dumps on screen event information
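In the job XML, this chain would appear as a sequence of <node> entries, roughly like the following sketch. The namespaces and phase-flag attributes are omitted or assumed here; the actual ExampleDATA2.xml may differ:

```xml
<job>
  <node module="PedestalSubtraction"        config="local"/>
  <node module="SumDigits"                  config="local"/>
  <node module="flt::EnergyROIFilter"       config="local"/>
  <node module="PeakFinder"                 config="cathodesum"/>
  <node module="flt::PeakTypeCounterFilter" config="local"/>
  <node module="flt::DriftTimeFilter"       config="local"/>
  <node module="io::EventDump"              config="default"/>
</job>
```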
As a result, a typical event content looks like the following:
Dump of run = 0.0 event = 26
evt.Header()
|=======================================================================
evt.MC()
|=======================================================================
evt.DetSim()
|=======================================================================
evt.DAQ()
|-* recobase::DAQDigits[0]
|=======================================================================
evt.Raw()
|-* recobase::Digits[32]
|-* CathodeSum
  |-* recobase::Digits[1]
|=======================================================================
evt.RawAux()
|-* CathodeSum
  |-* recobase::Peaks[13]
|=======================================================================
evt.Cal()
|=======================================================================
evt.Reco()
|=======================================================================
evt.User()
|=======================================================================
evt.Summary()
|=======================================================================
Note that the 32 recobase::DAQDigit objects have been deleted (0 such objects are now present). In this particular event, PeakFinder found 13 recobase::Peak objects in the cathode sum waveform. At the dataset level, 11 out of 32 events pass the flt::EnergyROIFilter, 11 out of 11 events pass the flt::PeakTypeCounterFilter, and 10 out of 11 events pass the flt::DriftTimeFilter. Some histograms are created by the various modules to illustrate the event selection. These histograms are also used to create graphics images at the end of the job, intended only for a quick evaluation of the event selection. For better-looking plots, the typical procedure would be to post-process the FMWK histogram file alphas_new_trigger_all_channels_with_blanket_002.pass1.histo.root (where these histograms have been written), for example via a simple ROOT macro.

Example DATA3: NEXT-1-IFIC Alpha Analysis, Pass 2

We can now do further event reconstruction and selection, using as starting point the output of example DATA2:

ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleDATA3.xml -n -1 -i data/evt/alphas_new_trigger_all_channels_with_blanket_002.pass1.evt.root -o data/evt/alphas_new_trigger_all_channels_with_blanket_002.pass2.evt.root -g data/histo/alphas_new_trigger_all_channels_with_blanket_002.pass2.histo.root

While in example DATA2 we only used information from the cathode PMTs, here we will also use information from the anode PMTs. We are executing four FMWK modules (module names / module configurations):

  • PeakFinder/singlechannel: create recobase::Peak objects out of recobase::Digit ones. Unlike in the previous example, this time we are processing the waveforms from the individual PMTs rather than the cathode PMT waveform.
  • PointReconstruction/local: create one recobase::ChamberHit object per event, containing full 3-dimensional hit information for the nearly point-like energy deposition expected for an alpha decay within the gas. We use information from the individual anode PMT waveforms (more precisely, from the recobase::Peak objects found within them) to reconstruct the x and y position via a simple center-of-gravity algorithm. The z position is reconstructed as the drift time (measured via the cathode sum waveform, as in the previous example) times the drift velocity. The latter is specified via a module XML parameter.
  • flt::RadiusFilter/local: given the reconstructed x and y, we can compute their quadrature sum to obtain r, the reconstructed distance of the alpha decay from the chamber axis. We require r to be less than 35 mm, in order to select events well within the chamber fiducial volume.
  • EventDump/default: dumps on screen event information
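The drift velocity mentioned above would be set via a module parameter, presumably with a local configuration along these lines. The parameter name, the numeric value and the <float> value tag are all assumptions for illustration (the tutorial only shows <int> and <string> parameter types):

```xml
<config name="PointReconstruction" version="local" base="default">
  <param name="DriftVelocity">
    <float> 1.0 </float>
    Drift velocity used to convert drift time into z (value illustrative)
  </param>
</config>
```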
A typical event content looks like:
Dump of run = 0.0 event = 21
evt.Header()
|=======================================================================
evt.MC()
|=======================================================================
evt.DetSim()
|=======================================================================
evt.DAQ()
|-* recobase::DAQDigits[0]
|=======================================================================
evt.Raw()
|-* recobase::Digits[32]
|-* CathodeSum
  |-* recobase::Digits[1]
|=======================================================================
evt.RawAux()
|-* CathodeSum
  |-* recobase::Peaks[41]
|-* SinglePMTs
  |-* recobase::Peaks[575]
|=======================================================================
evt.Cal()
|=======================================================================
evt.Reco()
|-* recobase::ChamberHits[1]
|=======================================================================
evt.User()
|=======================================================================
evt.Summary()
|=======================================================================
Compared to the event contents from example DATA2, you can now see an additional recobase::ChamberHit object in the Reco() branch. The flt::RadiusFilter module selects 9 out of 10 events, and creates a scatter plot illustrating the effect of the radial cut.

Example DATA4: NEXT-1-IFIC Alpha Analysis, Pass 3

In example DATA3 we completed our selection of alpha candidates. We are now interested in reconstructing the energy and doing further analysis on our final sample of 9 events. Let's execute:

ana -x $SRT_PUBLIC_CONTEXT/Tutorial/xml/ExampleDATA4.xml -n -1 -i data/evt/alphas_new_trigger_all_channels_with_blanket_002.pass2.evt.root -o data/evt/alphas_new_trigger_all_channels_with_blanket_002.pass3.evt.root -g data/histo/alphas_new_trigger_all_channels_with_blanket_002.pass3.histo.root

Apart from the usual io::EventDump module, this job runs only one more algorithm: EnergyReconstruction. This module starts from the cathode PMT information to reconstruct the deposited energy per event. This information is stored in recobase::Energy objects:

Dump of run = 0.0 event = 21
evt.Header()
|=======================================================================
evt.MC()
|=======================================================================
evt.DetSim()
|=======================================================================
evt.DAQ()
|-* recobase::DAQDigits[0]
|=======================================================================
evt.Raw()
|-* recobase::Digits[32]
|-* CathodeSum
  |-* recobase::Digits[1]
|=======================================================================
evt.RawAux()
|-* CathodeSum
  |-* recobase::Peaks[41]
|-* SinglePMTs
  |-* recobase::Peaks[575]
|=======================================================================
evt.Cal()
|=======================================================================
evt.Reco()
|-* recobase::ChamberHits[1]
|-* recobase::Energys[4]
|=======================================================================
evt.User()
|=======================================================================
evt.Summary()
|=======================================================================
Why four recobase::Energy objects per event rather than one? This is simply an analysis choice. A few corrections are applied sequentially in the energy estimation, and the four values store not only the "final" energy value but also the "temporary" ones obtained before all corrections are applied. Here is the meaning of the four energy measurements:
  • First instance of recobase::Energy: a very rough gain intercalibration of the cathode PMTs is performed. This energy estimator is simply the cathode sum after this rough intercalibration.
  • Second instance of recobase::Energy: a correction for electron attachment is also applied, as a function of the measured drift time. This is because the longer the drift, the higher the fraction of ionization charge lost along the drift.
  • Third instance of recobase::Energy: a correction for radial effects is also applied, as a function of the measured radius. This is because the more peripheral (high-radius) the event, the larger the effect of photon absorption on the light tube walls, and correspondingly the lower the reconstructed energy.
  • Fourth instance of recobase::Energy: currently unused, yielding a value equal to the third instance. It may be used in the future to correct the energy as a function of the event time extent.
In addition to applying the above-mentioned corrections, the EnergyReconstruction algorithm can also be used to compute them directly from the data. This behavior is controlled by the value of a few flags (ComputeRadialCorrection, ComputeDriftCorrection, ComputeWidthCorrection). So, if nothing is known about the corrections a priori, one may have to run this job three times:
  • a first time with all corrections set to zero; the drift time correction is computed at the end of this job
  • a second time with the drift time correction evaluated in the previous pass; the radial correction is computed at the end of this job
  • a third time with the correct drift time and radial corrections. The "final" reconstructed energy per event is evaluated in this step.
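For the first of the three passes, the flags named above could be toggled in a local EnergyReconstruction configuration, along the lines of the sketch below. The integer flag type and values are assumptions; only the flag names come from the text:

```xml
<!-- Pass 1: apply no corrections; compute the drift time correction -->
<config name="EnergyReconstruction" version="pass1" base="default">
  <param name="ComputeDriftCorrection">
    <int> 1 </int>
    Compute the drift time (attachment) correction from the data
  </param>
  <param name="ComputeRadialCorrection">
    <int> 0 </int>
    Radial correction is computed only in pass 2, once the drift correction is known
  </param>
</config>
```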
Again, a number of histograms, profile histograms and scatter plots (together with the corresponding images) are produced during this job. In this case, some data fits are also performed. See the source code for details.

Part 2: the developer perspective

  • -- MichelSorel - 05 Oct 2010