An old fart guide to CMS software (CMSSW)
Before starting
I am rewriting this guide for CMSSW. You can find
the old Orca-based guide here.
CMSSW
| CMSSW is an absolute good. CMSSW is life. All around its margins lies the gulf. |
My purpose is very simple: I would like to learn in 10 minutes how to
make a histogram of a physical quantity. To do this I don't want to follow
postgraduate courses on OO things. So, first of all, of course, let's
start from the CMS Software Page.
A lot of material. It is difficult to understand what can be useful to a beginner.
Perhaps this Workbook and these tutorials: 6May2006, 6June2006, 19July2006.
Well, after an hour I still haven't solved my problem, but I have found
wikis, forums, manuals and documents about everything but my simple problem. Perhaps I will need more than 10 minutes! Anyhow, there is a name that pops up almost
everywhere: CMSSW. So I try to understand what this CMSSW is.
CMSSW is a framework.
It is used everywhere CMS software is needed. It implements a software bus model in which there
is one executable, called cmsRun, and many plug-in modules.
Ok you got it! No?
More or less, having to make your histogram with a framework is like getting a jet when you
just want a bicycle to go home. Or getting a factory that builds hammers when you just need
a simple hammer. The good news is that now you get all this code in one place, the
CMS software CVS
repository, neatly packed in a two-level hierarchy. You can browse this huge amount of code
and also search it using a tool called LXR.
You can access it directly in /afs/cern.ch/cms/Releases/CMSSW/.
The people who work on this code tell me that a few years ago there was a dark age when all
the code was split into different realms with strange names like Orca, Cobra, Oscar, Famos, Iguanacms.
Six degrees of complexity: recipe for the impatient user
Here is the CMSSW tutorial from the workbook:
(on lxplus)
cd /tmp
mkdir $USER
cd $USER
scramv1 project CMSSW CMSSW_0_6_0
cd CMSSW_0_6_0/src
eval `scramv1 runtime -csh`
wget http://cern.ch/jmans/cms/Tutorial_0_5_0.tgz
tar xfz Tutorial_0_5_0.tgz
bash downloadData.sh
scramv1 b
cd Tutorial/Analysis1/test
cmsRun tutorial.cfg
(I got some error messages pointing to the names of two files in tutorial.cfg:
the problem is that I am using a CMSSW_0_5_0 tutorial with CMSSW_0_6_0,
and the format of the instructions has of course changed. By comparing the tutorial.cfg
file with an up-to-date file, VisDocumentation/VisTutorial/cmssw-reco.cfg, I can find the corrections to make:
file:/tmp/HTB_011609.root instead of HTB_011609.root
FileInPath file = "CondFormats/HcalMapping/test/hbho_ring12_tb04.txt" instead of string file = "hbho_ring12_tb04.txt"
end of comment)
I got a ton of printed lines and a brand new "tutorial.root" file.
I can now start "root" by just typing
root tutorial.root
TBrowser b;
.q
(to exit)
The command TBrowser opens a graphics window in which I can browse the histograms
in tutorial.root.
I am moved, almost crying. I was able to put together 7 modules of the
framework in order to analyze some data and get some histograms. I have
also requested the use of 6 other EventSetup modules, indicated in the
code with es_module: these are special modules that implement
resources or services available to normal modules. The data necessary to
configure a module are defined as module parameters.
All that by using the file tutorial.cfg. The only problem is that
I don't have the slightest idea which plug-in modules I have to use to create
my histogram or how my configuration file should be written.
I have downloaded some C++ code and compiled it: what was its use? Thanks to the code I downloaded I was able to take my first trip in the CMSSW jet. What should I know to pilot this jet myself? Or should I always depend on some OO kid available nearby
to do it for me?
But anyway, this example is useful to understand exactly why CMS software is so complex.
If you compare the CMS experiment with previous experiments, you have the following
additional layers of complexity between you and the data:
- C++ - Do you understand the code copied into the Tutorial
directory in the example above? Do you understand the code in the CVS repository?
- Objects - Which objects is this application using? How are they implemented
in the code? If I want to access other information, where should I look?
- GRID - Which objects are persistent? How do I access the events I am interested in?
- EDM+CMSSW framework - OK, this should shield me from the previous layers, but I have to learn how to write configuration files that will
get the data that I need and process them with a cascade of plug-in modules
until I get some output data.
- CVS - You must know this to access and manage the source code.
- SCRAM - Its importance can be seen from the previous example. So what do all
these commands do? What is the meaning of the XML commands in the BuildFile files that you find everywhere in the CVS repository?
Know your tools!
From the list in the previous section, it must be evident that you now have
at your disposal a really awful set of new tools. Unfortunately it is
very difficult to use them, since you don't even know what their names are!
And you feel like the sorcerer's apprentice who tries to use the spell book
of his master. But anyhow, let's try some spells and see what happens.
(When giving the commands that follow, it is important to realize that the result of "scramv1" and
"cvs" commands depends on the directory where you type them.)
| Spell | Result of the spell | Things to be careful about before you cast it |
| Click on Cern Computer Resources | Get information on computers, disk space and other resources available at CERN | |
| scramv1 list | List of all public projects and their releases | You also get the main directory of the release. By looking at .SCRAM/Linux__2.2/ under this directory, you get a list of all packages needed by the project. |
| scramv1 project CMSSW CMSSW_0_6_0 | Create your own private project starting from the public release indicated | Requires a lot of disk space! |
| scramv1 tool list | Lists all tools available in SCRAM | The command must be given inside the directories of a private project |
| scramv1 tool info toolname | Lists all information about a given tool | ditto |
| eval `scramv1 runtime -csh` | Makes all the libraries known to SCRAM accessible | |
| scramv1 runtime -csh | Prints the result of the previous command without executing it | Use -sh if you use an sh-like shell |
| scramv1 build | The equivalent of make for SCRAM. Executes the BuildFile contained in the directory where you are. | Every BuildFile uses other BuildFiles (command use), which use other BuildFiles, etc. The command builds an executable (command bin) and/or libraries (command lib). Give the option echo_INCLUDE to know which directories are searched for include files. To add an arbitrary directory to search for include files, put a first line in a BuildFile with: INCLUDE+=path/dir |
| scramv1 b distclean | Undo the effect of a previous scram build in your local Release Area | |
| scramv1 b CXXUSERFLAGS=-g | Compile with the -g flag for debugging | |
| cmscvsroot projectname | Define the CVS repository for the project so you can access the source | |
| cvs --help-commands | List of CVS commands | |
| cvs checkout Modulename or cvs co Modulename | Get a local copy of the module source in your working directory that you can edit | Gets the head version, i.e. the most recent version. This may also be a version that does not work properly. |
| cvs co -r version Modulename | Check out a particular (stable) version of the module | |
| cvs diff | Show the differences between what is in the repository and what you have in your directory | |
| cvs update -r version | Get your local copy in sync with the repository | This can be necessary when the head version no longer works and you want to go back to some stable previous version. The version tag should normally be of the form ORCA_4_5_0 |
| cvs update -A | Reset tags | This can be necessary if you get the following message: "cvs add: cannot add file on non-branch tag" |
| cvs add Modulename | Add a new module to the repository | Only for developers! Must be followed by a cvs ci command. The file Root in the CVS directory must contain the following line: :kserver:cmscvs.cern.ch:/cvs_server/repositories/ORCA |
| cvs ci -m "message" Modulename | Update a module in the repository | Only for developers! |
| cvs remove Modulename | Remove a module from the repository | Only for developers! |
| cvs tag -b release0 | Tag a (stable) version as release0: a new branch is created | Only for developers! |
| cvs status | Show the current status of the modules in your working area | |
| cvs rtag -b -r release0 release0.1 Modulename | Connect branch release0.1 to branch release0. This can be useful to deal with bug fixes to a stable version. | Only for developers! |
| cvs log Modulename | List the versions of a module | Without Modulename it works recursively on the directory |
| cvs log -N -d ">2002-9-1" | more | List all revisions made after the specified date | |
| cvs diff -r version Modulename | Diff with a previous version of a module | |
| klog.krb gzito -cell cern.ch followed by cvs -d :kserver:cmscvs.cern.ch:/cvs_server/repositories/IGUANACMS commit -m "message" | Access the repository from a node outside CERN | |
Recovering data from CMS software black hole
What I really need is a document that explains where all the relevant data is (corresponding to the list of data banks of previous experiments), plus an example program that shows me how to access these data.
The tutorial example is useful to understand how this can be done.
Let's look at a code snippet from DemoAnalyzer1.cc.
void DemoAnalyzer1::analyze(edm::Event const& e, edm::EventSetup const& iSetup) {
// These declarations create handles to the types of records that you want
// to retrieve from event "e".
//
edm::Handle<HBHERecHitCollection> hbhe_hits;
edm::Handle<HORecHitCollection> ho_hits;
edm::Handle<HcalTBTriggerData> triggerD;
// Pass the handle to the method "getByType", which is used to retrieve
// one and only one instance of the type in question out of event "e". If
// zero or more than one instance exists in the event an exception is thrown.
//
e.getByType(hbhe_hits);
e.getByType(ho_hits);
e.getByType(triggerD);
- The method analyze receives a reference e to the
object edm::Event, which contains all event data.
e.getByType(handle to a type of event data) will retrieve
the data from the event and store them in a container in memory.
- Containers are provided for each type of event data and can be obtained by using the object
edm::Handle.
Before going into the details of how to extract data from the containers, note what the code snippet shows: all event data is identified by a type and
a label and, if you know them, you can access the data by using
the methods getByType, getManyByType, getByLabel, getManyByLabel.
The label consists of two names, but the second name defaults to the empty string.
The first name indicates the producer, i.e. the module that produced the data. The second name
is the data label.
The data label can be discovered interactively by using ROOT on the input file.
In the images produced by ROOT you get a characteristic list of names of event data. Each name
has the form dataType_producerName_dataName. For example:
SiStripClusterCollection_ThreeThresholdClusterizer_ProcessOne.
Another way to discover these names interactively is to use the Iguana event display (see later), which produces a similar list for each event, using the names "Friendly Name" for the type,
"Module Label" for the producer and "Instance Name" for the data label (a little confused?).
To summarize: every chunk of event data is identified by 3 names.
The method
getByType uses the first name, getByLabel the second and the third. Note that when you make a request, there can be many blocks of event data
that satisfy it.
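To fix the three-name scheme in my mind, here is a minimal Python sketch (plain Python, not CMSSW code) that splits a name of the form dataType_producerName_dataName into its parts. The names ProductName and split_product_name are invented for this illustration, and the sketch assumes that none of the three parts contains an underscore itself.

```python
from typing import NamedTuple


class ProductName(NamedTuple):
    data_type: str  # the name getByType matches on
    producer: str   # first half of the label used by getByLabel
    instance: str   # second half of the label; may be the empty string


def split_product_name(name: str) -> ProductName:
    """Split 'dataType_producerName_dataName' into its three parts."""
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(
            f"expected three '_'-separated fields, got {len(parts)}: {name!r}"
        )
    return ProductName(*parts)


example = split_product_name(
    "SiStripClusterCollection_ThreeThresholdClusterizer_ProcessOne"
)
print(example.data_type)  # SiStripClusterCollection
print(example.producer)   # ThreeThresholdClusterizer
print(example.instance)   # ProcessOne
```

An empty data label simply shows up as an empty third field, which matches the remark that the second name of the label defaults to the empty string.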
IGUANA: CMSSW visualization
(The situation of CMSSW event visualization is changing rapidly: please use this Cms event visualization (unofficial) FAQ for an up-to-date report.)
All these powerful tools and still no histogram! Perhaps IGUANA can help me. Iguana is used for visualization of events and other data. Yes, it is an event display, and it has a simple manual.
CMSSW_0_6_0 Visualisation (IGUANA_6_9_2)
- (on lxplus)
cd /tmp/$USER
- scramv1 project CMSSW CMSSW_0_6_0
- cd CMSSW_0_6_0/src
- eval `scramv1 runtime -csh`
- cmscvsroot CMSSW
- cvs co VisDocumentation/VisTutorial
- cd VisDocumentation/VisTutorial
- vi README.1st
This file contains information on how to get the event data.
Each configuration file present in the directory needs different data.
- wget http://iguanacms.web.cern.ch/iguanacms/dtdigis.root
- iguana --parameter-set cmssw-reco.cfg
- Select CMSSW Reco from "Iguana Setup" window
- Select Next Event and then a part of the tracker
Event Dump
By now I am starting to understand. All event data is connected to the
edm::Event object. Each kind of event data has a "name"
and can be retrieved in a uniform way using that name. Event data is
stored in ROOT files and can be inspected interactively using the program
ROOT. You can even make some simple histograms of the data
interactively. Hey, this is something that even a Fortran-damaged brain like mine
can understand! As this workbook page, Different Ways to Make an Analysis, explains, there are 3 different ways to do an analysis: 1) with bare ROOT, 2) in Framework-Lite (FWLite) mode, 3) using the full CMSSW framework with an EDAnalyzer (i.e. you can go home 1) on a bicycle, 2) in a car, 3) in a jet). It is really a big relief to know that I don't need to learn the full CMSSW framework to make my histogram! But before I start inspecting ROOT files I have to remember
that I must:
cd CMSSW_0_6_0/src
eval `scramv1 runtime -csh`
This is important in order to get the right version of the ROOT program.
A simple find in the code repository:
find /afs/cern.ch/cms/Releases/CMSSW/CMSSW_0_6_0/src/ -name "*.root"
will give me a list of ROOT files used to test the code. The directory Configuration/Applications/data
seems to contain a lot of these data files, presumably produced by the configuration files
in the directory.
By examining a ROOT file in SimTracker/SiPixelDigitizer I discover the name and format of the pixel tracker simhit data. Nice!
Other files to inspect can be found by looking at VisDocumentation/VisTutorial/test/README.1st. By examining the events in dtdigis.root I discover the name and format of the digis in the Muon Detector.
Now a file containing 5 almost complete events.
So the proverbial good news is that the data format can be easily
inspected interactively by looking at events with ROOT. The bad news is that the data format changes with almost every new release! So how to cope with this awful situation? The files in Configuration/Applications/data will help us.
Suppose that you are at release CMSSW_1_1_0_pre2 and want to check what the format is now: easily done!
cd CMSSW_1_1_0_pre2/src
eval `scramv1 runtime -csh`
cmscvsroot CMSSW
cvs login
cvs co -r CMSSW_1_1_0_pre2 Configuration/Applications/
cd Configuration/Applications/data/
cmsRun -p sim_rec_10muons_1-10GeV.cfg
After this I get a root file with 3 events produced with this release
that I can inspect.
Where are all these petabytes of events?
CMS events are like the aliens Fermi referred to in his famous question: So, where is everybody?
While waiting for the real ones, there are huge deposits of simulated events. But where? Somewhere on the grid. The grid what? Anyhow, let's try to start with
these Release Validation samples. After a few lines I was able to collect the following self-explanatory acronyms: crab, dbs, phedex, lfn, pfn.
I skip these and go directly to what seems to be a list of lists of files with strange names like this:
/store/unmerged/RelVal/2006/10/1/RelVal102CJets50-120/GEN-SIM-DIGI-RECO/0006/76AC6FC5-F250-DB11-BBDA-000E0C3F0D36.root
I am informed that I can use this name directly in the configuration file in this way:
source = PoolSource {
untracked vstring fileNames = {'/store/unmerged/RelVal/2006/7/24/RelVal081Higgs-ZZ-4Mu/GEN-SIM-DIGI-RECO/0006/44E28C85-8D1B-DB11-9EEC-000E0C3EFB43.root'}
}
I try this with my Iguana cfg file and it seems to work, but only when running on a CERN computer.
So where are they ...?
Reading the helpful page again I learn that I am using the LFN of the event dataset: L seems to stand for Logical, so it is a kind of generic name for the dataset. This has to be compared with the PFN (where P stands for Physical: I am getting smart at these matters), which says where the events are physically stored. The answer from the same page is (for CERN):
rfio:/castor/cern.ch/cms/LFN
i.e. you must add the string rfio:/castor/cern.ch/cms/ before the LFN. So I am starting to understand.
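The LFN-to-PFN rule for CERN is simple enough to write down as a few lines of Python. This is only a sketch of the string manipulation described above: the function name lfn_to_cern_pfn is invented, and I assume that the LFN always starts with /store/ and that its leading slash is not doubled when the prefix is added (which is how the rfio names elsewhere in this guide look).

```python
CASTOR_PREFIX = "rfio:/castor/cern.ch/cms"  # prefix quoted in the text, without trailing slash


def lfn_to_cern_pfn(lfn: str) -> str:
    """Turn a logical file name (/store/...) into the rfio name used at CERN."""
    if not lfn.startswith("/store/"):
        raise ValueError(f"this does not look like an LFN: {lfn!r}")
    # The LFN already begins with '/', so plain concatenation gives a single slash.
    return CASTOR_PREFIX + lfn


lfn = (
    "/store/unmerged/RelVal/2006/10/1/RelVal102CJets50-120/"
    "GEN-SIM-DIGI-RECO/0006/76AC6FC5-F250-DB11-BBDA-000E0C3F0D36.root"
)
print(lfn_to_cern_pfn(lfn))
```

Other sites would need a different prefix and possibly a different protocol, so the constant is the only site-specific piece of the sketch.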
All these petabytes of events require CERN to use a huge mass storage system
called CASTOR, where data
is accessible through a protocol called rfio. The files in Castor
can be manipulated with commands similar to those I use on my Linux box,
but with an additional rf in front:
rfdir
rfcp
rfrm
So the command rfdir /castor/cern.ch/cms/
will show me the top-level
cms directory in Castor.
Now I have gigabytes of available space on my laptop. Why not copy a file locally? Since I don't have access to castor from my laptop, I have to do it in
two steps (on lxplus):
cd /tmp
rfcp /castor/cern.ch/cms/store/unmerged/RelVal/2006/10/1/RelVal102BJets50-120/GEN-SIM-DIGI-RECO/0006/E4D2DBBA-F250-DB11-98C6-000E0C3F0935.root .
scp E4D2DBBA-F250-DB11-98C6-000E0C3F0935.root zito@pcmennea.ba.infn.it:/data/
After ten minutes the events are on my laptop, ready to be inspected with Iguana or root!
Exploring castor I discover a /castor/cern.ch/cms/store/RelVal/2006/12/16/ directory with subdirectories that start with /RelVal120: so these are CMSSW_1_2_0 events! I copy another file.
rfio and castor are only one possible protocol/store combination on the grid. Another is
dcache. A dcache store of grid datasets can be "explored" by using the command:
ls /pnfs/cmsfarm1.ba.infn.it/data/cms/phedex/LFN
Copying a dataset to a local store can be done in this case using the command dccp instead of rfcp.
In fact the whole story seems to be a lot more interesting. Datasets on the grid have a kind of generic name, the LFN, that you can use in your configuration
file. The framework will take care of getting the copy of the dataset nearest
to you. Which copy is used and by which protocol (rfio, dcache, etc.) should be
transparent to the user.
For example, I got a post on hypernews which says:
Reco output will be registered with datasetpaths like
/TAC-TIBTOB-120-DAQ-EDM/RECO/CMSSW_1_3_0_pre6-DIGI-RECO-Run-00007287-SliceTest
To find the complete LFN I go to this service or this service, selecting "MCGlobal/Writer".
Although the use of these two services may seem confusing and slow, it isn't very difficult to get tons of LFNs of datasets of reconstructed data like:
/store/data/2007/4/6/TAC-SliceTIBTOBTEC-120-DAQ-EDM-CMSSW_1_3_0_pre6-DIGI-RECO-Run-0007282/DIGI-RECO/0000/02058003-08E5-DB11-983F-000E0C3F0614.root
You can use several LFNs in your cfg file, separated by commas.
Datasets are grouped by run (7282 in this case) and for each dataset you know the number of events (71).
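Since typing long file names by hand is a good way to make mistakes, here is a small Python sketch that writes the source block for a list of names. It only reproduces the PoolSource syntax shown earlier in this guide; the helper name pool_source_block and the shortened file names are invented, and I am assuming that the cfg parser does not mind the names being spread over several lines.

```python
from typing import Sequence


def pool_source_block(lfns: Sequence[str]) -> str:
    """Return a PoolSource block listing the given LFNs, separated by commas."""
    if not lfns:
        raise ValueError("need at least one LFN")
    # One quoted name per line, joined by commas as the vstring expects.
    names = ",\n".join(f"        '{lfn}'" for lfn in lfns)
    return (
        "source = PoolSource {\n"
        "    untracked vstring fileNames = {\n"
        f"{names}\n"
        "    }\n"
        "}"
    )


# Hypothetical, shortened names; real LFNs are much longer.
print(pool_source_block(["/store/data/run7282/a.root", "/store/data/run7282/b.root"]))
```

The printed block can then be pasted into the cfg file in place of the existing source block.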
Then I try the LFN in my cfg file. If I get an error, it could depend on many things:
- The computer you use has no access to the Grid (in this case you can either call an expert to get your computer directly connected to the Grid, or use the trick explained previously to make a local copy on your computer).
- The dataset is corrupted, perhaps empty. A check using commands like "rfdir" (for the rfio protocol) or "ls" (for the dcache protocol) should sort out the problem.
- The dataset is OK but some event (perhaps the first one) is corrupted. In this case try to skip a few events.
To conclude this story, we must mention that it is suggested to use something
called CRAB to analyze these data. Also, it seems that a certain ProdAgent appears everywhere the grid is used. But these will be the subjects of future posts.
Where is my database?
The normal input-processing-output scheme is translated in the configuration file into the following: source, es_source - module, es_module - output.
This means, more or less, that there are two kinds of processing modules: normal modules, which process event data defined in a source block, and event setup modules, which implement "services" and consume data provided by an es_source block. This block defines the database of "non-event data" used by the program.
Note that normal modules don't access the database directly; instead they rely
on these event setup modules or services to access this data.
So now the million-dollar question: where is this database? Looking at the cfg file I get lines like:
es_source = PoolDBESSource {
VPSet toGet = {{ string record = "SiStripFedCablingRcd" string tag = "SiStripCabling_TIF_v1" }
,{ string record = "SiStripPedestalsRcd" string tag = "SiStripPedNoise_TIF_v1_p"}
,{ string record = "SiStripNoisesRcd" string tag = "SiStripPedNoise_TIF_v1_n"}
}
bool loadAll = true
#WithFrontier untracked bool siteLocalConfig = true
string connect = "oracle://orcon/CMS_COND_STRIP"
untracked string catalog = "relationalcatalog_oracle://orcon/CMS_COND_GENERAL"
string timetype = "runnumber"
untracked uint32 messagelevel = 0
untracked bool loadBlobStreamer = true
untracked uint32 authenticationMethod = 1
}
or
es_source = PoolDBESSource {
VPSet toGet = {{ string record = "SiStripFedCablingRcd" string tag = "SiStripCabling_TIF_v1" }
,{ string record = "SiStripPedestalsRcd" string tag = "SiStripPedNoise_TIF_v1_p"}
,{ string record = "SiStripNoisesRcd" string tag = "SiStripPedNoise_TIF_v1_n"}
}
untracked bool siteLocalConfig = true
string connect = "frontier://cms_conditions_data/CMS_COND_STRIP"
string timetype = "runnumber"
PSet DBParameters ={
untracked string authenticationPath = ""
untracked bool loadBlobStreamer = true
}
}
It seems that the same data can be accessed in two ways: the oracle/orcon way
and the "frontier" way. What?!?
Let's try to understand: the Grid, or better the LCG (LHC Computing Grid), is a hierarchy of one central T0 node, a few regional T1 nodes and then many lesser T2 and T3 nodes. These data should be supplied to all of them. Fortunately only T0, i.e. CERN, will update the database (for now);
all other nodes need only a read-only copy of the database. So the "Database"
is at CERN, at the central node, on a cluster of Oracle database servers (yes, Oracle, the famous database company). This central database can be accessed directly from CERN but also from the T1 nodes, where it is replicated automatically (again by Oracle).
But the same database can be accessed with another, simpler
"mechanism", the "frontier way", which uses a web protocol with URL
requests and XML answers (essentially you query the database by sending a URL
to the frontier Web server and you get the answer as an XML file). This kind of
access can of course be used from everywhere (we are on the Web!), but in order
to be efficient and fast it goes through squids (!?!). Not the gentle marine creatures but the local web servers used to store locally (cache, they call it) the responses to queries. In this way they speed up access to the central
DB a lot. That's the end of our story populated by oracles, squids, frontiers and modules.
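The squid idea (answer a repeated query from a local copy instead of asking the central server again) can be sketched in a few lines of Python. This is a toy read-through cache, not the real Frontier or squid software: the class ReadThroughCache, the function fake_central_server and the query URL are all invented for this illustration.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Answer repeated queries locally; ask the central server only on a miss."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        answer = self._fetch(url)  # the slow trip to the central database
        self._store[url] = answer
        return answer


def fake_central_server(url: str) -> str:
    """Stand-in for the central server: returns a made-up XML answer."""
    return f"<payload query='{url}'/>"


cache = ReadThroughCache(fake_central_server)
query = "http://frontier.example/query?record=SiStripNoisesRcd"  # hypothetical URL
first = cache.get(query)
second = cache.get(query)
print(first == second)           # True
print(cache.misses, cache.hits)  # 1 1
```

This works well here because, as noted above, all nodes other than T0 only need a read-only copy, so a cached answer stays valid until the central database is updated.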
Standalone Installation of CMSSW with apt: Post Tenebras Lux!
I had a lot of problems in the past installing CMS software on my computer.
The thing that really drove me crazy was that after doing a lot of work installing the latest release, you had to do it all again for the new release a
few days later. I have used the apt installer and was especially pleased by the
fact that I could install the newest release with the four commands described
in the manual:
eval `$VO_CMS_SW_DIR/aptinstaller.sh config -path $VO_CMS_SW_DIR -csh`
apt-get update
apt-cache search cmssw
apt-get install cms+cmssw+CMSSW_0_8_0
I had misty eyes when I saw (after a few minutes) the "Done" printed and I
could immediately test the new release on my computer.
That's progress!
To conclude this topic I must mention how to get rid of old installations when your computer gets completely full. Unfortunately this isn't mentioned in the manual. If you try the "obvious":
apt-get remove cms+cmssw+CMSSW_1_2_0_pre4
you get the not so obvious message:
Package cms+cmssw+CMSSW_1_2_0_pre4 is not installed, so not removed
So I am trying this quick fix:
scramv1 remove CMSSW CMSSW_1_2_0_pre4
rm -rf /cms/slc3_ia32_gcc323/cms/cmssw/CMSSW_1_2_0_pre4
As you see in this case, programmers always follow the golden rule of not making
anything perfect, to avoid the envy of the Gods!
The dance of tags between Releases
Between one Release and the next, an extraordinary dance
of tags develops in CMSSW. Tags indicate a (temporary) version of a package ready to enter the
next Release. Now the problem is the following. If I want to test my package
for the next release, almost certainly my changes depend on changes made
by other people. This means that I can't test my package alone, but must
download a set of other modified packages (called tags for short). In fact
this isn't really sufficient, because to use those tags I probably must
modify my cfg file. This is what I call the Dance of Tags between releases.
Some benevolent colleague sends you a monstrous list of new tags necessary
to test a new feature. Then you commit the tags one by one. The command:
showtags
will give you information about what new tags you have.
Then you try "scramv1 b", hoping for the best. Normally this should work from
the main "src" directory by the magic of "BuildFile". After a long time
a bunch of new libraries, plugins and stand-alone programs are built in
a cache, "lib/slc4_ia32_gcc345/".
If you don't get any errors, you can then try your (updated) cfg file in
the appropriate directory. Unfortunately you start getting strange messages
about missing plugins or "pure virtual method called".
These are almost always due to the fact that the complex machine hasn't worked
perfectly and you are now trying to use new and old code together.
At this point you start becoming desperate and you begin using
strange commands like:
scramv1 b -r
which seems to work like banging a hammer on
a malfunctioning machine. Cross your fingers and try "scramv1 b" again, perhaps
from another directory.
Getting information from HyperNews, the collective memory of CMS
HyperNews CMS Forums is fast becoming the collective memory of CMS, with its hundreds of messages posted daily
to many forums.
What makes it so useful is the possibility to search all these messages. If your question is not answered by a simple search, you can post your problem to the appropriate forum, reaching the experts and other interested CMS people.
I would like to test how successful this tool is. I have the following problem: the apt installer stops with the following error message:
E: Sub-process /home/zito/cms//bin/rpm-wrapper returned an error code (79)
I search HyperNews for the words of this sentence and discover that the
Software Distribution Tools forum deals with problems like this.
But there is no post that seems to answer my problem. Apparently no one
has had this error code (79) before.
So I make a post. After two hours I get the answer. Not so bad!
Online (Almost)
Proper data taking will start only in 2007, but we have to get ready! So I have
started experimenting with what online access to CMS data could be. I am interested
in tracker monitoring, so the proper place for me to get data is the so-called
Filter Farm: a kind of computer cluster of Filter Units that make
raw data accessible for data quality monitoring (DQM).
Filter Farm, FU, DQM: a whole new bunch of acronyms to learn.
However, considering the complexity behind it, the Physics and Data Quality Monitoring infrastructure is really simple. It is based on a three-tiered architecture with a Collector that receives data from many hardware and
software Sources and makes them available to many Clients.
Collector, Source (this should be a FU during data taking) and Client can run
on different computers. The basic unit handled is a Monitoring Element (which corresponds more or less to a single online histogram). Monitoring Elements
are served by sources in a tree: the Client also builds its own tree of MEs by
registering those MEs in which it is interested. Piece of cake...
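To convince myself that the three-tier idea really is simple, here is a toy Python sketch of it. It is not the DQM software: the class Collector, its methods and the Monitoring Element paths are all invented, and a single number stands in for each histogram. Sources publish MEs to the Collector, and a Client receives only the MEs it has subscribed to.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, float], None]


class Collector:
    """Toy stand-in for the Collector: keeps the latest value of each ME."""

    def __init__(self) -> None:
        self._elements: Dict[str, float] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def publish(self, path: str, value: float) -> None:
        """Called by a Source: store the ME and notify interested Clients."""
        self._elements[path] = value
        for callback in self._subscribers[path]:
            callback(path, value)

    def list_elements(self) -> List[str]:
        """Let a Client browse the names of the available MEs."""
        return sorted(self._elements)

    def subscribe(self, path: str, callback: Callback) -> None:
        """Called by a Client: register interest in one ME."""
        self._subscribers[path].append(callback)


collector = Collector()
received = []

# A Source publishes two MEs (the paths are invented for this example).
collector.publish("Tracker/Strip/occupancy", 10.0)
collector.publish("Muon/DT/hits", 3.0)

# A Client subscribes to only one of them.
collector.subscribe("Tracker/Strip/occupancy", lambda p, v: received.append((p, v)))

# Later updates reach the Client only for the ME it subscribed to.
collector.publish("Tracker/Strip/occupancy", 12.0)
collector.publish("Muon/DT/hits", 4.0)

print(collector.list_elements())  # ['Muon/DT/hits', 'Tracker/Strip/occupancy']
print(received)                   # [('Tracker/Strip/occupancy', 12.0)]
```

In the real system the three pieces can run on different computers, so the direct method calls of this sketch would be replaced by network messages.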
Now what kind of software do you run in a DQM application? Exactly the same that you run offline.
CMSSW_1_1_0_pre2
- scramv1 project CMSSW CMSSW_1_1_0_pre2
- cd CMSSW_1_1_0_pre2/src
- eval `scramv1 runtime -csh`
- cmscvsroot CMSSW
- cvs login (answer 98passwd)
- cvs co -r CMSSW_1_1_0_pre2 DQMServices/Daemon/
- xterm -bg pink &
- cd DQMServices/Daemon/test/data/
- xterm -bg green &
- DQMCollector (give in pink xterm to start Collector)
(The collector should start)
- cmsRun dqm_source_example.cfg (give in green xterm to start Source)
Wait until you see a line printed for each event.
- cd ../../../..
- iguana -is "NTuple browser"
- Click OK in the window that requests the Collector address "localhost:9090";
you should now get a message that the Collector was found.
- Use the iguana tree on the left to subscribe and display Monitoring Elements
Iguana in this case acts as a client.
Tricks
These are small recipes learnt from experts by asking around or just by watching experts work at a terminal.
| Interactively access with root, from lxplus, a file known by its LFN (stored on the grid at CERN) | root rfio:/castor/cern.ch/cms/<LFN>
|
| Compile a package in such a way that you can inspect the code during execution with gdb | You have to download the package from the repository , add a line in a BuildFile and recompile it for example:
cvs co -r CMSSW_2_0_7 VisReco/VisCustomTracker
cd VisReco/VisCustomTracker
vi BuildFile
(add the following line after the use commands)
<flags CXXFLAGS="-O0 -g3 -fno-inline">
scramv1 b
|
| Copy or select a few events from an input dataset to an output dataset | Use the following configuration file:
process copyfile = {
untracked PSet maxEvents = {untracked int32 input = 3}
source = PoolSource {
untracked vstring fileNames = {'file:/tmp/30525A21-BD8E-DC11-A52A-000423D9939C.root'}
}
module output = PoolOutputModule {
untracked string fileName = "Simevents.root"
}
endpath copydata = {output}
}
| Display the first events of a RECO dataset on castor | /afs/cern.ch/cms/fireworks/cmsShow21/cmsShow rfio:/castor/cern.ch/cms/store/data/Commissioning08/Cosmics/RECO/CRUZET4_v1/000/058/733/00132A18-6D72-DD11-A8D4-0030487C608C.root
| | Find the dataset of a specific run in Castor | In dbs give the search "find run where run = 62938"
| | How to test a cfg file without events (for example to check in iguana only the detector display) |
Comment in the cfg file the lines connected to event input and replace them with:
source = EmptySource {untracked int32 maxEvents = 2}
| | How to try to understand if a Frontier connection is working | Before running "cmsRun" with es_source = PoolDBESSource define:
setenv FRONTIER_LOG_LEVEL debug
setenv FRONTIER_LOG_FILE /tmp/frontier.log
| In the file defined you will get the complete dialog between the program and the frontier database servers
| | How to prevent a cfg file from crashing because of a "ProductNotFoundError" | Insert the following lines in the configuration file:
untracked PSet options = {
vstring SkipEvent = {"ProductNotFound"}
}
| More information here
| | How to check a configuration file for Iguana and the input dataset | Insert the following line at the beginning of the configuration file: service = Tracer {} Then run using the command line cmsRun -p myconfigfile | You should get a report of errors in the cfg and the events.
| | How to check in Iguana if a bug depends on input data | Run iguana and, without selecting anything to display, check "auto event" | Iguana will call the method "OnNewEvent()" for each twig. If you get a crash, you know that its cause is
reading and loading the data in the event and not displaying it.
| | How to check that you have the right version of software to run a CMSSW release | toolchecker.pl |
| | How to check flags and libraries used to build your CMSSW program | scramv1 b -v |
| | How to create a few events to test the corrections to be done to the last integration build |
cmsDriver.py TTbar.cfi -s GEN,SIM,DIGI,L1,DIGI2RAW -n 3 --eventcontent FEVTSIM --conditions FrontierConditions_GlobalTag,IDEAL_V1::All --relval 10000,100 --datatier 'GEN-SIM-DIGI-RAW' --dump_python
cmsRun TTbar_cfi__GEN_SIM_DIGI_L1_DIGI2RAW.py
cmsDriver.py myreco -s RAW2DIGI,RECO,POSTRECO -n 3 --filein file:TTbar_cfi__GEN_SIM_DIGI_L1_DIGI2RAW.root --eventcontent RECOSIM --conditions FrontierConditions_GlobalTag,IDEAL_V1::All --dump_python
cmsRun myreco__RAW2DIGI_RECO_POSTRECO.py
| Look in the release Integration hypernews for more information.
| | How to know the content of an event root file | EdmDumpEventContent filename |
If the file is on the grid at CERN (i.e. the filename starts with /store/...), then give as filename "rfio:/castor/cern.ch/cms/store..."
| | Find memory leaks in your program | valgrind --leak-check=full cmsRun -p mycfg
| | What to do if you are stuck compiling the code with scramv1 b and cannot get your modifications included in the program | Go back to the CMSSW_x_y_z/src directory and give the commands:
scramv1 b distclean
scramv1 b
| | To display with iguana the events in a root file without using a cfg file | iguana filename For example:
iguana rfio:/castor/cern.ch/cms/store/relval/2008/4/28/RelVal-RelValTTbar-1209247429-IDEAL_V1-2nd/0001/426C1A24-0615-DD11-9666-001D09F251CC.root | iguana will create automatically a cfg file with the name iguana-<time>.cfg
| | To allow the printout from iguana |
env LOG=stderr iguana cfgfile
or (to capture log messages in a file)
env LOG=stderr iguana cfgfile > & log.log
| | How to list all tags in the Frontier db | cmscond_list_iov -c frontier://cmsfrontier.cern.ch:8000/FrontierInt/cms_cond_strip -a | Look here for more information.
The eternal sunshine of the spotless tutorial
With each new tutorial I hope that it will at last reveal all the secrets of CMS software. But after the usually brilliant, glitch-free presentation, disappointment comes in the following days. When you try to repeat what you learned, you start seeing things fall apart. After a month the tutorial material is completely useless. What happened? The CMS software is so complex and, in addition, changes so swiftly that the only way to create a good tutorial is to assemble, the day before, a "simplified platform" that will last only a few days. If the tutorial authors were completely frank about the subject, they would say: look guys, this is very complex, everything changes, and there is no hope for you, newcomer, but to ask the experts to install everything for you.
- Let some expert buy your computer with the right hardware
- Ask some other expert to install the right operating system
- Now have someone load the latest CMS software
- Have the people who know about your problem set up and give you
the right configuration file
myconf.cfg
- Ask the grid experts where the events are and which cards to put in the configuration file. Let them set up your computer so that it can access the data.
After this ordeal is over, yes, it is as simple as writing cmsRun -p myconf.cfg, but don't expect it to work for more than a few days. Then you have to start the ordeal again or follow a new tutorial.
Python too!
Although I still dream in Fortran, I love new languages like Python. They are powerful and modern like C++ but, at the same time, very simple to learn and use.
So the idea of being able to use this language in CMS to write configuration files looks promising. The same can be said for the possibility of using pyRoot (Python integrated in Root) to do analysis interactively. Imagine being able to create interactively any class present in CMSSW and to explore it by calling its methods.
Unfortunately, as I understand it, integrating CMSSW with Python is not so easy. All the .cfi files now existing in data directories (configuration file fragments built to be included in other cfg files) must be duplicated as _cfi.py files in python directories.
There must also exist a kind of dictionary that describes to Python the actual implementation of each CMSSW C++ class. Once this is done you should be able to start playing with Python.
In the end a Python cfg file looks very similar to the original, and there is no hint that we have gained anything from the change. Anyhow, a Python cfg file is used in exactly the same way as a normal cfg file, i.e.: cmsRun cfgfile_cfg.py.
If you want to use Python interactively to analyze a root file, then you have to give the following commands:
cvs co PhysicsTools/PythonAnalysis/
setenv PYTHONPATH $CMSSW_BASE/src/PhysicsTools/PythonAnalysis/python:$PYTHONPATH
cd PhysicsTools/PythonAnalysis/examples
(copy your root file into this directory: in this case TTbar.root)
python
b = TBrowser()
(the root browser opens: this is important to know which collections we have and what information each collection holds)
events = EventTree("TTbar.root")
for event in events:
    tracks = event.generalTracks
    print len(tracks)
    for track in tracks:
        print track.pt()
(return)
CTRL_D (to exit from python)
I am delighted. With a few lines I was able to loop over all events, finding how many generalTracks each event has, and then to loop over the tracks printing their pt. The only problem I had was understanding exactly what string to put in tracks=event.generalTracks to get generalTracks. The examples present in the directory, although useful, didn't make the matter clear. According to the documentation you should give either a short "alias" or a very complex complete "branch name".
At this point, miracles of Python, I discovered this little recipe to get all "aliases" and branch names in the event.
for alias in events.getListOfAliases():
    alias.Print()
As I said I love Python and its programmers. They always care that everything
is simple.
Taming the monster: refactoring with fewer dependencies
Being used to procedural programming ("a la Fortran"), I see a program as a machine that does some task. If the task is done well, and if I understand the way it is done well enough to modify the code, I don't care too much how the program does it. It seems that with object-oriented programming things aren't really like this. You usually don't build a program to solve some specific problem; you create a framework, i.e. a collection of classes that try to solve a class of problems.
In doing this, things can go very wrong because of the complexity of the relationships between classes (the dependencies).
It seems that this is just what went wrong with the first version of CMS software (Orca, Cobra, etc.). The software became like the mythical nine-headed monster Hydra. A tangle of classes.
The only way out was to restart from scratch (CMSSW), this time trying to minimize dependencies. This was really a refactoring of the software, since most of the old code could be reused.
To see where the problem is, just look at this picture. The dependencies are the black arrows.
CMSSW is now at least a single-headed monster, and each day hundreds of changes are made to the code. Then, when it is night in Europe, the mythical nightly build begins. The CMS people try to see if the tiger is still tamed by compiling everything. After a few hours comes the verdict. Often everything is ok and it seems like a miracle: the tiger is still in the cage. But every few weeks a disaster happens and all packages get compilation or link errors. The following morning panic spreads in the collaboration as individuals read the nightly report. The tiger is out. Frantic work starts to get it tamed again.
Happy End?
This story has no happy end. There is no easy way to start with CMSSW. Even developers are continuously challenged by new changes and have to struggle to get their code working. There is no royal road to CMS software.
Even with a tool like Iguana, which should be very simple to use, the truth can be summarized by these conclusions that I presented in a tutorial session:
It is easy to visualize the CMS detector and data provided that ...
You have five experts around!
- A hardware expert to buy for you the right pc, the right graphics card, ...
- A linux expert to install on this computer the right linux distribution, the drivers, etc.
- A CMS software expert to install the four or five CMSSW releases necessary to see all data available
- A configuration file expert to hand you the right cfg file for your data and CMSSW release
- A data/grid expert to give you the right coordinates on the grid of the data you need to visualize
The CMSSW situation reminds me of what happened when IBM introduced a new machine with the infamous Job Control Language. The new machine (IBM 360) was a big success, but no one really understood the JCL. The solution adopted by us old farts was: get a working deck of cards from a colleague and use it again and again, without trying to understand the meaning of garbage like //GO.SYSIN DD *. So, if you want to do your plot quickly, get a working cfg file from your expert colleague and don't try to understand it; only hope that it will still work with the next release. Otherwise you have to bother your colleague again.
CMSSW is a big success but it isn't for the faint of heart. And if the language used in the cfg files is difficult and the way it works isn't easily understood, you must consider that it has to deal with a very, very complex environment.
The last word
- There are in CMS (or any big organization) two kinds of problems: real problems and management problems.
- Of course management problems are also real problems; but seen from a programmer's point of view they are just a pain in the ass.
- Normally management problems appear as frequent reorganizations of groups and as disappearing and newly appearing acronyms.
- For example, in a visualization presentation, in addition to normal visualization acronyms like Qt, GUI, etc., you get three new acronyms, DPG/POG/AG (or PAG), and then some strange references to these acronyms:
- ... which should be easily extended by DPG/POG/PAG
- ...involve more users from DPG/POG/PAG
- ...in coordination with POG/PAG
- DPG/POG/AG involvement is crucial
- It is refreshing to see this page for newcomers. I would like to be present when a newcomer looks at the page for the first time. He/She will run away in despair! A page for newcomers containing only a long list of documents with almost no explanation is very helpful! I wonder what the page for experts would contain?
- (a citation of CMS techno babble):
The prototype mentioned below is tagged for tonight's nightly.
The mutex for the lock is in FWCore/Utilites and the service that
uses it is in FWCore/Services. The service can be used with the
PoolSource to synchronize the threads. To do that all you need to do is
add:
service = LockService{}
to your cfg file (and of course make sure that DQMServices are locking
on the same mutex).
- (a citation of CMSSW jargon):
during CRUZET4 processing we had troubles again because of ESProducts
taken in beginJob.
ESProducts can be IOV dependent but beginJob() has no knowledge of the
"current run" (so you get garbage and crashes).
Please check your code and remove access to ESProducts in beginJob
unless you are 100% sure that it is safe (if you have any doubt please
ask). Tags are expected as soon as possible (tracker and egamma
already provided some fixes)
You can simply move the "get" of the ESHandle into the beginRun (if
you are sue your IOV is only run based and not time based) or directly
in the "produce" method of you producer (if you need to do time
expensive operation on the retrieved object you can check the
cacheIdentifier before retrieving or write an additional ES producer
.
==================== This part not yet updated to CMSSW =========================================
Recovering data from CMS software black hole 2: containers and iterators
Hey, we have managed to run THREE sample programs correctly!! It's time to get professional about it and start entering the wonderful intricacies of C++ and Carf concerning containers (i.e. collections of objects) and iterators.
First let's look at three snippets of code from the three examples:
- List of pile_up events
// And now the event numbers of all pileup events
cout << "--- PileUp events are: " ;
G3EventProxy::pu_range PUrange = ev->pileups();
for (G3EventProxy::pu_iterator ipu = PUrange.first; ipu != PUrange.second; ipu++) {
  PUeventsUsed++; // count them
  if (ipu != PUrange.first) {
    cout << ", ";
  }
  cout << (*ipu).id().eventInRun();
}
cout << endl;
- List of Calorimeter Towers
RecItr<EcalPlusHcalTower> MyCaloTower(ev->recEvent());
/* Print Run and Event number and fill them to our Ntuple */
cout << "Run #" << ev->simTrigger()->id().runNumber() << "; ";
cout << "Event #" << ev->simTrigger()->id().eventInRun() << endl;
UserNtuples->FillGeneral(ev->simTrigger()->id().runNumber(),ev->simTrigger()->id().eventInRun());
float Ecalo=0.0; // for the total calorimetric energy
float Eecaltotal=0.0;
float Ehcaltotal=0.0;
HepPoint3D TowerPosition;
/* Loop over all CaloCluster objects */
while (MyCaloTower.next()) {
  Ecalo += MyCaloTower->Energy(); // sum up the total energy
  Eecaltotal+=MyCaloTower->EnergyEcalTower(); // sum up the Ecal energy
  Ehcaltotal+=MyCaloTower->EnergyHcalTower(); // sum up the Hcal energy
  TowerPosition = MyCaloTower->Position();
  /* Print energy, azimuth and pseudo-rapidity of a cluster and fill this
   * to our Ntuple */
  cout.setf(ios::showpoint);
  cout << "New Tower E(tot/Ecal/Hcal)=" << setw(8) << setprecision(3)
       << MyCaloTower->Energy() << "/"
       << MyCaloTower->EnergyEcalTower() << "/"
       << MyCaloTower->EnergyHcalTower() << " GeV"
       << "; phi=" << setw(8) << setprecision(4) << TowerPosition.phi() << " rad"
       << "; eta=" << TowerPosition.pseudoRapidity()
       << endl;
  UserNtuples->AddTower(MyCaloTower->Energy(),TowerPosition.phi(),
                        TowerPosition.pseudoRapidity());
}
- List of Clusters
RecItr<CaloCluster> MyCluster(ev->recEvent(),"EcalFixedWindow_5x5");
// Just print the event number (see CARF/G3Event/interface/G3EventHeader.h)
cout << "===========================================================" << endl;
cout << "=== Private analysis of event #"<< ev->simTrigger()->id().eventInRun()
     << " in run #" << ev->simTrigger()->id().runNumber() << endl;
eventsAnalysed++; // some statistics: count events and runs processed
if (ev->simTrigger()->id().runNumber() != lastrun) {
  lastrun = (unsigned int) ev->simTrigger()->id().runNumber();
  runsAnalysed++;
}
// Here is the loop over all clusters
while (MyCluster.next()) {
  nClusters++; // count the clusters
  // Print some of the cluster properties
  // see Calorimetry/CaloCommon/interface/CaloCluster.h
  cout << "Cluster " << nClusters <<": E=" << MyCluster->Energy()
       << ", eta="<< MyCluster->Eta()
       << ", phi="<< MyCluster->Phi() << endl;
  // Fill them to (our) histograms. Defined in ExClusterHistos.h
  UserHists->FillEepCluster(MyCluster->Energy(), MyCluster->Eta(),
                            MyCluster->Phi());
}
The three variables ipu, MyCaloTower and MyCluster are iterators. Iterators in oo programming are used whenever we want to examine a list or container of objects. Once you have defined an iterator for the container of objects that interests you (events, tracks, vertices, clusters, etc.), looping through the objects becomes trivial:
while (iterator.next()){ }
in the last two cases. In the first case (list of pileup events) it is slightly more complex:
for(iterator=firstitem;iterator!=lastitem;iterator++){ }
In any case, inside the loop the iterator points to the current object and we can use it to get all the information about the object.
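For those who, like me, think better with something they can run, here is a minimal Python sketch of the two loop shapes. It is only a toy: NextStyleIterator is a hypothetical stand-in that I made up for illustration, not the ORCA RecItr class, and the event numbers are invented.

```python
class NextStyleIterator:
    """Hypothetical stand-in for a RecItr-like iterator (not the ORCA class)."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = -1  # before the first element until next() is called

    def next(self):
        # Advance by one and report whether a current element exists.
        self._pos += 1
        return self._pos < len(self._items)

    def current(self):
        return self._items[self._pos]


def loop_next_style(items):
    """Shape of: while (iterator.next()) { ... }"""
    seen = []
    it = NextStyleIterator(items)
    while it.next():
        seen.append(it.current())
    return seen


def loop_range_style(items):
    """Shape of: for (it = first; it != last; it++) { ... }"""
    seen = []
    first, last = 0, len(items)  # 'last' is one past the final element
    it = first
    while it != last:
        seen.append(items[it])
        it += 1
    return seen


pileup_ids = [101, 102, 103]  # made-up event numbers
```

Both functions visit the same objects in the same order; the only difference is who keeps track of the end of the container, the iterator itself or the pair (first, last).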
From the code it is apparent that ORCA has a general purpose iterator named RecItr that can be used for all kinds of RecObj objects. It is sufficient to write RecItr<objectname> ip(ev->recEvent()) and the variable ip will iterate over the objects of the type indicated for the event indicated. These are reconstructed objects, and a Reconstruction on Demand, similar to the Action on Demand of the previous section, is performed in this case in the following way: if the requested RecObj is not present in the database, it is computed on the fly by the "default" module used for the object. In case 3 you see that it is also possible to select a RecObj computed by a module that we indicate (EcalFixedWindow_5x5). In this way we can test new reconstruction algorithms.
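The idea of Reconstruction on Demand can be sketched in a few lines of Python. This is my own toy under stated assumptions, not how ORCA implements it: the class RecStore, its methods and the module name DefaultClusterer are all hypothetical; only EcalFixedWindow_5x5 comes from the example above.

```python
class RecStore:
    """Hypothetical on-demand store; names and behaviour are illustrative only."""

    def __init__(self, default_modules):
        self._default_modules = default_modules  # object type -> module name
        self._algorithms = {}                    # module name -> callable
        self._stored = {}                        # (type, module) -> objects
        self.computations = 0

    def register(self, module, algorithm):
        self._algorithms[module] = algorithm

    def get(self, obj_type, module=None):
        # Fall back to the default module when the user names none.
        module = module or self._default_modules[obj_type]
        key = (obj_type, module)
        if key not in self._stored:
            # Not in the "database": reconstruct now and keep the result.
            self._stored[key] = self._algorithms[module]()
            self.computations += 1
        return self._stored[key]


store = RecStore({"CaloCluster": "DefaultClusterer"})
store.register("DefaultClusterer", lambda: ["cluster A", "cluster B"])
store.register("EcalFixedWindow_5x5", lambda: ["window cluster"])

first = store.get("CaloCluster")                          # computed on the fly
again = store.get("CaloCluster")                          # served from the store
custom = store.get("CaloCluster", "EcalFixedWindow_5x5")  # module chosen by us
```

The second request costs nothing because the result is already stored, while naming a different module triggers a separate reconstruction that lives side by side with the default one.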
What reconstructed objects are available for our analysis apart from CaloCluster and EcalPlusHcalTower? Easy: you have to look at the documentation for RecObj, and there you will find a clickable map of the objects connected to it (in jargon, derived or inheriting objects). So CaloCluster is a child of RecObj. But these derived objects may also have derived classes that are also RecObj, etc. So EcalPlusHcalTower is a child of CaloCluster. As such you have access both to quantities in CaloCluster like Energy and to quantities in EcalPlusHcalTower like EnergyHcalTower.
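The parent and child relationship just described can be mimicked with a tiny Python hierarchy. The class and method names are those quoted in the text, but the bodies are my own guesses for illustration: in particular, that the total energy is the plain sum of the Ecal and Hcal parts is an assumption of this toy, and the numbers are made up.

```python
class RecObj:
    """Hypothetical base class for every reconstructed object."""


class CaloCluster(RecObj):
    def __init__(self, energy):
        self._energy = energy

    def Energy(self):
        return self._energy


class EcalPlusHcalTower(CaloCluster):
    def __init__(self, ecal_energy, hcal_energy):
        # Assumption for this toy: the total is simply the sum of the two parts.
        super().__init__(ecal_energy + hcal_energy)
        self._ecal = ecal_energy
        self._hcal = hcal_energy

    def EnergyEcalTower(self):
        return self._ecal

    def EnergyHcalTower(self):
        return self._hcal


tower = EcalPlusHcalTower(ecal_energy=20.0, hcal_energy=5.0)  # made-up values
cluster = CaloCluster(energy=12.0)
```

The tower answers both Energy() (inherited from CaloCluster) and EnergyHcalTower() (its own), while a plain CaloCluster knows nothing about the Hcal part.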
From the three examples it is also apparent that an ActiveObserver class gets a pointer to the current G3EventProxy as an argument. This is the variable ev. Then ev->recEvent() points to the reconstructed event information with all its RecObj, while ev->simTrigger() can be used to reach the information concerning the simulated event, like tracks, vertices and MC generator information. These are the containers ev->simTrigger()->tracks(), vertices() and rawgenparts(). Finally, ev->pileups() returns the range (of type G3EventProxy::pu_range) of pileup events for the current event.
A complete event, object by object.
Do you remember when we did an event dump to understand how the various pieces of information were stored?
At last we know enough to try to get the equivalent with an oo database, i.e. recovering all the persistent objects describing a single event. For example, we can take the first event processed in the simple example analysis. This is an event of the dataset eg_1gam_pt25 with owner jetNoPU_CERN of the federation
cmspf01::/cms/reconstruction/user/jet0900/jet0900.boot.
This is event 1 of Run 37 (as you can see from the output listing).
We will use ootoolmgr, described previously, to track all the information about the event in the federation, i.e. all the persistent objects describing the event. The list of all ORCA objects in the ORCA manual will be our guide.
Looking in the database catalog we find 12 files connected to this dataset. These are named
EVD0_Events.eg_1gam_pt25.jetNoPU_CERN
EVD0_Digis.eg_1gam_pt25.jetNoPU_CERN
EVD0_Collections.eg_1gam_pt25.jetNoPU_CERN
The other 9 have the same names but starting with EVD1, EVD2, EVD3.
These files, as you have to guess from the Autumn 2000 production information, contain only the Digis objects created by running SimReader or ooDigi. To complete the description of the event we must add the files created by G3Reader or ooHits, which have a different owner (jetHit120_2D_CERN) and have the following names:
EVD0_Collections.eg_1gam_pt25.jetHit120_2D_CERN
EVD0_Events.eg_1gam_pt25.jetHit120_2D_CERN
EVD0_Hits.eg_1gam_pt25.jetHit120_2D_CERN
EVD0_MCInfo.eg_1gam_pt25.jetHit120_2D_CERN
EVD0_THits.eg_1gam_pt25.jetHit120_2D_CERN
These too are repeated with EVD1, EVD2, EVD3, for a total of another 20 files.
This information is summarized in the production sheet with the following lines:
eg_1gam_pt25 jetHit120_2D_CERN Collections 0 1 2 3
eg_1gam_pt25 jetHit120_2D_CERN Events 0 1 2 3
eg_1gam_pt25 jetHit120_2D_CERN Hits 0 1 2 3
eg_1gam_pt25 jetHit120_2D_CERN MCInfo 0 1 2 3
eg_1gam_pt25 jetHit120_2D_CERN THits 0 1 2 3
eg_1gam_pt25 jetNoPU_CERN Collections 0 1 2 3
eg_1gam_pt25 jetNoPU_CERN Digis 0 1 2 3
eg_1gam_pt25 jetNoPU_CERN Events 0 1 2 3
So event 1 of Run 37 must be tracked in these 32 Objectivity databases! The first group of files contains the objects connected to SimTrigger; the second those pointed to from RecEvent.
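To check the arithmetic (12 + 20 = 32), the file names can be rebuilt from the production sheet with a few lines of Python. The naming pattern EVD<n>_<kind>.<dataset>.<owner> is the one visible in the names listed above; the function and variable names are mine.

```python
DATASET = "eg_1gam_pt25"
INDICES = range(4)  # EVD0 .. EVD3

# Kinds of database per owner, as listed in the production sheet above.
KINDS_BY_OWNER = {
    "jetHit120_2D_CERN": ["Collections", "Events", "Hits", "MCInfo", "THits"],
    "jetNoPU_CERN": ["Collections", "Digis", "Events"],
}


def database_names(owner):
    """Build names following the pattern EVD<n>_<kind>.<dataset>.<owner>."""
    return [
        "EVD%d_%s.%s.%s" % (n, kind, DATASET, owner)
        for n in INDICES
        for kind in KINDS_BY_OWNER[owner]
    ]


hits_files = database_names("jetHit120_2D_CERN")  # 5 kinds x 4 = 20 files
digi_files = database_names("jetNoPU_CERN")       # 3 kinds x 4 = 12 files
total = len(hits_files) + len(digi_files)         # 32 databases in all
```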
Here starts the "dump" of event 37 1
- Run 37 produced by ooDigi
- SimDigiEvent for event 1 contains 11 RawData objects. This corresponds to RecEvent in Orca.
- MuEndWireDigi produced by Read Out Unit EDWDig00
- MuEndStripDigi produced by Read Out Unit EDSDig00
- MuBarDigi produced by Read Out Unit EDWDig00
- MRpcDigi produced by Read Out Unit EDWDig00
- CaloRecHit produced by Read Out Unit CREBRY01
- CaloRecHit produced by Read Out Unit CREFRY01
- CaloRecHit produced by Read Out Unit CRESFX01
- CaloRecHit produced by Read Out Unit HRHCAL01
- HcalTrigPrim produced by Read Out Unit HTHTOW01
- SimEvent for event 1 1 (I wasn't able to find Run 37!) corresponds to SimTrigger in Orca.
- SimEventBody with links to vertices, tracks and Montecarlo data. Other parts of the same object here and here.
- RawEvent with links to all Hits in the event.
- SimpleRawData the first group of hits coming from readout unit TUpto08.
Scrambled (Build)files
It is time to understand those intriguing little files. What follows are
the Buildfiles used in the Orca User manual examples:
- List of Run and event number
<environment>
<lib name=Workspace><lib>
<Group name=RecReader>
<External ref=COBRA Use=CARF>
<bin file=ExRunEvent.cpp>my favourite application</bin>
</environment>
This is the smallest set that will work. Note that all external libraries are loaded using the BuildFile in COBRA/CARF. This BuildFile uses the tag RecReader to select the classes needed to read RecHits. SimHits are read by the libraries loaded with the tag SimReader.
The lines starting with lib and bin tell Scram that it should create a shared library libWorkspace.so and an
executable ExRunEvent. To find them just give the command
where ExRunEvent
then search in the bin directory given and in the nearby lib directory.
To find out which libraries Scram has loaded, you must look at the output that you get from the command scram build.
- List of Run and event number + getting Tracker layout information
<environment>
<lib name=Workspace><lib>
<Use name=Tracker>
<Group name=RecReader>
<External ref=COBRA Use=CARF>
<bin file=ExRunEvent.cpp>my favourite application</bin>
</environment>
We have only added a card telling Scram that the BuildFile of the subsystem
Tracker must be used.
- List of towers
<environment>
<lib name=Tutorial>
<lib name=EcalPlusHcalTower>
<lib name=CaloCluster>
<Group name=CaloRecHitReader>
<Use name=Calorimetry>
<Group name=RecReader>
<External ref=COBRA Use=CARF>
<External ref=root>
<bin file=ExTutNtuple.cpp></bin>
</environment>
The main change here is that we request the use of the BuildFile in the subsystem Calorimetry. This BuildFile will load many different sets of libraries, and the set that we want is selected by the tag CaloRecHitReader.
Note that in addition to these libraries, we explicitly request the libraries EcalPlusHcalTower and CaloCluster
of the subsystem Calorimetry.
Finally, to create the n-tuple, the external system root is requested.
- List of clusters
<External ref=cern>
<External ref=cmsim>
<External ref=HepODBMS>
<External ref=Objectivity>
<lib name=ExCalorimetry>
<environment>
<Group name=CaloHitReader>
<Group name=CaloRHitWriter>
<Group name=CaloRHitReader>
<lib name=EcalFixedWindow>
<lib name=CaloData>
<lib name=CaloCluster>
<Use name=Calorimetry>
<Group name=RecReader>
<Use name=CARF>
<Use name=Utilities>
<bin file=ExClusterHistograms.cpp></bin>
</Use>
</Use>
</Use>
</lib>
</environment>
<environment>
<Group name=CaloHitReader>
<Group name=CaloRHitWriter>
<Group name=CaloRHitReader>
<lib name=EcalFixedWindow>
<lib name=EcalDynamical>
<lib name=CaloData>
<lib name=CaloCluster>
<Use name=Calorimetry>
<Group name=RecReader>
<Use name=CARF>
<Use name=Utilities>
<bin file=ExCompClusterers.cpp></bin>
</Use>
</Use>
</Use>
</lib>
</environment>
</External>
</External>
</External>
</External>
So, first of all, we must specify in a BuildFile what we want to build, with
a tag <bin file=sourcefile></bin>.
To build an executable, scram must use
| External ref=productname | libraries, include files and other stuff connected to the external product
|
| lib name=libname | libraries found in the SCRAM search path.
|
| Use name=package | the package indicated. In fact this means a reference to the BuildFile of the package.
|
| Group name=groupname | The group will set a switch
that will control the loading of used packages. What really happens depends on
the BuildFile of the package.
|
The tag environment is used to group other commands. The first and the last commands in a BuildFile must be <environment> and </environment>. If
these are missing, SCRAM will assume their presence. Any other environment tag
is used to separate different "environments", as in the last example (List of clusters), where we
build two executables.
The interplay between the "Use" and the "Group" tags is difficult to understand
unless you have a look at the BuildFile of the "used" package, for example
CARF. Which libraries the package will "export" to you depends on the "Groups" named G3NoMain,
G3Reader, SimReader, RecReader. Note the tag "export" used to define the
interface to the package. Which groups to use for each
package should (hopefully) be documented in the ORCA manual.
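My mental picture of the Use/Group interplay is a simple lookup: "Use" picks the package, "Group" picks which of its exported library sets you get. Here is that picture as a Python sketch. It is only my model of the mechanism, not what SCRAM really does internally: the group names are those mentioned above, while the library names and the function are invented for illustration.

```python
# Hypothetical export table: package -> group -> libraries.
# The group names are those mentioned in the text; the library names are invented.
EXPORTS = {
    "CARF": {
        "G3Reader": ["libG3Read"],
        "SimReader": ["libSimRead"],
        "RecReader": ["libRecRead"],
    },
}


def libraries_for(used_packages, active_groups):
    """Collect the libraries each used package exports for the active groups."""
    libs = []
    for package in used_packages:
        for group, exported in EXPORTS.get(package, {}).items():
            if group in active_groups:
                libs.extend(exported)
    return libs


# <Use name=CARF> together with <Group name=RecReader>
rec_only = libraries_for(["CARF"], {"RecReader"})
# The same package, with the SimReader group switched on as well
rec_and_sim = libraries_for(["CARF"], {"RecReader", "SimReader"})
```

With no group switched on, the used package exports nothing in this model, which is why the Group line cannot be dropped from the BuildFile.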
How do you decide which packages to use? For this you must look in your
source and see where the include files come from. Unfortunately this isn't
very easy, so it is better to proceed with another example.
Another illuminating example
scram project ORCA ORCA_4_5_0
cd ORCA_4_5_0/src
eval `scram runtime -csh`
cmscvsroot ORCA
cvs co Examples/CompGenRec
cd Examples/CompGenRec
scram b
setenv OO_FD_BOOT cmspf01::/cms/reconstruction/user/jet0900/jet0900.boot
setenv CARF_INPUT_OWNER jetNoPU_CERN
setenv CARF_INPUT_DATASET_NAME eg_1gam_pt25
rehash
ExCompMuon
Let's look at the result:
This example is interesting since the program accesses generated muons, calorimeter clusters, tracker tracks and muon detector tracks through four methods (subroutines) called getGeneratorParticles, getCalorimeterClusters, getTrackerTracks, getMuonTracks. As you can see from the code, the access to each kind of information is far from simple, and the BuildFile itself is also complex. No wonder that we can't get a simple program filling some quantity in a histogram to run: the navigation to access any single piece of information, which before was as easy as counting 1,2,.. in a bank, is now a complete nightmare.
DetUnit: another piece of the puzzle
This is another important piece of the puzzle: it concerns the second layer of complexity, i.e. the object model of CMS. Before introducing this new object let's see again the phases of reconstruction. We have 3 phases, and each phase produces objects that can be in part persistent. Everything starts with SimHits stored in the DB. Then we have Digits, which must be equal to what we get with real data. Note that some Digits won't be stored in the DB, since they will be immediately processed and transformed into RecHits stored in the DB. Vice versa, some RecHits will be only virtual objects, since they will be computed on the fly when needed.
At the end of the reconstruction we have the RecObj (tracks and clusters) stored in the DB.
Now let's go back to DetUnit and the detector. Each module of the detector is represented in the software by a DetUnit object. Let's look first at a pictorial representation of the various objects used (the so-called class diagram). So we have thousands of DetUnit objects modelling the real detector. Every DetUnit has pointers to other objects which contain the geometrical information of the module (absolute position, orientation, etc.). From this point of view this is similar to what we had in the past. The novelty is that now you can also access event data through DetUnit. All information associated with a detector module, like Digis, simulated (Geant) hits and reconstructed hits (clusters), is accessed through the corresponding DetUnit. In the case of simulated hits (SimHits) this happens through an object SimDet: i.e. DetUnit points to SimDet, which has a method returning a vector of SimHit pointers. RecHits are instead created on the spot by the corresponding DetUnit. Digis are provided by DetUnit through another object named Readout.
The Readout object can be viewed as a container of Digis corresponding to a single DetUnit. In the Orca output you frequently see the sentence:
creating a ROU Slave Factory for:
followed by some acronym. This is connected to the mechanism seen before.
The first two letters in the acronym indicate which kind of subdetector
ORCA is handling:
- Tracker
- Calorimetry
- Muon
- BD - Barrel Drift Chambers
- ED - End Cap Cathode Strip Chambers
- RD - Rpc detector
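Going back to the three access paths through DetUnit described above (SimHits via SimDet, Digis via Readout, RecHits made on the spot), this is how I picture them, as a Python toy. The class names come from the text; every method name and the rule "one RecHit per Digi" are my own assumptions for illustration, not the ORCA interface.

```python
class SimDet:
    """Holds the simulated hits of one detector module."""

    def __init__(self, sim_hits):
        self._sim_hits = list(sim_hits)

    def simHits(self):
        return self._sim_hits


class Readout:
    """Container of the Digis belonging to one detector module."""

    def __init__(self, digis):
        self._digis = list(digis)

    def digis(self):
        return self._digis


class DetUnit:
    """One detector module: the single entry point to its event data."""

    def __init__(self, name, sim_det, readout):
        self.name = name
        self._sim_det = sim_det
        self._readout = readout

    def simHits(self):
        return self._sim_det.simHits()  # delegated to SimDet

    def digis(self):
        return self._readout.digis()    # delegated to Readout

    def recHits(self):
        # Created on the spot; in this toy, one RecHit per Digi.
        return ["rechit(%s)" % digi for digi in self.digis()]


unit = DetUnit("module 1", SimDet(["simhit a"]), Readout(["digi 1", "digi 2"]))
```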
Raw data physical clustering
Let's think of an Objectivity federation as a physical "container" composed of files ("databases") segmented into containers. The problems facing the people writing the CMS software were:
- Decide which objects should be stored and which not.
- If an object is not stored, ensure that enough information is stored that
we can recreate it from what we have in the database.
- Decide what happens if we recalculate the same objects: do we update the
previous objects or do we create a copy (cloning in oo slang)?
- Try to put together objects that are mostly processed together
- Decide what name to give to each database and container (not an easy
thing when you have thousands of them)
- Decide where to store the single object.
- Last but not least, try to do this in such a way that it can also be implemented
with other databases (if CMS decides to change from Objectivity).
As you see, not an easy task. The main strategy used can be seen from the production page. In this page each number corresponds to a database, so this is a schematic representation of the federation.
The databases are grouped in datasets. Each dataset is divided in two or more sections belonging to different owners. Note that a single owner encompasses many datasets. For each section you have some lines named "Events" and "Collections" which are repeated. This shows the strategy used: cloning events and collections of events (runs) instead of updating a single object.
The first event object is created by ooHit, the others by running ooDigi with different parameters. The other databases (except for MCInfo) contain SimHits produced by Montecarlo and Digis (i.e. Digits and RecHits that should be equal to raw data objects). Examining these databases you can see that the names of the containers refer to pieces of the detector. This is because it was decided that raw data belonging to the same sub-detector are clustered together.
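The "clone instead of update" strategy is easy to picture with a toy store in Python. This is a sketch of the idea only, not of Objectivity: the class Federation and its methods are hypothetical, while the dataset and owner names are those of the example used in this guide.

```python
class Federation:
    """Toy model: reprocessing appends a clone, it never overwrites."""

    def __init__(self):
        self._clones = {}  # dataset -> list of (owner, events)

    def store(self, dataset, owner, events):
        # Each production pass adds its own copy under its own owner.
        self._clones.setdefault(dataset, []).append((owner, list(events)))

    def owners(self, dataset):
        return [owner for owner, _ in self._clones[dataset]]

    def events(self, dataset, owner):
        for stored_owner, events in self._clones[dataset]:
            if stored_owner == owner:
                return events
        raise KeyError(owner)


fed = Federation()
fed.store("eg_1gam_pt25", "jetHit120_2D_CERN", ["event 1 (hits)"])
fed.store("eg_1gam_pt25", "jetNoPU_CERN", ["event 1 (digis, no pileup)"])
```

After the second pass the first copy is still there untouched, which is exactly why the "Events" and "Collections" lines are repeated for each owner in the production page.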
Developer!
Hey! I have become a CMS developer! I feel like a mediaeval knight after the investiture. How do you become a developer? Only another knight (pardon, developer) can give you the title, by adding your name to the list of developers responsible for some module in the repository. After that you are able to commit your changes to the repository.
Now I feel a big responsibility: how can my brain, damaged by decades of Fortran programming, cope with the new generation of oo knights?
The code in the previous sections gives an idea of how Fortran-like my C++ programming is. I really have to start studying these 3 documents like the bible:
Reading these guidelines, it is clear that I have to give up using my beloved Fortran arrays and start using STL containers and iterators.
Changes, changes and more changes!
The good news about the programmer's life is that you never get bored. The bad news is that everything keeps changing and CMS software is no exception.
This picture gives a vivid idea of how releases follow releases and everything changes.
Let's start with a few new acronyms:
DDD - Detector Description Database is a database describing (guess it!) the detector in ascii files containing XML tags, plus a C++ API to access this description. Fortunately we, the final users, can (probably) safely ignore all the technicalities.
LCG - LHC Computing Grid project. A group of people working (among other things) on the replacement of Objectivity with Root.
Root - A Cern software by the same people that gave us Paw and Zebra. Root is in fact intended as an OO replacement for these successful packages. Root will also provide the object persistency, replacing Objectivity. In principle COBRA will shield end-users like us from the complexities of Root. But, judging from the Root home page, it is also possible that in the future it will at last bring an easy Paw-like interface for those who don't know oop.
POOL - Pool Of persistent Objects for LHC is the software provided by LCG that will take care of the persistency of objects (in slang, the persistency framework); this is the name of the Root-based replacement of Objectivity.
As far as this document is concerned, the most important change was the release
of Iguana 4 with the new plugin architecture.
This means, in plain words, that objects like TwigMuBarRpcSimHits
that we used in the previous paragraphs no longer exist, i.e.
catastrophe! We have to restart from scratch!
ORCA_7.0 brings all these new things and it is interesting to compare
the list of used packages with the same list for the
previous release.
An inspection of the CVS repository for Visualisation/MuonVis shows that the
objects used in the previous paragraph have been thrown in the Attic, but they
seem to have been replaced by the following new objects:
- TwigMubarRpcSimHits by VisMuBarRpcSimHitsTwig
- Init.cc by plugin.cc
But first things first, let's try to run Iguana! After many hours of browsing the code and the documentation, in the release notes of ORCA_6_3_0 I find
the magic words:
To start the visualisation, type "iguana" and select "COBRA". To display an object, first select the object and then click on the visualisation box next to it.
Obvious, isn't it?
So these are the steps to run the iguana plugin with ORCA_6_3_0 (the first ORCA release in which the plugin architecture was introduced):
- cd /afs/cern.ch/cms/Releases/ORCA/ORCA_6_3_0/src
- eval `scram runtime -csh`
- cd
- source testfed.csh
- iguana
Now we try to add a branch "Event/Muon/Barrel/RpcMyHits" by cloning the branch
"Event/Muon/Barrel/RpcSimHits".
The complete procedure is:
- cd your local area
- project ORCA
- scram project ORCA ORCA_6_3_0
- cd ORCA_6_3_0/src
- cvs co -r ORCA_6_3_0 Visualisation
- cd Visualisation
- scram build
- iguana (it should work like before but using the code compiled in our area)
- cp MuonVis/src/VisMuBarRpcSimHitsTwig.cc MuonVis/src/VisMuBarRpcMyHitsTwig.cc
- cp MuonVis/interface/VisMuBarRpcSimHitsTwig.h MuonVis/interface/VisMuBarRpcMyHitsTwig.h
- modify MuonVis/src/VisMuBarRpcMyHitsTwig.cc
- modify MuonVis/interface/VisMuBarRpcMyHitsTwig.h
- modify MuonVis/interface/xtypeinfo.h
- modify MuonVis/src/plugin.cc
- modify MuonVis/src/VisMuDataProxy.cc
- scram build
- iguana (cross your fingers!): result file
Hey, we are still in business!
These are instead the steps to create a new branch in the Event subtree.
- modify CobraVisMain/src/CobraVisMain.cc adding:
m_document->addContentProxy ("COBRA/Event/CustomTracker");
- Create a bunch of classes
Objectivity out, enter ROOT!
Let's enter a brave new world: Root, Grid, RH7.3, LCG, etc ...
Let's try ORCA_7_1_1 !
- cd /afs/cern.ch/cms/Releases/ORCA/ORCA_7_1_1/src
- eval `scram runtime -csh`
- cd
- cp orcarc .orcarc
- iguana
The result is the following error message:
Xlib: extension "GLX" missing on display "lxplus080:23.0".
Inventor error in SoQtGLWidget::SoQtGLWidget(): OpenGL not available!
The brave new world has to wait...
Second try a few days later
- cd orca
- project ORCA
- scram project ORCA ORCA_7_2_0_pre13
- cd ORCA_7_2_0_pre13/src
- cvs co -r ORCA_7_2_0_pre13 Visualisation
- cd Visualisation
- setenv SCRAM_ARCH Linux__2.4/gcc3
- scram build
- cp ~/orcarc .orcarc
- eval `scram runtime -csh`
- iguana
The result is still:
Xlib: extension "GLX" missing on display "lxplus010:15.0".
But now I know that the problem is with my computer's X server, which lacks this
OpenGL extension.
Third try a month later, after I have installed XFree86 Version 4.1.0 and solved a problem with inaccessible test data samples with the help of Werner Jank
- cd orca
- project ORCA
- scram project ORCA ORCA_7_2_1
- cd ORCA_7_2_1/src
- cvs co -r ORCA_7_2_1 Visualisation
- cd Visualisation
- scram build
- cp ~/orcarc .orcarc
- eval `scram runtime -csh`
- iguana
It works again!
Not obvious, with all the changes made. The most important one is that there
is no Objectivity database any more. Now you can get a list of the data files with
a simple:
rfdir suncmsc:/data/valid/ORCA_7_2_0
Iguana (which now uses the Coin 3D implementation of OpenInventor) is again
almost unusable. We are two years back, with many visualizations no longer
working and continuous crashes.
Fourth try with the next version of Orca containing Iguana
- cd orca
- project ORCA
- scram project ORCA ORCA_7_3_0
- cd ORCA_7_3_0/src
- cvs co -r ORCA_7_3_0 Visualisation
- cd Visualisation
- scram build
- cp ~/orcarc .orcarc
- eval `scram runtime -csh`
- env LOG=stderr iguana
Note where iguana freezes and remove the corresponding .reg file with the following command:
- rm /afs/cern.ch/user/g/gzito/orca/ORCA_7_3_0/lib/Linux__2.4/iguana-plugins/MuonVis.reg
- env LOG=stderr iguana
(now it is OK), but the next time I run Iguana after changing something in the
plugins, I must remove the plugin cache with the following command:
- rm /afs/cern.ch/user/g/gzito/orca/ORCA_7_3_0/lib/Linux__2.4/iguana-plugins/.cache
These are not very obvious things to do. (This version also changes the name
of the plugin: it is no longer named "COBRA" but "ORCA".)
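Since I keep forgetting these two cleanup steps, here is a minimal Python sketch of them. It only mirrors the two rm commands above: it deletes the .cache file and any .reg files you name from a plugin directory. The function name clean_plugin_registry and its arguments are my own invention, not part of Iguana or SCRAM.

```python
from pathlib import Path


def clean_plugin_registry(plugin_dir, stale_plugins=()):
    """Delete the plugin cache and the named .reg files.

    plugin_dir    -- directory holding the plugin registry files
                     (e.g. .../lib/Linux__2.4/iguana-plugins)
    stale_plugins -- plugin names whose <name>.reg file must go
    Returns the names of the files that were actually removed.
    """
    plugin_dir = Path(plugin_dir)
    candidates = [plugin_dir / ".cache"]
    candidates += [plugin_dir / f"{name}.reg" for name in stale_plugins]

    removed = []
    for path in candidates:
        if path.is_file():      # silently skip files that are not there
            path.unlink()
            removed.append(path.name)
    return removed
```

You would call it as clean_plugin_registry("lib/Linux__2.4/iguana-plugins", ["MuonVis"]) from your ORCA area before restarting iguana. Files not named explicitly are left alone, so the other plugins keep their registration.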
Running ORCA_7_4_0
Orca 7_4_0 doesn't have Iguana, but we try to run a modified ExRunEventInfo.cc
in Workspace in order to extract the tracker data.
- cd orca
- project ORCA
- scram project ORCA ORCA_7_4_0
- cd ORCA_7_4_0/src
- cvs co -r ORCA_7_4_0 Workspace
- cd Workspace
- scram b shared
- scram b bin
- eval `scram runtime -csh`
- cp ~/orcarc .orcarc
- ExRunEvent
Here are the modified Buildfile, ExRunEventInfo.cc and two other files added in the same directory:
CuTkBuilderInORCA.cc, CuTkBuilderInORCA.h
Note in the .orcarc file that for the first time the information about datasets is taken from this catalog in XML.
This is the first ORCA release that uses POOL for persistency. In fact, a look at the list of used packages reveals a host of new acronyms.
- PI -
- POOL - tool to store information. Provides object persistency. This
is based on a dataset catalog. There are three different implementations of this catalog: a simple one (user level) with a file of XML cards; a multiuser one
based on MySQL; and the Grid-aware implementation that will be compatible with the EDG (European Data Grid) software.
- SEAL -
These are LCG Applications shared
among LHC experiments.
Running ORCA_7_5_0
We proceed as for ORCA_7_4_0
- cd orca
- project ORCA
- scram project ORCA ORCA_7_5_0
- cd ORCA_7_5_0/src
- cvs co -r Tutorial031114a Workspace
- cd Workspace
- scram b shared
- scram b bin
- eval `scram runtime -csh`
- cp ~/orcarc .orcarc
- ExRunEvent
Running IGUANACMS 1.1.0 with ORCA_7_5_0
The examinations never end (this is the title of an Eduardo De Filippo play)!
We have to switch from IGUANA to IGUANACMS.
New site, new manual.
All the code in ORCA/Visualisation is obsolete.
Let's start again from scratch!
- cd orca
- project IGUANACMS
- scram project IGUANACMS IGUANACMS_1_1_0
- cd IGUANACMS_1_1_0
- cvs co -d src -r IGUANACMS_1_1_0 IGUANACMS
- cd src
- scram build
- eval `scram runtime -csh`
- cd VisOrca/VisOrcaMain/test
- cp ~/orcarc .orcarc
- iguana --list
Now we try to build a new module "VisCustomTracker"
- cd IGUANACMS_1_1_0/src/VisOrca
- cp -r VisTracker VisCustomTracker
- modify VisOrcaMain/src/VisOrcaMain.cc adding the following line:
m_document->addContentProxy ("ORCA/Event/CustomTracker");
- in VisCustomTracker rename the following files:
mv src/VisTkEventContent.cc src/VisCuTkEventContent.cc
mv src/VisTkTwig.cc src/VisCuTkTwig.cc
mv src/VisTkSimHitsTwig.cc src/VisCuTkSimHitsTwig.cc
mv interface/VisTkEventContent.h interface/VisCuTkEventContent.h
mv interface/VisTkTwig.h interface/VisCuTkTwig.h
mv interface/VisTkSimHitsTwig.h interface/VisCuTkSimHitsTwig.h
- Now modify these 8 files:
src/VisCuTkEventContent.cc, src/VisCuTkTwig.cc, src/VisCuTkSimHitsTwig.cc,
src/plugin.cc, interface/VisCuTkEventContent.h, interface/VisCuTkTwig.h, interface/VisCuTkSimHitsTwig.h, interface/xtypeinfo.h
- scram b
- cd ..
- iguana
This is the result
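The six mv commands in the procedure above are the kind of thing I always get wrong by hand, so here is a small Python sketch of the renaming step only. It assumes the layout used above (a module directory with src/ and interface/ subdirectories); the function name rename_twig_files is hypothetical, and the editing of the file contents still has to be done by hand.

```python
from pathlib import Path

# The three classes cloned in the procedure above.
STEMS = ("VisTkEventContent", "VisTkTwig", "VisTkSimHitsTwig")


def rename_twig_files(module_dir, stems=STEMS, old="VisTk", new="VisCuTk"):
    """Rename <stem>.cc in src/ and <stem>.h in interface/.

    Returns a list of (old_relative_name, new_relative_name) pairs
    for the files that were actually renamed.
    """
    module_dir = Path(module_dir)
    renamed = []
    for subdir, ext in (("src", ".cc"), ("interface", ".h")):
        for stem in stems:
            source = module_dir / subdir / (stem + ext)
            if not source.is_file():
                continue  # skip missing files instead of failing half-way
            target = source.with_name(stem.replace(old, new, 1) + ext)
            source.rename(target)
            renamed.append((f"{subdir}/{source.name}", f"{subdir}/{target.name}"))
    return renamed
```

Calling rename_twig_files("VisCustomTracker") from inside VisOrca would do the same job as the mv lines. Files such as plugin.cc and xtypeinfo.h keep their names, exactly as in the manual procedure.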
Running IGUANACMS 1.3.0 with ORCA_7_5_2
This new version of Iguanacms also contains the CustomTracker plugin.
- cd orca
- project IGUANACMS
- scram project IGUANACMS IGUANACMS_1_3_0
- cd IGUANACMS_1_3_0/src
- cvs co -r IGUANACMS_1_3_0 VisOrca
- cd VisCustomTracker
- scram build
- eval `scram runtime -csh`
- cd ../VisOrca/VisOrcaMain/test
- iguana
Running IGUANACMS 1.7.0 with ORCA_8_1_1 and COBRA_7_8_1
- cd orca
- project IGUANACMS
- scram project IGUANACMS IGUANACMS_1_7_0_pre1
- cd IGUANACMS_1_7_0_pre1/src
- cvs co VisOrca/VisCustomTracker
- cvs co VisOrca/VisOrcaMain
- cd VisOrca/VisCustomTracker
- scram build
- eval `scram runtime -csh`
- cd VisOrcaMain/test
- vi .orcarc (to add :ORCA/Event/CustomTracker)
- iguana
Running IGUANACMS 1.10.0 based on IGUANA_5_1_1 with ORCA_8_3_0, OSCAR 3.3.1 and COBRA_7_8_6
- cd orca
- project IGUANACMS
- scram project IGUANACMS IGUANACMS_1_10_0
- cd IGUANACMS_1_10_0/src
- cvs co VisOrca/VisCustomTracker
- cd VisOrca/
- scram build
- eval `scram runtime -csh`
- cp ~/orca/IGUANACMS_1_9_1/src/VisOrca/VisOrcaMain/test/.orcarc orcarc
- iguana -c orcarc
Explore your Federation(2)!
Objectivity is out, but we still have Federations. They are now implemented using LCG software. This means that you can have your federation on your computer
but also on the Grid.
Now we will repeat the exercise of exploring our federation in order to discover what has changed. The good things are:
- The software is developed directly by CERN and freely available
in the context of the LHC Computing Grid (LCG) Project. It is the same for all
LHC experiments. It is documented here with a Beginner's Workbook.
- The federation structure has been kept, with the same bunch of files containing
the data plus a Catalog that describes the federation (corresponding to the
Objectivity bootfile). We still have
datasets and containers.
- There is FCBrowser.py, which partly replaces
ootoolmgr. It allows you to explore only the catalog but not the data (the single objects in the container). For these we have to use
CMS sample programs.
Are you ready for the exploration? First you have to get the
catalog name of the federation. For example:
http://cmsdoc.cern.ch/orca/catalog/PoolFileCatalog_7_5_0.xml. I have a text copy here in case your browser doesn't display XML.
This catalog describes the federation contained in /castor/cern.ch/cms/reconstruction/datafiles/ORCA_7_5_0/ and composed of 341 files.
In order to use FCBrowser.py we have to:
- setenv PATH /afs/cern.ch/sw/lcg/app/spi/scram:$PATH
- setenv SCRAM_ARCH rh73_gcc32
- setenv CVSROOT :pserver:anoncvs@lcgapp.cern.ch:/cvs/POOL
- cvs login
- type password: cvs
- mkdir ~/mypool
- cd ~/mypool
- scram project POOL POOL_1_4_0
- cd ~/mypool/POOL_1_4_0/src
- cvs co -r POOL_1_4_0 Examples
- cd Examples/
- cd SimpleWriter
- scram b
- eval `scram runtime -csh`
- rehash
- SimpleWriter
- cd ../../
- cp Examples/SimpleWriter/pool.env .
- Modify pool.env replacing xmlcatalog_file:FileCatalog.xml with
xmlcatalog_http://cmsdoc.cern.ch/orca/catalog/PoolFileCatalog_7_5_0.xml
- eval `scram runtime -csh pool.env`
- FCBrowser.py
Playing with this you discover that each file has an LFN (Logical File Name) and a PFN (Physical File Name). Each file also has 7 "metadata" attributes: jobid, dataset, DataType, FileCategory, runid, DBoid, owner.
Note that each "file" corresponds to an Objectivity "database".
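If you prefer to look at the catalog without FCBrowser.py, a few lines of Python are enough, since the catalog is just XML. The sketch below is not part of POOL. It assumes the layout I believe the XML catalog uses (one File element per file, with pfn, lfn and metadata children); the sample catalog, the GUID and the file names in it are invented for illustration, so check the element names against your real catalog before trusting it.

```python
import xml.etree.ElementTree as ET

# Invented one-file catalog, in the layout assumed above.
SAMPLE_CATALOG = """
<POOLFILECATALOG>
  <File ID="0000-HYPOTHETICAL-GUID">
    <physical>
      <pfn filetype="ROOT_All" name="rfio:/castor/example/EVD0_Events.root"/>
    </physical>
    <logical>
      <lfn name="EVD0_Events.root"/>
    </logical>
    <metadata att_name="dataset" att_value="example_dataset"/>
    <metadata att_name="owner" att_value="example_owner"/>
  </File>
</POOLFILECATALOG>
"""


def read_catalog(xml_text):
    """Return one dict per File entry: guid, pfn list, lfn list, metadata."""
    root = ET.fromstring(xml_text)
    files = []
    for node in root.iter("File"):
        files.append({
            "guid": node.get("ID"),
            "pfn": [p.get("name") for p in node.iter("pfn")],
            "lfn": [l.get("name") for l in node.iter("lfn")],
            "metadata": {m.get("att_name"): m.get("att_value")
                         for m in node.iter("metadata")},
        })
    return files


if __name__ == "__main__":
    for entry in read_catalog(SAMPLE_CATALOG):
        print(entry["lfn"], "->", entry["pfn"], entry["metadata"])
```

To try it on a real catalog, save the XML file locally and pass its text to read_catalog. The metadata dictionary is where attributes like dataset and owner should show up, if my assumption about the layout holds.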
(Real) Developers see (design) patterns everywhere!
As I go deeper into the intricacies of the CMS object model, I am starting
to understand some strange slang real OO developers use when they refer
to the objects implemented. For example Observer, Lazy Observer, Dispatcher, Builder, Proxy, Singleton,
Adapter, Facade, Factory ... These are all Design Patterns: more or
less programming recipes to solve well-known problems that appear frequently
in OO programming. We have already used this pattern jargon in this document
when we spoke about Action on demand, containers and iterators.
The idea is good: don't reinvent the wheel and use clever solutions that have already
been tested. The only problem is that it is very difficult for a newbie to OO programming to realize that his problem can be solved by the Singleton pattern nicely provided
by other CMS developers in this directory.
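To take some of the mystery out of the jargon, here is a toy Python sketch that combines two of these patterns: a Singleton (only one dispatcher exists) and an Observer (whoever attached itself gets called for every event). This is not the CMS code, which is C++ and far more elaborate; the class name EventDispatcher and its methods are invented for the example.

```python
class EventDispatcher:
    """Toy Singleton + Observer: one shared dispatcher, many listeners."""

    _instance = None  # the single shared object, created on first use

    def __init__(self):
        self._observers = []

    @classmethod
    def instance(cls):
        # Singleton: always hand back the same object.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def attach(self, callback):
        # Observer: register a function to be called for each event.
        self._observers.append(callback)

    def dispatch(self, event):
        # Notify every registered observer, in attachment order.
        for callback in self._observers:
            callback(event)


if __name__ == "__main__":
    seen = []
    EventDispatcher.instance().attach(seen.append)
    EventDispatcher.instance().dispatch({"run": 1, "event": 7})
    print(seen)
```

The point of the Singleton is that the code that attaches the observer and the code that dispatches the event do not need to pass the dispatcher around: both just ask for instance().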
For a funny take on the strange way OO programmers solve problems,
look at this discussion about Why I hate frameworks.
Running the CMS Helloworld on one hundred computers all around the world
[Added February 2005] This is getting like a blog, so from now on I'll add
the date the item was written.
By now you should know what the CMS "Helloworld" is: it is the famous
ExRunEvent in module Workspace.
This program does nothing useful except read input events and give
a summary of how many there were. In principle you can use it as a template
to write more interesting programs, but this is not straightforward
since you must know the CMS data model! Anyhow, we would like to run this
program on the Grid. But before doing this we briefly review how to run it on your
desktop.
To run it locally, you must have access to some datasets accessible with
rfio. In .orcarc you specify the Pool catalogue for the dataset:
InputFileCatalogURL = @{xmlcatalog_http://webcms.ba.infn.it/cms-software/orca/hg03_hzz_2e2mu_130a_rfio.xml}@
and the Input Collection
InputCollections = /System/hg_2x1033PU761_TkMu_g133_CMS/hg03_hzz_2e2mu_130a/hg03_hzz_2e2mu_130a
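To make the two settings above concrete, here is a small Python sketch that reads them back from the text of a .orcarc file, so you can check what ORCA will be told before running. It is not an ORCA tool: the function parse_orcarc is my own, it assumes plain "key = value" lines, that a '#' starts a comment, and that the @{ ... }@ wrapper can simply be stripped.

```python
SAMPLE_ORCARC = """\
InputFileCatalogURL = @{xmlcatalog_http://webcms.ba.infn.it/cms-software/orca/hg03_hzz_2e2mu_130a_rfio.xml}@
InputCollections = /System/hg_2x1033PU761_TkMu_g133_CMS/hg03_hzz_2e2mu_130a/hg03_hzz_2e2mu_130a
"""


def parse_orcarc(text):
    """Turn 'key = value' lines into a dict, unwrapping @{ ... }@ values."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: '#' is a comment
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'key = value' line, ignore it
        value = value.strip()
        if value.startswith("@{") and value.endswith("}@"):
            value = value[2:-2].strip()
        settings[key.strip()] = value
    return settings


if __name__ == "__main__":
    for key, value in parse_orcarc(SAMPLE_ORCARC).items():
        print(f"{key}: {value}")
```

A quick look at the output tells you whether the catalogue URL and the collection name belong to the same dataset, which is the mistake I make most often.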
Now we run ExRunEvent on the Grid:
The following steps are very well documented and need to be done only
once.
Now you can give the command (always on the UI):
grid-proxy-init
that will enable you to use the grid for 12 hours.
Now you can try all the commands described in the LCG-2 User Guide.
For example, to execute a command on another Grid computer:
globus-job-run gridba2.ba.infn.it /bin/hostname
To run our Helloworld on the Grid we must write a file in a language called jdl (job description language), specifying which program we want to run, the input
files, the output files, etc., and then send this file to the Grid. Not
easy for a newbie. So I'll use the tool CRAB that will do everything for you. First of all you download CRAB using CVS.
- cmscvsroot CRAB
- setenv CVSROOT :pserver:anonymous@cmscvs.cern.ch:/cvs_server/repositories/CRAB
- cvs login
- cvs co CRAB/UserTools
Then you have to modify the CRAB file crab.cfg
- cd CRAB
- cd UserTools/src/
- vi crab.cfg
The program needs only the name "ExRunEvent", but you must first have done
an eval `scram runtime -csh`
in the correct directory (i.e. the program will use the environment variables
set by SCRAM to find everything).
The program also needs a copy of ".orcarc".
To create and send your first 2 jobs to the Grid, you now write:
./crab.py -bunch_creation 2 -bunch_submission 2
To check job status:
edg-job-status -i Jobs/log/submission_id.log
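For the curious, this is roughly the shape of the jdl file mentioned earlier in this section, the one CRAB saves you from writing. The Python sketch below only builds the text of a minimal job description with the usual attributes (Executable, Arguments, StdOutput, StdError, InputSandbox, OutputSandbox). It is not what CRAB actually generates, which I have not inspected; the function make_jdl and the file names run_exrunevent.sh and summary.txt are hypothetical, and a real job would need a wrapper script that sets up the ORCA environment on the remote machine.

```python
def make_jdl(executable, input_files, output_files, arguments=""):
    """Return the text of a minimal job description file."""

    def quoted_list(names):
        # JDL lists look like {"a", "b"}
        return "{" + ", ".join(f'"{n}"' for n in names) + "}"

    outputs = ["job.out", "job.err"] + list(output_files)
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        'StdOutput = "job.out";',
        'StdError = "job.err";',
        f"InputSandbox = {quoted_list(input_files)};",
        f"OutputSandbox = {quoted_list(outputs)};",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical wrapper script and output file, for illustration only.
    print(make_jdl("run_exrunevent.sh",
                   ["run_exrunevent.sh", ".orcarc"],
                   ["summary.txt"]))
```

The input sandbox is what travels with the job (here the wrapper and .orcarc), the output sandbox is what comes back. Seeing it written out makes it clearer why a tool that fills all this in for you is welcome.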
Let's start again from EDM!
Up to now changes in CMS software have been relatively small. The change
from Objectivity to POOL federations has been gradual and in some way
transparent to an end user like me, because I was always using COBRA,
the CMS framework. But now that's a BIG change! Let's rewrite the
CMS framework. Forget COBRA, start using EDM! Well, not really, since
if we drop Cobra now, we have to stop everything. Let's say that for some time
(hopefully a few months, but who knows?) we have to provide two versions of each CMS application,
one for Cobra, the other for EDM. Wow, that's something! It's interesting
to know the reasons for such a rewrite. I'll paste some quotations:
- Poor data management: single event data split across 12 files
- Poor metadata organization (related to the above)
- Incomplete functionality: several key use cases not supported
The change has also been marked by a new CMS software web page.
New acronyms: CMSSW contains the new software.
Here you can find a nice DataFormats directory with a description of raw data.
Raw data are produced by FEDs, packed in standard headers/trailers, then collected in the Builder Units (BU) and made available in the Filter Units (FU). You can think of FED, BU and FU as specialized online computers organized in clusters.
The FED raw data is not convenient for direct use. Some processing is necessary
before it can be used: the result of these operations is the Digi. At the level
of the Digi it is not the FED that is recorded as the source of the raw data but
the module. A module is a detector unit described by a DetUnit and identified
by a detector identifier DetUnitId. All these features are provided in the new
framework through a Geometry service. Let's try it:
CMSSW_0_0_1_pre9
The result is an ASCII dump of the geometric quantities of the tracker modules.
Note that the C++ code defines a SEAL plugin(?!?) and you run this
plugin by using the file "runP.txt", where you define the environment in which
the plugin should run. Geometry is now an EventSetup object (?!). No more
singletons with observers in the new framework! DetUnit has been replaced by GeomDetUnit.
CMSSW_0_2_0
- cd cmssw
- scramv1 project CMSSW CMSSW_0_2_0
- cd CMSSW_0_2_0/src
- cvs co -r CMSSW_0_2_0 Geometry/CommonDetUnit
- cvs co -r CMSSW_0_2_0 Geometry/TrackerSimAlgo
- eval `scramv1 runtime -csh`
- scramv1 b
- cd Geometry/TrackerSimAlgo/test
- Now I create a text file runP.txt
process GeometryTest = {
  # empty input service, fire 2 events
  source = EmptySource {untracked int32 maxEvents = 2}
  es_source = XMLIdealGeometryESSource {string GeometryConfiguration="testConfiguration.xml"}
  es_module = TrackerGeometricDetESModule {}
  es_module = TrackerDigiGeometryESModule {}
  module print = AsciiOutputModule {}
  # module prod = TrackerDigiGeometryAnalyzer {}
  module prod = TrackerDigiGeometryAnalyzer {}
  # provide a scheduler path
  path p1 = {prod}
}
- scramv1 b
- cmsRun --parameter-set runP.txt
- The program TrackerDigiGeometryAnalyzer runs and produces some output
- Now I copy TrackerMap.cc and TmModule.cc to TrackerSimAlgo/src, TrackerMap.h and TmModule.h to TrackerSimAlgo/interface, and
TrackerDigiGeometryAnalyzer.cc and trackermap.txt to TrackerSimAlgo/test. You can find the six files put together here.
- scramv1 b
- cmsRun --parameter-set runP.txt
- The result is a file svgmap.svg
CMSSW_0_2_0 Visualisation
Some documentation is available from here and
from VisDocumentation/VisManual.
- (on lxcmsd1)
cd /localscratch/g/gzito
- scramv1 project CMSSW CMSSW_0_2_0
- cd CMSSW_0_2_0/src
- eval `scramv1 runtime -csh`
- cmscvsroot CMSSW
- cvs co -r CMSSW_0_2_0 VisReco/VisCustomTracker
- cvs co -r CMSSW_0_2_0 VisReco/VisRecoApp
- cp /afs/cern.ch/cms/Releases/CMSSW/CMSSW_0_2_0/src/VisDocumentation/VisTutorial/* .
- scramv1 b
- iguana --parameter-set recoGeometry.cfg
- Select CMSSW Reco from "Iguana Setup" window
- Select Next Event and then a part of the tracker
- (this was the easy part: we have checked that VisReco/VisTracker works; now we must replace VisTracker with VisCustomTracker and repeat the check)
- Replace in VisReco/VisRecoApp/src/VisRecoMain.cc "Reco/Tracker" with "Reco/CustomTracker"
- Copy VisReco/VisTracker/src/plugin.cc, VisTkGeometryTwig.cc, VisTkRecoContent.cc, VisTracker.cc
and VisReco/VisTracker/interface/VisTkGeometryTwig.h, VisTkRecoContent.h, VisTracker.h, xtypeinfo.h
into VisReco/CustomTracker/src and VisReco/CustomTracker/interface, merging in the mods already done in CMSSW_0_1_0.
You can see all the mods done.
- scramv1 b
- iguana --parameter-set recoGeometry.cfg
- I get some ASSERT printouts, to which I have to answer Ignore. Then the program crashes; I restart it
and now it works
CMSSW_0_3_0_pre4 Visualisation(IGUANA_6_4_0)
At last: the CMS Workbook is coming!
After many years of marching in the dark, a year before data taking starts,
while the CMS software is slowly being rewritten inside CMSSW, the light at the
end of the tunnel! The CMS Workbook. From its Introduction:
- ... a first stop for new users.
- ...learn some of the background information about the software
- ...useful introduction for new users
My hope is that it will make this guide obsolete.
Page author Giuseppe Zito: zito@ba.infn.it
Last update: