Purdue CMS Tier-2 Center

Managing datasets via Rucio

Setting up the Rucio environment

$ source /cvmfs/cms.cern.ch/cmsset_default.sh
$ export X509_USER_PROXY=~/x509up_u$(id -u)
$ voms-proxy-init -voms cms -valid 168:00
$ source /cvmfs/cms.cern.ch/rucio/setup-py3.sh
$ export RUCIO_ACCOUNT=piperov #replace piperov with your own Rucio account name
$ rucio whoami #check that it all worked
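The setup above can fail quietly when a variable is unset or the proxy file is missing. The following is a minimal pre-flight sketch, not part of Rucio itself: the function name check_rucio_env is hypothetical, and it only inspects the two environment variables set above and whether a rucio client is on the PATH. It does not verify that the proxy is still valid.

```shell
# Hypothetical helper (not a Rucio command): report obvious setup problems.
check_rucio_env() {
    local rc=0
    # RUCIO_ACCOUNT must be set to your own account name.
    if [ -z "${RUCIO_ACCOUNT:-}" ]; then
        echo "RUCIO_ACCOUNT is not set" >&2
        rc=1
    fi
    # X509_USER_PROXY must point to an existing, non-empty proxy file.
    if [ -z "${X509_USER_PROXY:-}" ] || [ ! -s "${X509_USER_PROXY}" ]; then
        echo "X509_USER_PROXY does not point to a non-empty proxy file" >&2
        rc=1
    fi
    # The rucio client must be reachable (normally after sourcing setup-py3.sh).
    if ! command -v rucio >/dev/null 2>&1; then
        echo "rucio client not found in PATH" >&2
        rc=1
    fi
    return "$rc"
}

# Typical use, after the setup steps above:
#   check_rucio_env && rucio whoami
```

The function returns 0 only when all three checks pass, so it can be chained in front of any rucio command.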

 

Data placement and copying in Rucio are controlled via ‘rules’ (see the official documentation at CERN).

Misc. Rucio commands

$ rucio list-rules --account $RUCIO_ACCOUNT
$ rucio rule-info [RULE_HASH]
$ rucio list-account-limits $RUCIO_ACCOUNT

 

To copy a dataset to a site, create a new rule. The arguments are the data identifier, the number of copies, and the destination site:

$ rucio add-rule cms:/CMS/DATA/SET/NAME 1 T2_US_Purdue

or, for a single block:

$ rucio add-rule cms:/CMS/DATA/SET/NAME#BLOCK-NAME 1 T2_US_Purdue

or even better, ask for approval and give the rule a lifetime (in seconds; 2592000 is 30 days):

$ rucio add-rule --ask-approval --lifetime 2592000 cms:/CMS/DATA/SET/NAME#BLOCK-NAME 1 T2_US_Purdue
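The --lifetime value is given in seconds, which is easy to get wrong by hand. A small sketch of the arithmetic follows; the helper name days_to_seconds is an assumption made for illustration and is not a Rucio feature.

```shell
# Hypothetical helper: convert a whole number of days to seconds.
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

# 30 days gives the lifetime used in the rule above.
days_to_seconds 30   # prints 2592000

# Possible use with the rule from above (not executed here):
#   rucio add-rule --ask-approval --lifetime "$(days_to_seconds 30)" \
#       'cms:/CMS/DATA/SET/NAME#BLOCK-NAME' 1 T2_US_Purdue
```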



Managing groups of datasets via containers

NOTE: What Rucio calls a ‘dataset’ is what CMS historically calls a ‘block’, and what CMS normally calls a ‘dataset’ is a ‘container’ in Rucio!
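The naming difference can be illustrated with a short sketch that guesses the Rucio type from the shape of a name alone. The function rucio_did_type is hypothetical; it is a naming heuristic based on the examples on this page (a ‘#’ marks a block, a /store/ path marks a file) and does not query Rucio.

```shell
# Hypothetical heuristic: guess the Rucio DID type from the name's shape only.
rucio_did_type() {
    case "$1" in
        *:/store/*)    echo "file" ;;                     # CMS logical file name
        *:/*/*/*'#'*)  echo "dataset (CMS block)" ;;      # name contains '#'
        *:/*/*/*)      echo "container (CMS dataset)" ;;  # /A/B/C without '#'
        *)             echo "unknown"; return 1 ;;
    esac
}

for did in 'cms:/CMS/DATA/SET/NAME' 'cms:/CMS/DATA/SET/NAME#BLOCK-NAME'; do
    echo "$did -> $(rucio_did_type "$did")"
done
```

For an authoritative answer, rucio list-dids reports the actual DID type, as in the listings further down this page.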

Basic idea: Create one container for all the datasets needed for your analysis, and then manage its copying to a site via a single rule (instead of an individual rule for each dataset).

$ rucio add-container user.piperov:/Analyses/Hmumu2020/USER #create the container
$ rucio attach user.piperov:/Analyses/Hmumu2020/USER cms:/SingleMuon/Run2018A-02Apr2020-v1/NANOAOD cms:/SingleMuon/Run2018B-02Apr2020-v1/NANOAOD #add two datasets to the container
$ rucio add-rule user.piperov:/Analyses/Hmumu2020/USER 1 T2_US_Purdue #create a rule to copy to Purdue
$ rucio attach user.piperov:/Analyses/Hmumu2020/USER cms:/SingleMuon/Run2017B-02Apr2020-v1/NANOAOD #add one more dataset
$ rucio detach user.piperov:/Analyses/Hmumu2020/USER cms:/SingleMuon/Run2018A-02Apr2020-v1/NANOAOD #and remove one
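When an analysis needs many datasets, the attach step above can be driven from a list. The sketch below assumes one CMS dataset name per line on standard input and the cms scope; the function name attach_from_list and the DRY_RUN variable are hypothetical. By default it only prints the commands it would run.

```shell
# Hypothetical helper: attach every dataset listed on stdin to a container.
# With DRY_RUN=1 (the default) the commands are printed, not executed.
attach_from_list() {
    local container="$1" ds
    while IFS= read -r ds; do
        # Skip blank lines and comment lines in the list.
        case "$ds" in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "rucio attach $container cms:$ds"
        else
            rucio attach "$container" "cms:$ds"
        fi
    done
}

# Dry run with the two datasets from the example above.
attach_from_list 'user.piperov:/Analyses/Hmumu2020/USER' <<'EOF'
# one CMS dataset name per line
/SingleMuon/Run2018A-02Apr2020-v1/NANOAOD
/SingleMuon/Run2018B-02Apr2020-v1/NANOAOD
EOF
```

Setting DRY_RUN=0 before the call would run the commands for real. Keeping the dry run as the default is a deliberate choice here, so that a typo in the list is seen before anything is attached.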

 

$ rucio list-content user.piperov:/Analyses/Hmumu2020/USER #list contents of your container
+-----------------------------------------------+--------------+
| SCOPE:NAME                                    | [DID TYPE]   |
|-----------------------------------------------+--------------|
| cms:/SingleMuon/Run2016B-02Apr2020-v1/NANOAOD | CONTAINER    |
...
+-----------------------------------------------+--------------+
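To reuse such a listing in a script, the names can be pulled out of the table. The following is a sketch that assumes the bordered table layout shown above; the function name did_names is hypothetical, and other client versions or output options may format the listing differently.

```shell
# Hypothetical helper: extract the SCOPE:NAME column from a bordered table.
did_names() {
    awk -F'|' 'NF >= 3 && $2 !~ /^-/ && $2 !~ /SCOPE:NAME/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim padding around the name
        print $2
    }'
}

# Possible use (not executed here):
#   rucio list-content user.piperov:/Analyses/Hmumu2020/USER | did_names
```

Border lines, the header row, and the separator row are skipped, so only the data rows are printed, one name per line.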

 

$ rucio list-dids --filter type=CONTAINER user.piperov:* #list all your containers
+-----------------------------------------------------+--------------+
| SCOPE:NAME                                          | [DID TYPE]   |
|-----------------------------------------------------+--------------|
| user.piperov:/Analyses/Tests/USER                   | CONTAINER    |
+-----------------------------------------------------+--------------+

 

$ rucio delete-rule [RULE_HASH] #remove the container's copy from a site by deleting its rule (RULE_HASH is the ID printed by the earlier add-rule command)
$ rucio erase user.piperov:/Analyses/Hmumu2020/USER #delete container permanently from all sites. Note: this is final!

 

 
