Purdue CMS Tier-2 Center
cmsSync Tool

The cmsSync tool is installed on the phedex host at /usr/bin/cmsSync.

Running the client 

The client does the following:
  • Gets a list of all blocks registered for your SE.
  • Retrieves your site's TFC.
  • Builds a list of file names from these blocks (a long process).
  • Spiders the given base directory (a long process).
  • Compares the contents of the base directory with the list of files that should be at your site.
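
The last two steps amount to a directory walk followed by a set comparison. The following is a minimal sketch of that idea, not the cmsSync implementation: the function names `spider` and `compare` are hypothetical, and it assumes the base directory maps directly onto the `/store` LFN prefix, whereas the real tool uses the site's TFC for that mapping.

```python
import os


def spider(base_dir, lfn_root="/store"):
    """Walk base_dir and return the set of LFN-style paths found under it.

    Assumption: base_dir corresponds directly to lfn_root
    (e.g. /mnt/hadoop/store -> /store). The real mapping comes from the TFC.
    """
    found = set()
    for dirpath, _dirnames, filenames in os.walk(base_dir):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), base_dir)
            found.add(lfn_root + "/" + rel.replace(os.sep, "/"))
    return found


def compare(registered, on_disk):
    """Return (missing, extra) as sorted lists.

    missing: registered for the site but not found on disk.
    extra:   found on disk but not registered for the site.
    """
    missing = sorted(registered - on_disk)
    extra = sorted(on_disk - registered)
    return missing, extra
```

Using sets keeps the comparison cheap even for large sites; the slow parts remain the directory walk and the construction of the registered file list.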

This requires the storage namespace to be mounted at your site (PNFS on dCache sites; /mnt/hadoop in the example below), and you need to know the SE name you use with PhEDEx.
The command-line usage is:

cmsSync --se=srm.rcac.purdue.edu /mnt/hadoop/store

Output

The client writes several files (as documented in the output of the utility). They are:

  • not_lfns.txt: A list of all files in the base directory which are not in the CMS namespace at all.
  • user_lfns.txt: A list of all user files at your site.
  • registered_lfns.txt: All LFNs which should be at your site.
  • blocks.txt: All blocks which should be at your site.
  • missing_lfns.txt: A list of all files which are registered to be at your site, but were not found in the base directory.
  • extra_lfns.txt: A list of all files which are at your site, but not registered in PhEDEx.
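
To get a quick overview after a run, the entries in each output file can be counted. This is a rough sketch under two assumptions that the text above does not state: that the files are written to the current working directory, and that each file holds one entry per line. The helper name `summarize` is hypothetical.

```python
from pathlib import Path

# File names as listed above.
OUTPUT_FILES = [
    "not_lfns.txt",
    "user_lfns.txt",
    "registered_lfns.txt",
    "blocks.txt",
    "missing_lfns.txt",
    "extra_lfns.txt",
]


def summarize(output_dir="."):
    """Return {file name: number of non-empty lines, or None if absent}.

    Assumption: one entry per line in each output file.
    """
    counts = {}
    for name in OUTPUT_FILES:
        path = Path(output_dir) / name
        if path.is_file():
            with path.open() as fh:
                counts[name] = sum(1 for line in fh if line.strip())
        else:
            counts[name] = None
    return counts
```

A large count in missing_lfns.txt or extra_lfns.txt is a hint to look at that file first.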

What to do with the output

  • missing_lfns.txt can be attached to a Savannah ticket. Ask the dataops folks to remove the replicas at your site, but not the subscriptions. Any datasets your site is still subscribed to are then re-downloaded.
  • extra_lfns.txt should be examined carefully. Delete only files that meet all of these conditions:
    • They have not been recently transferred (the synchronization is not immediate, so any files that were in transit when you ran the tool might be falsely marked as extra).
    • They are not unmerged files.
    • They are not load test files.

extra_lfns.txt is essentially a list of all files that are located at your site but are not grid-accessible. It is often easier to delete them (or to re-subscribe to the dataset and keep them) than to hand-register the files in the global DBS.
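
The checks on extra_lfns.txt can be pre-filtered with a short script before anything is deleted by hand. This is a sketch only: the function name `deletion_candidates`, the two path prefixes, and the 48-hour age threshold are assumptions for illustration and should be adjusted to your site's layout. Modification times are passed in as a dictionary so that the function does not depend on how your storage is mounted.

```python
import time

# Assumed path prefixes; verify these against your site's namespace.
UNMERGED_PREFIX = "/store/unmerged/"
LOADTEST_PREFIX = "/store/PhEDEx_LoadTest07/"


def deletion_candidates(extra_lfns, mtimes, now=None, min_age_hours=48):
    """Return the LFNs from extra_lfns that pass all three checks.

    extra_lfns: iterable of LFN strings (e.g. lines of extra_lfns.txt).
    mtimes:     dict mapping LFN -> modification time (seconds since epoch).
    Files with unknown or recent mtimes are skipped, since they may have
    been in transit when the tool ran.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_hours * 3600
    candidates = []
    for lfn in extra_lfns:
        if lfn.startswith(UNMERGED_PREFIX) or lfn.startswith(LOADTEST_PREFIX):
            continue  # unmerged or load test file: leave alone
        mtime = mtimes.get(lfn)
        if mtime is None or mtime > cutoff:
            continue  # unknown age or recently written: possibly in transit
        candidates.append(lfn)
    return candidates
```

The result is a list to review, not a list to delete blindly; the function errs on the side of keeping files whenever their age is unknown.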
